phash2 isn't very resistant to collisions?
Sverker Eriksson
sverker.eriksson@REDACTED
Wed Dec 15 17:34:09 CET 2021
To get 5 more bits:
phash2(Term, 1 bsl 32)
You get less collisions, but it still does not get close to a uniform distribution for your example.
Test(500) -> 249993
/Sverker
________________________________
From: erlang-questions <erlang-questions-bounces@REDACTED> on behalf of Roger Lipscombe <roger@REDACTED>
Sent: Wednesday, December 15, 2021 12:57 PM
To: erlang-questions <erlang-questions@REDACTED>
Subject: phash2 isn't very resistant to collisions?
I'm sure this has come up previously, but all I could find talked about atoms.
I know it's not intended to be cryptographically strong, but
erlang:phash2 is woefully collidey with 2-tuples:
Test = fun(N) ->
L = [{X, Y} || X <- lists:seq(0, N - 1), Y <- lists:seq(0, N - 1)],
Expected = N * N,
Expected = length(L),
Expected = length(lists:usort([erlang:phash2(Term) || Term <- L]))
end.
1> Test(10).
100
2> Test(100).
** exception error: no match of right hand side value 9999
3> Test(99).
** exception error: no match of right hand side value 9800
4> Test(500).
** exception error: no match of right hand side value 249763
What alternatives are there to phash2, if I still care about performance?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20211215/2f47bacd/attachment.htm>
More information about the erlang-questions
mailing list