[erlang-questions] Strange performance degradation in dict when it's storing lists

Sverker Eriksson sverker.eriksson@REDACTED
Mon Aug 22 15:53:45 CEST 2016


Looks like you measure the latency of send + dict:find + receive
that sometimes happens to get preemted by very heavy calls
to lists:delete.

/Sverker


On 08/22/2016 10:35 AM, Park, Sungjin wrote:
> I observed a strange performance degradation in dict.  Let me share the
> code I used in the test first.
>
>
>      -module(data).
>      -export([start_link/1, get/1, get_concurrent/1]).
>      -export([init/0]).
>
>      start_link() ->
>        proc_lib:start_link(?MODULE, init, []).
>
>      init() ->
>        register(?MODULE, self()),
>        % Initialize data:
>        % 0 => [],
>        % 1 => [1],
>        % 2 => [1,2]
>        % ...
>        Dict = lists:foldl(
>          fun (Key, Dict0) -> dict:store(Key, value(Key), Dict0) end,
>          dict:new(), lists:seq(0, 255)
>        ),
>        proc_lib:init_ack({ok, self()}),
>        loop(Dict).
>
>      value(Key) ->
>          lists:seq(1, Key).
>
>      loop(Dict) ->
>        receive
>          {get, Key, From} ->
>            case dict:find(Key, Dict) of
>              {ok, Value} -> From ! Value;
>              error -> From ! undefined
>            end;
>          _ ->
>            ok
>        end,
>        loop(Dict).
>
>      get(Key) ->
>        ?MODULE ! {get, Key, self()},
>        receive
>          Value -> Value
>        end.
>
>      %% Run get N times and return average execution time.
>      -spec get_concurrent(integer()) -> number().
>      get_concurrent(N) ->
>        Profiler = self(),
>        Workers = [
>           prof_lib:spawn_link(
>             fun () ->
>               Key = erlang:system_time() rem 255,
>               Result = timer:tc(?MODULE, get, [Key]),
>               Profiler ! {self(), Result}
>             end
>           ) || _ <- lists:seq(1, N)
>         ],
>         Ts = receive_all(Workers, []),
>         lists:sum(Ts) / length(Ts).
>
>      receive_all([], Ts) ->
>        Ts;
>      receive_all(Workers, Ts) ->
>        receive
>          {Worker, {T, _}} -> receive_all(lists:delete(Worker, Workers), [T |
> Ts])
>        end.
>
>
> When I ran the test in the shell, I got.
>
>      1> data:start_link().
>      {ok, <0.6497.46>}
>      2> timer:tc(data, get, [5]).
>      {23,[1,2,3,4,5]}
>
>
> I could get a value in 23 microseconds and expected something not too
> slower results for concurrent get but,
>
>      3> data:get_concurrent(100000).
>      19442.828
>
>
> The value 19442.828 microseconds seemed to be too big a value so I tested
> with different values such as large binaries and tuples.  And this time the
> same get_concurrent(100000) gave me 200 something microseconds.
>
> I also tried the same with an ets instead of a dict, but there was no such
> performance degradation by the value type.
>
>
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20160822/310a7688/attachment.htm>


More information about the erlang-questions mailing list