[erlang-questions] Unstable erlang compared to java or perl
Morten Krogh
mk@REDACTED
Sun Nov 7 22:24:15 CET 2010
Hi
Are you sure that your ets memory information is in bytes, and not in words?
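
As far as I know, ets:info(Tab, memory) (and with it the mem column of
ets:i()) is a count of words, while erlang:memory() reports bytes. On a
64-bit emulator a word is 8 bytes, so the three the_synapses tables in your
dump (113336012 + 310909950 + 182005810 = 606251772 words) would come to
roughly 4.85 GB, which is in the same range as the 5.56 GB that memory()
shows for ets. A minimal sketch for checking this on your own node; the
module and function names (ets_words, ets_bytes/0, report/0) are made up
for illustration:

```erlang
%% Sketch only: compare per-table ETS memory (words) with erlang:memory(ets)
%% (bytes). Module and function names are hypothetical.
-module(ets_words).
-export([ets_bytes/0, report/0]).

%% Sum the 'memory' item of every table (a word count) and convert it
%% to bytes using the emulator's word size (4 or 8).
ets_bytes() ->
    WordSize = erlang:system_info(wordsize),
    Words = lists:sum([M || T <- ets:all(),
                            M <- [ets:info(T, memory)],
                            is_integer(M)]),  % skip tables deleted meanwhile
    Words * WordSize.

%% Print both figures side by side; they should be of similar size.
report() ->
    io:format("sum over tables:    ~p bytes~n"
              "erlang:memory(ets): ~p bytes~n",
              [ets_bytes(), erlang:memory(ets)]).
```

Calling ets_words:report() while the program runs should show whether the
two numbers agree once both are expressed in bytes.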
Morten.
On 11/7/10 10:15 PM, Petter Egesund wrote:
> Hi, and thanks for answering!
>
> Yes, the memory should be reclaimed - I think the problem might be
> that the garbage collector is failing?!
>
> Summing up ets-memory-usage for each thread gives reasonable numbers,
> but erlang:memory() shows ets is using much more ram - calling
> ets:delete before the process finishes does not help.
>
> No, I am only keeping integers in the ets, no strings :-) The program
> is using binaries a lot, but these seem to be reclaimed by the GC
> without problems.
>
> Yes, I know the difference between processes/threads, only a couple of
> small ets-tables are global and shared between processes. The large
> ones, which I think are the cause of the problem, are each written
> to/read by only one process.
>
> Full source code? It is unfortunately too long and too data-bound to
> make sense. Still hoping for a clue - ets:i() tells me that ets
> should be using less than 1 GB in total, but memory() says it is using
> more than 5 GB for ets.
>
> Cheers,
>
> Petter
>
> On Sun, Nov 7, 2010 at 9:37 PM, Ryan Zezeski <rzezeski@REDACTED> wrote:
>>
>> On Sun, Nov 7, 2010 at 9:49 AM, Petter Egesund <petter.egesund@REDACTED>
>> wrote:
>>> Hi, I have a small program with lots of memory-updates which I try to
>>> run in Erlang.
>>>
>>> The same algorithm works fine in both Java and Perl, but fails in
>>> Erlang because the program runs out of memory - and I cannot figure
>>> out why. Frustrating, as my Erlang version seems to be the easiest to
>>> scale as well as being the most readable.
>>>
>>> The program is threaded and each thread writes to an ets-table which is
>>> created at the beginning of the thread. When the thread dies I try to
>>> do an ets:delete(Table), as described in the manual, but the memory
>>> used by the thread never seems to be released.
>>>
>>> Some facts:
>>>
>>> - The memory usage of each thread is rather constant. This is
>>> confirmed when I use ets:i() to show info about memory usage.
>>> - The number of threads is constant - confirmed by both running top
>>> and writing out the number of threads regularly. When a thread dies, I
>>> create a new one.
>>> - I have tried to end the thread by sending an exit-signal as the last
>>> statement. This helps some, but does not solve the leak.
>>> - I put small lists of 3-4 integers into the ets as values; the
>>> keys are lists of the same size as well.
>>> - I garbage-collect each thread before it dies, as well as doing
>>> regular global garbage-collects. No help.
>>> - The memory usage reported by ets:i(), summed over all threads, is
>>> much lower than what erlang:memory() reports. This might indicate
>>> something? Does not seem logical to me, at least.
>>> - The total from erlang:memory() is about half of what top/the OS
>>> reports.
>>> - I am running on Ubuntu, 64-bit, 14A, but I have tried 14B as well.
>>>
>>> Any clues? Dump from ets:i() and erlang:memory() is like below.
>>>
>>> Cheers,
>>>
>>> Petter
>>>
>>> --- dump ---
>>>
>>> Number of processes: 27
>>> ets:i():
>>> id                          name                        type         size      mem        owner
>>> ----------------------------------------------------------------------------
>>> 13                          code                        set          261       10692      code_server
>>> 4110                        code_names                  set          58        7804       code_server
>>> 6746271765                  the_synapses                ordered_set  5425194   113336012  <0.47.0>
>>> 7022018584                  the_synapses                ordered_set  15143493  310909950  <0.48.0>
>>> 7774416922                  the_synapses                ordered_set  8794649   182005810  <0.49.0>
>>> ac_tab                      ac_tab                      set          6         848        application_controller
>>> file_io_servers             file_io_servers             set          0         302        file_server_2
>>> global_locks                global_locks                set          0         302        global_name_server
>>> global_names                global_names                set          0         302        global_name_server
>>> global_names_ext            global_names_ext            set          0         302        global_name_server
>>> global_pid_ids              global_pid_ids              bag          0         302        global_name_server
>>> global_pid_names            global_pid_names            bag          0         302        global_name_server
>>> inet_cache                  inet_cache                  bag          0         302        inet_db
>>> inet_db                     inet_db                     set          29        571        inet_db
>>> inet_hosts_byaddr           inet_hosts_byaddr           bag          0         302        inet_db
>>> inet_hosts_byname           inet_hosts_byname           bag          0         302        inet_db
>>> inet_hosts_file_byaddr      inet_hosts_file_byaddr      bag          0         302        inet_db
>>> inet_hosts_file_byname      inet_hosts_file_byname      bag          0         302        inet_db
>>> neurone_counter             neurone_counter             set          258394    1846182    entity_server
>>> neurone_group_counter       neurone_group_counter       set          6         344        entity_group_server
>>> neurone_group_name          neurone_group_name          set          6         426        entity_group_server
>>> neurone_group_name_reverse  neurone_group_name_reverse  set          6         426        entity_group_server
>>> neurone_name                neurone_name                set          258394    11824602   entity_server
>>> neurone_name_reverse        neurone_name_reverse        set          258394    11824602   entity_server
>>> memory(): [{total,5568669792},
>>> {processes,1138936},
>>> {processes_used,1128120},
>>> {system,5567530856},
>>> {atom,349769},
>>> {atom_used,336605},
>>> {binary,82704},
>>> {code,3046365},
>>> {ets,5562163256}]
>>>
>>>
>> Hi Petter, ETS tables are not garbage collected. Each ETS table has _one_
>> owner (a process). When that owner dies the table is deleted and its
>> memory is reclaimed. You can also delete a table (and reclaim the memory)
>> by calling ets:delete/1. Looking at your memory result, your ETS tables are
>> taking up ~5.2GB of data. However, your binary usage is very low so I'm
>> going to take a guess that you are storing a list of strings? If so you
>> should note that on a 64-bit system *each character* in a string will use 16
>> bytes of memory! I highly recommend using binaries where possible when
>> dealing with a large amount of data; your program will not only be more
>> space efficient but also faster. I've written a non-trivial Erlang
>> application for work and I deal with CSV files that get up to 18 million
>> rows. I make heavy use of binaries and the binary module to parse these
>> files and write entries to ETS--you'd be surprised how fast it is! If you'd
>> like I could provide an example.
>> When you say "thread" do you mean "process"? You do realize that an OS
>> thread and an Erlang process are two completely different things. IIRC, the
>> VM spawns an OS thread per scheduler (along w/ some other threads for I/O
>> and such). Erlang processes are extremely cheap...don't be afraid to make
>> thousands or even tens-of-thousands of them.
>> You should not have to perform manual garbage collection; that seems like a
>> code smell to me. When a process dies its heap will be reclaimed. Each
>> process has its own isolated heap.
>> Do you have multiple processes all writing to the same ETS table? If so
>> there are some improvements that were made to ETS (and Erlang in general)
>> for concurrent writing/reading of an ETS table in 14B that you might want to
>> look at.
>> Finally, it would be helpful to see the full source code. There is a good
>> chance your solution is not optimal for Erlang. By that, I mean that if
>> your translation follows closely from your Java and Perl solutions then
>> chances are it's not an optimal Erlang program as the paradigms are vastly
>> different.
>> -Ryan