From mickael.remond@REDACTED Fri Jan 13 12:23:16 2006
From: mickael.remond@REDACTED (Mickael Remond)
Date: Fri, 13 Jan 2006 12:23:16 +0100
Subject: Supervisor: Dynamic children performance improvement
Message-ID: <20060113112316.GA16474@memphis.ilius.fr>

Hello,

We have changed the behaviour of the supervisor to improve
performance when you have a lot of dynamic supervised children (several
thousand).

The patch and its description are available from:
http://support.process-one.net/doc/display/CONTRIBS/Supervisor+-+Performance+improvement+for+dynamic+workers

It uses an Erlang dictionary instead of lists, so updates to the
supervisor's children list are much more efficient.

It has been used in production systems and lowers our CPU consumption
when dealing with a lot of supervised connection processes.

I thought it could be a nice addition to Erlang/OTP.

Best wishes,

-- 
Mickaël Rémond
http://www.process-one.net/

From gunilla@REDACTED Fri Jan 13 13:35:26 2006
From: gunilla@REDACTED (Gunilla Arendt)
Date: Fri, 13 Jan 2006 13:35:26 +0100
Subject: Supervisor: Dynamic children performance improvement
References: <20060113112316.GA16474@memphis.ilius.fr>
Message-ID: 

Thanks,
We'll look into it.

Regards, Gunilla

Mickael Remond wrote:
> Hello,
>
> We have changed the behaviour of the supervisor to improve
> performance when you have a lot of dynamic supervised children (several
> thousand).
>
> The patch and its description are available from:
> http://support.process-one.net/doc/display/CONTRIBS/Supervisor+-+Performance+improvement+for+dynamic+workers
>
> It uses an Erlang dictionary instead of lists, so updates to the
> supervisor's children list are much more efficient.
>
> It has been used in production systems and lowers our CPU consumption
> when dealing with a lot of supervised connection processes.
>
> I thought it could be a nice addition to Erlang/OTP.
>
> Best wishes,
> --

_____Gunilla Arendt______________________________________________
EAB/UKI/O OTP Design Gunilla.Arendt@REDACTED +46-8-7275730 ecn 851 5730

From alexey@REDACTED Tue Jan 24 02:52:39 2006
From: alexey@REDACTED (Alexey Shchepin)
Date: Tue, 24 Jan 2006 03:52:39 +0200
Subject: Patch for increasing simple_one_for_one supervisors efficiency
Message-ID: <87ek2yz920.fsf@alex.sevcom.net>

Hi!

Please apply this patch to supervisor.erl to increase the efficiency of
simple_one_for_one supervisors. Currently #state.dynamics uses an
orddict-like implementation, and this patch replaces it with an
implementation that uses the dict module.

Without this patch, the supervisor was the biggest bottleneck during some
ejabberd benchmarks.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: supervisor.patch
Type: text/x-patch
Size: 3065 bytes
Desc: not available
URL: 

From alexey@REDACTED Tue Jan 24 02:55:00 2006
From: alexey@REDACTED (Alexey Shchepin)
Date: Tue, 24 Jan 2006 03:55:00 +0200
Subject: Patch for increasing simple_one_for_one supervisors efficiency
Message-ID: <87d5iiz8y3.fsf@alex.sevcom.net>

Hi!

Sorry, I just found that Mickael already posted it :)

From serge@REDACTED Sun Jan 29 17:03:08 2006
From: serge@REDACTED (Serge)
Date: Sun, 29 Jan 2006 11:03:08 -0500
Subject: heart does not restart node launched with run_erl
In-Reply-To: <20060104212757.8E03746DDE@bang.trapexit.org>
References: <20060104212757.8E03746DDE@bang.trapexit.org>
Message-ID: <43DCE73C.2090300@corp.idt.net>

We happened to resolve this issue by handling SIGCHLD in run_erl.
When run_erl is executing $HEART_COMMAND that includes erl with a
-heart option:
    'run_erl ... "erl ... -heart"',
the following is observed:

1. run_erl starts erl
2. erl starts heart
3. heart monitors erl

If erl gets killed or exits, then

1. heart restarts HEART_COMMAND
2. new run_erl detects an active UDS (owned by old run_erl) and exits
3. heart gets terminated (since it restarted the HEART_COMMAND)
4. old run_erl gets terminated as well (I don't recall right now what
   triggers its termination)

In the end, no Erlang node is left running.

Attached is a patch to run_erl that addresses this issue by forcing
run_erl to exit upon detecting the death of the node started by
HEART_COMMAND. Note that this patch also includes the patch provided by
Ernie Makris / Jouni Rynö
(news://news.gmane.org:119/025601c5cf6c$459cd1d0$4601a8c0@REDACTED)
for RedHat ES 4.0 and Fedora.

I hope it can be included in the next release.

Regards,

Serge

erlang-questions@REDACTED wrote:
> Hi all,
> Ran into a weird problem. I have an embedded application that is started with run_erl from a .sh script. I also use heart to restart the application. HEART_COMMAND is set to launch the same start.sh script that was used to start the application initially. At the start, the process tree looks as follows:
>
>  3196 ?        S      0:00 /home/drpdev/erts-5.4.10/bin/run_erl -daemon /home/drpdev/var/tmp/drp /home/drpdev/var/log/drp -exec /home/drpdev/bin/start_erl
>  3202 pts/2    Ssl+   0:02  \_ /home/drpdev/erts-5.4.10/bin/beam -- -root /home/drpdev -progname drip -- -home /home/drpdev -boot /home/drpdev/releases/1.
>  3222 ?        Ss     0:00      \_ heart -pid 3202
>  3227 ?        Ss     0:00      \_ inet_gethost 4
>  3228 ?        S      0:00      |   \_ inet_gethost 4
>  3229 ?        Ss     0:00      \_ sh -s disksup
>
> To test the restart, I kill pid 3202 and see the following:
>
>  3222 ?        Ss     0:00 heart -pid 3202
>  3196 ?        S      0:00 /home/drpdev/erts-5.4.10/bin/run_erl -daemon /home/drpdev/var/tmp/drp /home/drpdev/var/log/drp -exec /home/drpdev/bin/start_erl
>  3202 ?        Zs     0:02  \_ [beam]
>
>
> Next, heart launches the script:
>
>  3253 ?        S      0:00 /bin/bash /home/drpdev/bin/drip.sh start
>  3272 ?        S      0:00  \_ sleep 3
>  3196 ?        S      0:00 /home/drpdev/erts-5.4.10/bin/run_erl -daemon /home/drpdev/var/tmp/drp /home/drpdev/var/log/drp -exec /home/drpdev/bin/start_erl
>  3202 ?        Zs     0:02  \_ [beam]
>
> The sleep 3 is right before it calls the run_erl command to start the embedded application. Note that the old run_erl (pid 3196) is still hanging around although the node itself (pid 3202) is defunct.
>
> When drip.sh calls run_erl, the old run_erl (pid 3196) goes away, but no new run_erl process appears. The application is not started either. erlang.log.1 does not show... I see the following in the run_erl.log:
>
> -------
> Pty master read; run_erl [3196] Wed Jan 4 15:59:37 2006
> Pty master read; run_erl [3196] Wed Jan 4 16:00:46 2006
> Pty master read; run_erl [3196] Wed Jan 4 16:00:51 2006
> Pty master read; run_erl [3279] Wed Jan 4 16:00:54 2006
> /home/drpdev/erts-5.4.10/bin/run_erl: pid is : 3279
> run_erl [3196] Wed Jan 4 16:00:54 2006
> FIFO read; run_erl [3196] Wed Jan 4 16:00:54 2006
> OK
> run_erl [3196] Wed Jan 4 16:00:54 2006
> Pty master read; run_erl [3196] Wed Jan 4 16:00:54 2006
> Pty master read; run_erl [3196] Wed Jan 4 16:00:54 2006
> Pty master read; run_erl [3196] Wed Jan 4 16:00:54 2006
> Erlang closed the connection.
> -------
>
> I am curious why the new run_erl (pid 3279) process did not start. Also, why did the old run_erl (pid 3196) not terminate until the new run_erl attempted to start? I verified that this is not a coincidence - the old run_erl will remain hanging in the process list until a new run_erl is started.
>
> Please let me know if anyone else has experienced a similar issue. If needed I can provide additional info/config files, but I am not sure at this point which ones.
>
> Thank you.
> Dmitry Korsun
> IDT Corp.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: run_erl.patch
URL: 
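
Since run_erl.patch was scrubbed from the archive, here is a minimal sketch of the general technique Serge describes: a parent process that notices its child's death through SIGCHLD, reaps it, and can then exit instead of lingering beside a defunct child. This is not the actual patch and not run_erl code. The names install_sigchld_handler and supervise_child are hypothetical, and the forked child that exits immediately merely stands in for the Erlang node.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Flag set from the signal handler. sig_atomic_t is the type that is
 * safe to write in a handler and read in the main flow. */
static volatile sig_atomic_t child_died = 0;

static void on_sigchld(int signo)
{
    (void)signo;
    child_died = 1;
}

/* Hypothetical helper: install the SIGCHLD handler.
 * Returns 0 on success, -1 on error. */
static int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NOCLDSTOP; /* report exits only, not stop/continue */
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical stand-in for the supervising loop: fork a child that
 * exits with child_status, sleep until SIGCHLD arrives, reap the child
 * (so no zombie remains) and return its exit status, or -1 on error. */
int supervise_child(int child_status)
{
    sigset_t block, old, wait_mask;
    pid_t pid;
    int status = 0;

    child_died = 0;
    if (install_sigchld_handler() != 0)
        return -1;

    /* Block SIGCHLD before fork() so it cannot be delivered before
     * we are ready to wait for it. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;
    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);

    pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(child_status); /* child: stands in for the node dying */

    /* Atomically unblock SIGCHLD and sleep until a signal arrives. */
    while (!child_died)
        sigsuspend(&wait_mask);

    sigprocmask(SIG_SETMASK, &old, NULL);

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In a real supervising program the return from supervise_child would be the point at which the parent cleans up and exits, which is the behaviour the patch is described as adding to run_erl. Blocking SIGCHLD around fork() and waiting with sigsuspend() is one way to avoid missing a signal that arrives early; the patch itself may use a different approach.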