[erlang-questions] no next heap size found: 18446744071896091830, offset 0

Gustav Simonsson gustav.simonsson@REDACTED
Thu Jan 5 11:46:04 CET 2012


I'm not sure how relevant this is, since it was caused by a problem that was
fixed in the R14B release and you are running R14B03, but I thought it might be
useful to post it anyway for reference.

It occurred on R13B04 on 32-bit Solaris 9 when an Erlang system ran out of disk space
and mnesia could not write to logs.

The crash was due to a bug in the virtual heap size calculations when a
garbage collection cycle was triggered for binary values.
A fix for this problem was included in R14B:
OTP-8730 "Reduce the risk of integer wrapping in bin vheap size counting."
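
As a rough illustration of the wrapping involved (this is not the ERTS code,
and the helper names as_signed and as_unsigned are made up for this example),
the two slogan values can be reinterpreted with plain two's-complement
arithmetic. The 64-bit value from the subject line turns out to be a negative
number printed as unsigned, and the negative 32-bit value from the dump below
corresponds to a very large unsigned size:

```python
def as_signed(value: int, bits: int) -> int:
    """Reinterpret an unsigned integer of the given width as two's-complement signed."""
    value &= (1 << bits) - 1
    # If the top bit is set, the value is negative in two's complement.
    return value - (1 << bits) if value >> (bits - 1) else value


def as_unsigned(value: int, bits: int) -> int:
    """Reinterpret a signed integer as an unsigned integer of the given width."""
    return value & ((1 << bits) - 1)


# Slogan values copied from the two crash dumps in this thread.
SLOGAN_64 = 18446744071896091830   # R14B03, 64-bit
SLOGAN_32 = -12451840              # R13B04, 32-bit

signed_64 = as_signed(SLOGAN_64, 64)
unsigned_32 = as_unsigned(SLOGAN_32, 32)

print(signed_64)     # -1813459786
print(unsigned_32)   # 4282515456
```

Either way the requested size is nonsensical, which would fit a counter that
wrapped before the next heap size was looked up. That reading is an
assumption based on the numbers alone.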

Wed Sep 21 15:57:37 2011
Slogan: no next heap size found: -12451840, offset 0
System version: Erlang R13B04 (erts-5.7.5) [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]
Compiled: Tue Feb 23 02:14:48 2010
Atoms: 14907
total: 68052336
processes: 6551620
processes_used: 6548052
system: 61500716
atom: 660801
atom_used: 650690
binary: 46042376
code: 3280578
ets: 10677116

Garbing process:

State: Garbing
Spawned as: proc_lib:init_p/5
Last scheduled in for: dets:foldl_bins/2
Spawned by: <0.64.0>
Started: Wed Sep 21 15:57:37 2011
Message queue length: 0
Number of heap fragments: 1
Heap fragment data: 398
Link list: [#Port<0.2027>, <0.65.0>, <0.64.0>, {from,<0.146.0>,#Ref<>}]
Reductions: 16937
Stack+heap: 196418
OldHeap: 196418
Heap unused: 80
OldHeap unused: 196418
Program counter: 0x00722330 (dets:foldl_bins/2 + 4)
CP: 0x00000000 (invalid)

Gustav Simonsson

Sent from my PC

----- Original Message -----
From: "Scott Lystig Fritchie" <fritchie@REDACTED>
To: erlang-questions@REDACTED
Sent: Thursday, 5 January, 2012 4:23:38 AM
Subject: [erlang-questions] no next heap size found: 18446744071896091830, offset 0

Hi.  This was a fun one coming from an OpenSolaris 64-bit box running
Erlang/OTP R14B03.  Searching my archives of the erlang-questions and
erlang-bugs mailing lists for the 'no next heap size' string turned up
no match ... except for a formatting bug where the heap size is
reported as negative.

    [root@REDACTED /var/log/riak]# head -20 erl_crash.dump.slf.0
    Thu Jan  5 01:03:52 2012
    Slogan: no next heap size found: 18446744071896091830, offset 0
    System version: Erlang R14B03 (erts-5.8.4) [source] [64-bit] [smp:16:16] [rq:16] [async-threads:571] [kernel-poll:true]
    Compiled: Mon Jul 25 18:05:12 2011
    Taints: eleveldb,crypto,bitcask_nifs
    Atoms: 14581
    total: 34190115856
    processes: 33980165896
    processes_used: 33980126736
    system: 209949960
    atom: 986689
    atom_used: 982563
    binary: 154769656
    code: 9219672
    ets: 712592
    size: 9643
    used: 7479

Riak was running at the time, and a zillion error messages were
generated & sent to the error logger by gen_fsm and handled by Andrew
Thompson's "lager" application.  The 'lager_crash_log' process ended up
with 31K messages in its Erlang mailbox, which is quite a lot ... and
the app was generating a few thousand error messages per second, which
meant that we were probably going to run out of memory anyway due to
Erlang mailbox growth.

However, having a memory allocation fail for the reason shown above
isn't good.  Has anyone else seen this error on R14B03?  Or later?


P.S. Here's the scoop on the memory hog proc:

    State: Garbing
    Name: lager_crash_log
    Spawned as: proc_lib:init_p/5
    Last scheduled in for: io_lib_format:iolist_to_chars/1
    Spawned by: <0.40.0>
    Started: Fri Dec 30 00:29:38 2011
    Message queue length: 31694
    Number of heap fragments: 0
    Heap fragment data: 0
    Link list: [#Port<0.33849890>, <0.40.0>]
    Reductions: 8749820516
    Stack+heap: 4148490785
    OldHeap: 0
    Heap unused: 3072286602
    OldHeap unused: 0
    Program counter: 0x000000000386f1c0 (io_lib_format:iolist_to_chars/1 + 8)
    CP: 0x0000000000000000 (invalid)
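
As a sanity check on the figures above, here is a small sketch that converts
the heap sizes to bytes. It assumes the crash dump reports Stack+heap and Heap
unused in machine words and that a word is 8 bytes on this 64-bit emulator;
the constant names are only for this example.

```python
WORD_SIZE = 8  # bytes per word, assuming a 64-bit emulator

# Values copied from the lager_crash_log process entry and the memory summary.
STACK_HEAP_WORDS = 4148490785
HEAP_UNUSED_WORDS = 3072286602
PROCESSES_BYTES = 33980165896   # "processes:" line of the dump

heap_bytes = STACK_HEAP_WORDS * WORD_SIZE
used_bytes = (STACK_HEAP_WORDS - HEAP_UNUSED_WORDS) * WORD_SIZE

print(heap_bytes)                           # 33187926280
print(used_bytes)                           # 8609633464
print(round(heap_bytes / 2**30, 1))         # 30.9 (GiB)
print(heap_bytes / PROCESSES_BYTES)         # share of all process memory
```

Under those assumptions this one process holds roughly 30.9 GiB of heap, which
is nearly all of the memory reported for processes, and most of that heap is
unused.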