[erlang-questions] Why Beam.smp crashes when memory is over?

Tony Rogvall tony@REDACTED
Mon Nov 9 15:38:02 CET 2009


Hi!

On 9 nov 2009, at 14.54, Angel Alvarez wrote:

> Well, there are still many issues with this new approach.
>
Yes!  But it does not scare me ;-)

> Where are the mailboxes of the processes located?
>
> With a heap per process...

That depends on the implementation. In general you could do something
like this: if the data is shared, you split the cost among the sharers
(memory_size/ref_count); if the data is copied, you must count it in
full.
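
A minimal sketch of that accounting rule in Erlang. The module name
mem_account, the function charge/1 and the tuple shapes are made up for
illustration and are not taken from the prototype; rounding down with
integer division is an assumption as well.

```erlang
-module(mem_account).
-export([charge/1]).

%% Hypothetical accounting rule (illustration only, not prototype code).
%% Sizes are in words.

%% Shared data: each of the RefCount sharers is charged an equal part.
charge({shared, Size, RefCount}) when is_integer(Size), Size >= 0,
                                      is_integer(RefCount), RefCount > 0 ->
    Size div RefCount;
%% Copied data: the whole copy is charged.
charge({copied, Size}) when is_integer(Size), Size >= 0 ->
    Size.
```

With four processes sharing a 1000-word term, each would be charged 250
words, while a copied 1000-word message would be charged as the full
1000 words.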

>
> Couldn't you trigger a memory exception in a remote process just by
> sending it one message when it has almost used up its reserved memory?
>
Yes, but that is the point: if it passes the limit, the process will die.
There are many special cases where you could think of using the memory
more efficiently. Say you are approaching the memory limit: you might
switch to a compression algorithm for heap memory!?
But let's keep it simple for the prototype and see if it is useful.
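
The "pass the limit and die" behaviour can be tried on a stock VM,
although not with the prototype's options. The sketch below uses the
max_heap_size option that spawn_opt accepts in OTP 19 and later. It is
a different mechanism from the max_memory limit discussed in this
thread, since it applies only to the one process it is set on. The
module name heap_limit and the 100000-word limit are arbitrary choices
for the example.

```erlang
-module(heap_limit).
-export([run/0]).

%% Spawn a process with a heap cap and return its exit reason.
%% Uses the standard max_heap_size option (OTP 19+), not the
%% prototype's max_memory option.
run() ->
    Opts = [monitor,
            {max_heap_size, #{size => 100000,        % limit, in words
                              kill => true,          % kill when exceeded
                              error_logger => false  % no log report
                             }}],
    {Pid, Ref} = spawn_opt(fun() -> grow([]) end, Opts),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    end.

%% Keep consing so the heap grows until the limit is hit.
grow(Acc) ->
    grow([0 | Acc]).
```

Calling heap_limit:run() should return killed once the growing list
pushes the process past its limit; the caller only sees the monitor
message and is otherwise unaffected.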


> Systems other than embedded Erlang deployments (given the current
> movement of Erlang towards server/desktop platforms) will suffer from
> resource contention between the Erlang VM and other OS processes.
>
> Port programs also need system resources...
>
For loadable drivers using driver_alloc, one could possibly do
something; otherwise it will be up to the driver designer to handle it.
There is a max_ports limit in the prototype that restricts the number of
open ports. If sockets/files are each mapped to a single port, it may
help a bit.


> Well, in the end your approach is still very interesting as a
> framework for continuous Erlang VM innovation...
>
Thanks.

> But please correct me if I'm wrong: I saw that the memory carriers
> allow several options to be set at Erlang VM start-up, so
>
I am not sure what you mean here?

/Tony

> is it still possible to patch those carriers to allow a safe memory
> reservation, so that the VM can properly manage a memory-full
> condition by killing the offending process (a sort of OOM killer for
> the VM)?
>
> Just telling the VM not to "kill system process" and letting the
> supervisors do the work...
>



> /Angel
>
>
>
> On Monday, 9 November 2009, Tony Rogvall wrote:
>>
>> It is still the same "let it crash" concept with the resource limit
>> system I am designing, but you can limit the crash in a more
>> controlled way. You will also be able to report interesting
>> information about what is crashing and when.
>>
>> There is sometimes an issue when big systems crash: the restart may
>> take a lot of time. Nodes must be synchronised, database tables must
>> be repaired, and so on. I guess you can design this to be light and
>> easy, but that is not always the case.
>>
>> /Tony
>>
>>
>>
>> On 9 nov 2009, at 12.43, Angel Alvarez wrote:
>>
>>>
>>> Well, please let me say something.
>>>
>>> I'm brand new, but some things are pretty clear to me.
>>>
>>>
>>> The beauty of the Erlang concept is "let it crash", "don't program
>>> defensively", so the VM and the underlying hardware are entities
>>> that can fail, and that's it.
>>>
>>> So what's the problem?
>>>
>>> Joe said...
>>>
>>> If you want failure tolerance you need at least two nodes...
>>>
>>> From J.A.'s thesis:
>>>  " ...Schneider [60, 59] answered this question by giving three
>>> properties that he thought a hardware system should have in order to
>>> be suitable for programming a fault-tolerant system. These
>>> properties Schneider called:..."
>>>
>>> 1. Halt on failure: in the event of an error a processor should
>>>     halt instead of performing a possibly erroneous operation.
>>>
>>> So on memory exhaustion the VM has to die, and another (Erlang)
>>> node will do the recovery.
>>>
>>> That's the role of distribution: not only to spread computations
>>> over several nodes to enhance performance, but to provide resilience
>>> in the presence of fatal (non-correctable) errors.
>>>
>>> As an OS process, the VM has to compete with other OS processes, so
>>> in a shared deployment (a VM running on a server or a desktop) you
>>> can't be safe against an OOM triggered by other entities.
>>>
>>> Such a resource control mechanism will only add process overhead and
>>> context switching in the VM.
>>>
>>> People new to Erlang will be attracted to this hierarchical
>>> decomposition of tasks, as Joe stated in his thesis:
>>> "If you can't run a certain task, try doing something simpler"
>>>
>>> Many languages and VMs are incorporating Erlang's good "multicore"
>>> features but not Erlang's powerful error-handling concept, and you
>>> guys want to kill the simplicity by incorporating many defensive
>>> capabilities to avoid fatality instead of just organizing code to
>>> handle such fatality.
>>>
>>> What's next? A maximum message queue limit for mailboxes?
>>>
>>>
>>> Well, that's all I have to say about that. (Forrest Gump)
>>>
>>>
>>> On Monday, 9 November 2009, 09:45:10, Tony Rogvall wrote:
>>>> Interesting discussion!
>>>>
>>>> I have been working on a resource system for Erlang for nearly two
>>>> years now.
>>>> I have a working (tm) prototype where you can set resource limits
>>>> like
>>>> max_processes/max_ports/max_memory/max_time/max_reductions ...
>>>> The limits are passed with spawn_opt and are inherited by the
>>>> processes spawned.
>>>> This means that if you call
>>>> spawn_opt(M, F, A, [{max_memory, 1024*1024}]), the process will be
>>>> able to use 1M words for itself and its "subprocesses". This also
>>>> means that the spawner will get 1M words less to use (as designed
>>>> right now). If a resource limit is reached, the process crashes
>>>> with reason system_limit.
>>>>
>>>> There are still some details to work out before a release, but I
>>>> will try to get it ready before the end of this year.
>>>>
>>>> /Tony
>>>>
>>>>
>>>>
>>>> On 9 nov 2009, at 09.16, Robert Virding wrote:
>>>>
>>>>> No.
>>>>>
>>>>> There is a major difference between handling OOM in an OS and in
>>>>> the BEAM. In an OS it is usually at a per-process level that
>>>>> memory runs out, so it is easy to decide which process to kill so
>>>>> that the OS can continue. In the BEAM, however, it is the VM as a
>>>>> whole which has run out of memory, not a specific process. It is,
>>>>> therefore, much more difficult to work out which process is the
>>>>> culprit and to decide what to do. For example, it might be that
>>>>> the process which causes the OOM is not the actual problem
>>>>> process; it might just be the last straw. Or the actual cause may
>>>>> be that the whole app is generating large binaries too quickly.
>>>>> Or it might be that the whole app is spawning too many processes
>>>>> without any one process being the cause. Or ...
>>>>> In all these cases killing the process which triggered the OOM
>>>>> would be the Wrong Thing. We found that it was difficult to work
>>>>> out a reasonable strategy to handle the actual cause, so we
>>>>> decided not to handle it.
>>>>>
>>>>> "Don't catch an error which you can't handle" as the bard put it.
>>>>>
>>>>> Robert
>>>>>
>>>>> 2009/11/9 Max Lapshin <max.lapshin@REDACTED>
>>>>>
>>>>>> Yes, there are techniques to write watchdogs, but my question
>>>>>> was: is it possible to prevent the Erlang VM from crashing?
>>>>>>
>>>>>> ________________________________________________________________
>>>>>> erlang-questions mailing list. See http://www.erlang.org/faq.html
>>>>>> erlang-questions (at) erlang.org
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> This email has no pictures. The strange shapes on the screen are
>>> letters.
>>> __________________________________________
>>>
>>> Clist UAH a.k.a Angel
>>> __________________________________________
>>> Warning: Microsoft_bribery.ISO contains OOXML code.
>>>
>>>
>>
>>
>
>
>


