[erlang-questions] Re: Shared/Hybrid Heap

Zoltan Lajos Kis kiszl@REDACTED
Fri Nov 5 19:14:11 CET 2010


While a shared/hybrid heap on a global VM level does involve locking, 
pauses, etc.,
wouldn't a scheduler-level shared heap be free of these problems?

I do not have much experience with the VM internals, but I suppose that a 
running process can
manipulate its scheduler's heap for free (i.e. without locking). For 
example when sending a message
it could simply move the data structure from its own heap to the 
scheduler's, and only pass on the
reference. Also, the garbage collector would be "just another process" 
in this case, so I don't see why
it would affect the real-timeness of the VM.
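
To make the idea a bit more concrete, here is a rough toy model in Python.
It is only a sketch and not how the Erlang VM works: the names Scheduler,
Process and send are made up for illustration. It assumes that two processes
on the same scheduler can hand over a reference into that scheduler's heap
without locking, while a message to a process on another scheduler is still
copied.

```python
import copy


class Scheduler:
    """Toy scheduler owning a heap shared by the processes it runs (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.heap = {}       # ref -> term; assumed to be touched only by this scheduler
        self._next_ref = 0

    def alloc(self, term):
        """Place a term on the scheduler heap and return a reference to it."""
        ref = (self.name, self._next_ref)
        self._next_ref += 1
        self.heap[ref] = term
        return ref


class Process:
    """Toy process: it knows its scheduler and has a mailbox."""

    def __init__(self, scheduler):
        self.scheduler = scheduler
        self.mailbox = []


def send(sender, receiver, term):
    """Deliver term to receiver and report how it was delivered."""
    if sender.scheduler is receiver.scheduler:
        # Same scheduler: move the term to the scheduler heap, pass only the reference.
        ref = sender.scheduler.alloc(term)
        receiver.mailbox.append(("ref", ref))
        return "shared"
    # Different scheduler: fall back to copying, as with per-process heaps.
    receiver.mailbox.append(("copy", copy.deepcopy(term)))
    return "copied"


s1, s2 = Scheduler("s1"), Scheduler("s2")
a, b, c = Process(s1), Process(s1), Process(s2)
msg = [1, 2, 3]
print(send(a, b, msg))   # same scheduler
print(send(a, c, msg))   # different schedulers
```

Whether the same-scheduler case can really skip locking depends on the
scheduler being the only thread that ever touches its heap, which is exactly
the assumption I am unsure about.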

I would also imagine that the global shared heap is easier to implement 
in a two-level fashion,
where processes share their scheduler's heap, and schedulers share the 
global shared heap.
Unfortunately, I have even less experience with shared heaps and garbage 
collectors than with the VM itself.

Anyway, I think this would fit well with the current spawning and migration 
procedures, and I hope this message
was more than just noise :o).

Regards,
Zoltan.


On 11/4/2010 1:50 AM, Robert Virding wrote:
> This question is much more complex than you would at first realise. Some thoughts (in no specific order):
>
> 1. Shared/hybrid heaps are wonderful BUT:
> - They are more difficult to implement in a real-time environment. Yes, this is a known problem with solutions but the benefits are less than you would expect. Real-time gc costs.
> - They don't port easily to multi-core systems as you suddenly start needing locks and stuff everywhere to control access or have complex schemes to get around this. Which copying doesn't.
>
> For some problems shared heaps are much better, for others much worse.
>
> 2. I would assert that the operations on immutable data structures cost less than you imagine.
>
> 3. From the way you describe it, having data structures which are both mutable and immutable would be very confusing. You would need to have two sets of operations on them and every application would have to keep track of when to use them.
>
> 4. Mutable data structures give you all the things you want to avoid with global data, especially that they can change under your feet without you knowing about it. Yes, you may have to pass a reference around but it still means that I don't know who can access my bit of data and can change it. Yes, you can stipulate that the program should be well-behaved, but we all know how well that works in real-life in a big system.
>
> So while I have said that you don't *need* immutable data to get process isolation, instead you can specify data copying between processes, this does not mean that I recommend having mutable data. Immutable data is so much easier to work with and keep track of.
>
> Robert


