[erlang-questions] Cost of Copy

Brentley Jones the.ajarn@REDACTED
Sun May 9 17:05:18 CEST 2010


I have no answers, but I too would love to see what people say.

- Brentley Jones

On May 8, 2010, at 2:43 AM, Henning Diedrich <hd2010@REDACTED>  
wrote:

> 1
>
> I am trying to get a better picture of what the performance costs  
> are when mutating (collections of) complex structures.
>
> Most generally, if I have a list of structures and want to make one  
> change to one of the structures; and if I want to avoid ETS for the  
> implicit copying, what else should I NOT do to not run into the same  
> problem with a non-ETS approach, and how CAN I have as little  
> unnecessary internal copying of all those things that stay the same.
>
> I just ran into this post by J Louis while looking for information
>
> http://jlouisramblings.blogspot.com/2009/01/common-erlang-misconceptions.html
>
> It was giving some implicit answers about what results in internal  
> copying, what doesn't. And I am not talking about strings here.
>
> I wondered if there is a dedicated write up out there that explains  
> a little more to the point what the use of lists, sets, trees, ets  
> and mnesia implies in terms of copying, to better understand the  
> performance implications. With immutable variables, of course there  
> is always the hope that underneath, not much copying is actually  
> necessary, as I understand e.g. lists are benefiting hugely from, to  
> become O(1) when adding heads to them. So assumptions based on ALGOL  
> heritage experiences should fail. Which puts one on a quest like Max  
> Lapshin's  http://erlang.2086793.n4.nabble.com/Help-with-storing-data-in-memory-td2119208.html#a2119208 
>  who is also really looking for a way to avoid implicit copying.
>
> I have been looking for clues in this direction all the time but  
> found them rather scarce in the docs. May be me. May be an idea to  
> put more focus on in erldocs, erlang.org etc? I am sure there are  
> links. Maybe they could be featured more prominently. I am sure  
> there are many people coming to Erlang with a deeply ingrained eye  
> on performance and would like to understand these implications  
> earlier in the learning experience.
>
> I mean things exactly like
>
> "changing status from x to y would copy the record, BUT the actual  
> record structure is only a set of pointers." by Ulf Wiger in his post
>
> http://www.erlang.org/cgi-bin/ezmlm-cgi/4/1342 (var names changed)
>
> 2
>
> From that post I understand that the best performance should come  
> from a data structure that is as nested as possible, because that  
> will reduce the amount of pointers that are copied with every  
> mutation?!
>
> As the mutation copies not-changed data only on one level. That's a  
> very interesting implication that so far I haven't found spelled out  
> anywhere (but would have liked to learn up front, if true). I am  
> still not sure if it would hold across the board. Or is it wrong for  
> other reasons I am not aware of? (Or could this post maybe for any  
> strange reason be ... outdated? It's a decade old in June. Wow!)
>
> 3
>
> Surveying the choices, does this imply I am to use a copy-heavy  
> approach if I want to have meaningful transaction support between  
> processes? Because the choice there is ets, dets or mnesia and they  
> all copy the entire structure everytime there is any access to it?
>
> 4
>
> I would encourage a view that these kind of questions would be best  
> discussed quite early in the learning curve. Together with the  
> implications that immutable state has for message passing and how it  
> saves so much copying in that case. Or so I speculate. I am not  
> speaking about the error prevention aspects here.
>
> That said, thanks for all the efforts that are going into making the  
> Erlang docs ever better! Even Google today adopted the classic  
> format of the Erlang doc frame layout I realize. But seriously,  
> thanks!
>
> Henning
>
>


More information about the erlang-questions mailing list