[erlang-questions] Re: data sharing is outside the semantics of Erlang, but it sure is useful

Robert Virding rvirding@REDACTED
Mon Sep 14 23:06:29 CEST 2009


Ah OK, now I understand you. I would class that as an unnecessary
misfeature: file a bug report and ask them to change it. (Unnecessary, as
there is no real need to put in the new element.)

I can just point out that neither sets, ordsets nor rbsets do that, they all
leave the original element in. :-)
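
For anyone who wants to try the sharing trick described in the quoted
message below, here is a minimal sketch. It is not James's actual code: it
assumes gb_trees as the binary tree, stores each string as both key and
value so that a lookup hands back the copy already in the tree, and the
names share_strings, share/1 and intern/2 are made up for illustration.

```erlang
-module(share_strings).
-export([share/1]).

%% Rebuild List so that equal elements all refer to a single copy.
%% The length and order of the list are unchanged.
share(List) ->
    {Shared, _Tree} = lists:mapfoldl(fun intern/2, gb_trees:empty(), List),
    Shared.

%% Return the copy of Elem already stored in Tree if there is one,
%% otherwise store Elem itself (as key and value) and return it.
intern(Elem, Tree) ->
    case gb_trees:lookup(Elem, Tree) of
        {value, Existing} ->
            {Existing, Tree};
        none ->
            {Elem, gb_trees:insert(Elem, Elem, Tree)}
    end.
```

The result compares equal to the input list; only the heap layout differs.
Comparing erts_debug:size/1 (which counts shared subterms once) before and
after should show the saving, though that function is an undocumented
debugging aid. As noted in the thread, the sharing only holds inside one
process and relies on how the runtime works, not on Erlang semantics.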

Robert

2009/9/14 James Hague <james.hague@REDACTED>

> > I am missing something here. gb_sets (nor sets, ordsets, rbsets) does not
> > make a copy of the data which is put into the set. All that is copied is
> > enough of the *tree* to insert the new element. There is no need to copy
> > the new data as it is kept within the same process. Only ets makes a copy
> > of the data.
>
> Let's say you've got a long list of strings, many of them duplicates.
> You don't just want to remove the duplicates because that will change
> the length of the list. The goal is to ensure that identical strings
> are shared, so there's only one copy in memory.  What's a practical
> way of doing that?
>
> This is irrelevant most of the time, but there are some situations
> where it's a huge win.
>
> (My solution was to build a new list by adding each element to a
> binary tree.  If a string is already in the tree, return the version
> that's already there (which is not something that gb_sets does).  In
> the resulting list, elements are shared as much as possible. I'm
> clearly taking advantage of how the runtime works, but it shrunk the
> heap size by tens of megabytes.)