[erlang-questions] lists, binaries or something else?

Fri Jul 13 18:43:25 CEST 2012

If the chunks are big enough you will benefit from sending them as binary
since there won't be any copying. As others have already said, you probably
will benefit most from either an iolist or all binary scheme.

Benchmarking adding characters at a time isn't really equivalent to adding
chunks at a time, especially for large chunk sizes. Of course, adding a
chunk (or even a list if chunks) to either side of an iolist is O(1), so if
you're mostly just building these things you're really better off leaving
it like that as long as possible.

On Friday, July 13, 2012, CGS wrote:

> Yes, you are right. When I did the test, I thought also about that, so, I
> made a third loop which should have appended the char to the binary string.
> Unfortunately, I had a typo there and I ended up having the same loop for
> binaries (and therefore the same numbers). Your message made me check it
> again and I found that typo. I didn't find quite those ratios you got, but,
> yes, they are comparable with your ratios. Thanks a lot.
>
>
>
> On Fri, Jul 13, 2012 at 4:51 PM, Anders Nygren <anders.nygren@REDACTED>wrote:
>
> For some cases appending to a binary is much faster, see
> http://www.erlang.org/doc/efficiency_guide/binaryhandling.html#id65923
>
> So Your simple benchmark is not fair, You are comparing the best case
> for lists with the worst case for binaries. If You change it to
> appending instead of prepending You will get very different results.
> So as one of the first replies said "it depends".
>
> On my machine:
> For small N prepending to a list is ~2x faster than appending to a binary.
> When N=1E6 appending to a binary is ~2x faster than prepending to a list.
>
> /Anders
>
>
> On Fri, Jul 13, 2012 at 6:48 AM, CGS <cgsmcmlxxv@REDACTED> wrote:
> > Hi Bob,
> >
> > On Fri, Jul 13, 2012 at 2:32 AM, Bob Ippolito <bob@REDACTED> wrote:
> >>
> >> Sorry, but that's not enough information.
> >
> >
> > Sorry, but I didn't know what you meant by that. Here are the answers.
> >
> >>
> >>
> >> Where do these chunks come from: source code, some other process, ets?
> >
> >
> > Mostly from other processes.
> >
> >>
> >> How many chunks are in a string?
> >
> >
> > That is computed from the string size and chunk size (to be decided
> later).
> >
> >>
> >> Do you compose strings out of other strings, or just these chunks?
> >
> >
> > Just chunks inserted into existing/new strings.
> >
> >>
> >> Are you constructing them from tail to head like you would a list?
> >
> >
> > Unfortunately, not all the time.
> >
> >>
> >> Is the string constructed all at once, or over some time?
> >
> >
> > If you mean by that the string will be fully given in the same message or
> > whatever by other processes, the answer is no. Over some time may be the
> > answer, but with the remark that I have no idea what means "over some
> time"
> > as period of time (can get chunks one after the other for the same
> string or
> > for different strings).
> >
> >>
> >>
> >> It sounds like you have some code that's doing something with these
> >> strings, because you say that it's faster to use lists than binaries.
> Maybe
> >> if you post whatever it is you used to determine that, someone can help
> you
> >> find a better algorithm and/or data structure for that benchmark.
> >
> >
> > That is simple. For benchmark I just took two simple loops (only for
> > insertion):
> >
> > -export([loop_list/1, loop_binary/2]).
> >
> > loop_list(0) -> [];
> > loop_list(N) -> [107 | loop_list(N)].
> >
> > loop_binary(0,B) -> B;
> > loop_binary(N,B) -> loop_binary(N-1,<<107,B/binary>>).
> >
> > If you go in powers of tens and measure the execution time, you will see
> the
> > difference (you will also notice the drop in efficiency for binaries when
> > you need fragmentation for the same string, which is not visible in the
> case
> > of lists - or at least I couldn't notice). That is not a fully conclusive
> > test for my case, but it is quite a conclusive test for processing speed
> in
> > the two cases (simple expansion of the string by adding a char at the
> > beginning of the string), in which, for a 10^5 chars string, the list
> gains
> > at least one order of magnitude in processing time than its binary
> > counterpart (1) (2) (3).
> >
> > (1) Even if you change the list loop to have an accumulator in which the
> > list exists, there is still one order of magnitude difference.
> > (2) The results are specific for the machine on which the tests are done.
> > (3) That is just an example.
> >
> > CGS
> >
> >
> >>
> >>
> >>
> >> On Thursday, July 12, 2012, CGS wrote:
> >>>
> >>> Unfortunately, the project is still in the planning stage, so, no real
> >>> code was written yet. Nevertheless, I plan some open source projects
> for
> >>> some parts of the project.
> >>>
> >>> About each string, it is constructed from chunks of fixe
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20120713/e3f3bbfc/attachment.htm>