[erlang-questions] lists, binaries or something else?

CGS cgsmcmlxxv@REDACTED
Mon Jul 16 11:03:57 CEST 2012


Agree. Thanks a lot, Bob!

CGS




On Fri, Jul 13, 2012 at 6:43 PM, Bob Ippolito <bob@REDACTED> wrote:

> If the chunks are big enough you will benefit from sending them as binary
> since there won't be any copying. As others have already said, you probably
> will benefit most from either an iolist or all binary scheme.
>
> Benchmarking adding characters at a time isn't really equivalent to adding
> chunks at a time, especially for large chunk sizes. Of course, adding a
> chunk (or even a list if chunks) to either side of an iolist is O(1), so if
> you're mostly just building these things you're really better off leaving
> it like that as long as possible.
>
>
> On Friday, July 13, 2012, CGS wrote:
>
>> Yes, you are right. When I did the test, I thought also about that, so, I
>> made a third loop which should have appended the char to the binary string.
>> Unfortunately, I had a typo there and I ended up having the same loop for
>> binaries (and therefore the same numbers). Your message made me check it
>> again and I found that typo. I didn't find quite those ratios you got, but,
>> yes, they are comparable with your ratios. Thanks a lot.
>>
>>
>>
>> On Fri, Jul 13, 2012 at 4:51 PM, Anders Nygren <anders.nygren@REDACTED>wrote:
>>
>> For some cases appending to a binary is much faster, see
>> http://www.erlang.org/doc/efficiency_guide/binaryhandling.html#id65923
>>
>> So Your simple benchmark is not fair, You are comparing the best case
>> for lists with the worst case for binaries. If You change it to
>> appending instead of prepending You will get very different results.
>> So as one of the first replies said "it depends".
>>
>> On my machine:
>> For small N prepending to a list is ~2x faster than appending to a binary.
>> When N=1E6 appending to a binary is ~2x faster than prepending to a list.
>>
>> /Anders
>>
>>
>> On Fri, Jul 13, 2012 at 6:48 AM, CGS <cgsmcmlxxv@REDACTED> wrote:
>> > Hi Bob,
>> >
>> > On Fri, Jul 13, 2012 at 2:32 AM, Bob Ippolito <bob@REDACTED> wrote:
>> >>
>> >> Sorry, but that's not enough information.
>> >
>> >
>> > Sorry, but I didn't know what you meant by that. Here are the answers.
>> >
>> >>
>> >>
>> >> Where do these chunks come from: source code, some other process, ets?
>> >
>> >
>> > Mostly from other processes.
>> >
>> >>
>> >> How many chunks are in a string?
>> >
>> >
>> > That is computed from the string size and chunk size (to be decided
>> later).
>> >
>> >>
>> >> Do you compose strings out of other strings, or just these chunks?
>> >
>> >
>> > Just chunks inserted into existing/new strings.
>> >
>> >>
>> >> Are you constructing them from tail to head like you would a list?
>> >
>> >
>> > Unfortunately, not all the time.
>> >
>> >>
>> >> Is the string constructed all at once, or over some time?
>> >
>> >
>> > If you mean by that the string will be fully given in the same message
>> or
>> > whatever by other processes, the answer is no. Over some time may be the
>> > answer, but with the remark that I have no idea what means "over some
>> time"
>> > as period of time (can get chunks one after the other for the same
>> string or
>> > for different strings).
>> >
>> >>
>> >>
>> >> It sounds like you have some code that's doing something with these
>> >> strings, because you say that it's faster to use lists than binaries.
>> Maybe
>> >> if you post whatever it is you used to determine that, someone can
>> help you
>> >> find a better algorithm and/or data structure for that benchmark.
>> >
>> >
>> > That is simple. For benchmark I just took two simple loops (only for
>> > insertion):
>> >
>> > -export([loop_list/1, loop_binary/2]).
>> >
>> > loop_list(0) -> [];
>> > loop_list(N) -> [107 | loop_list(N)].
>> >
>> > loop_binary(0,B) -> B;
>> > loop_binary(N,B) -> loop_binary(N-1,<<107,B/binary>>).
>> >
>> > If you go in powers of tens and measure the execution time, you will
>> see the
>> > difference (you will also notice the drop in efficiency for binaries
>> when
>> > you need fragmentation for the same string, which is not visible in the
>> case
>> > of lists - or at least I couldn't notice). That is not a fully
>> conclusive
>> > test for my case, but it is quite a conclusive test for processing
>> speed in
>> > the two cases (simple expansion of the string by adding a char at the
>> > beginning of the string), in which, for a 10^5 chars string, the list
>> gains
>> > at least one order of magnitude in processing time than its binary
>> > counterpart (1) (2) (3).
>> >
>> > (1) Even if you change the list loop to have an accumulator in which the
>> > list exists, there is still one order of magnitude difference.
>> > (2) The results are specific for the machine on which the tests are
>> done.
>> > (3) That is just an example.
>> >
>> > CGS
>> >
>> >
>> >>
>> >>
>> >>
>> >> On Thursday, July 12, 2012, CGS wrote:
>> >>>
>> >>> Unfortunately, the project is still in the planning stage, so, no real
>> >>> code was written yet. Nevertheless, I plan some open source projects
>> for
>> >>> some parts of the project.
>> >>>
>> >>> About each string, it is constructed from chunks of fixe
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20120716/856cb3f6/attachment.htm>


More information about the erlang-questions mailing list