[erlang-questions] Performance question
Stu Bailey
stu.bailey@REDACTED
Fri Nov 7 18:50:31 CET 2014
Thank you for the feedback. That's very helpful.
On Fri, Nov 7, 2014 at 9:44 AM, Loïc Hoguin <essen@REDACTED> wrote:
> Based on the code at
>
> https://github.com/erlang/otp/blob/maint/lib/stdlib/src/binary.erl#L268
>
> It does a lot of splitting, and then a lot more splitting, and then calls
> iolist_to_binary. It looks very inefficient.
>
> Your solution is the fastest way to do it. You also benefit from match
> context optimization and so your code is very fast. The only thing that
> could make it faster is if memory were allocated only once for the resulting
> binary (instead of being reallocated a few times)... but maybe there's
> already an optimization like this?
>
> On 11/07/2014 07:33 PM, Stu Bailey wrote:
>
>> FYI, if you want to try to replicate it, I was processing ~80 chunks of
>> binary where each chunk was ~250,000,000 bytes. I think you'll
>> see the difference on just one chunk. I happen to be running on an 8-core
>> MacBook Pro with 16 GB RAM and therefore spawned a process per chunk to
>> grab all the resources on all the cores. With the hand-written
>> function, it worked like a charm...yay Erlang! :-) I love seeing a few
>> lines of code effectively use all processing power available. Heats the
>> machine up quite a bit too. :-)
>>
>> On Fri, Nov 7, 2014 at 9:22 AM, Stu Bailey <stu.bailey@REDACTED> wrote:
>>
>> I'm not planning to spend a lot of time on this right now, but
>> binary:replace(...) was chewing up a tremendous amount of system-time
>> CPU load (and actually never finished before I got frustrated and
>> killed it), whereas my function reported the CPU load as 99% user
>> time (not system time) and finished in a reasonable time. I assume
>> the high system time usage for binary:replace(...) is because
>> binary:replace(...) is doing something manic with system calls for
>> memory management or something?
>>
>>
>> On Fri, Nov 7, 2014 at 1:44 AM, Loïc Hoguin <essen@REDACTED> wrote:
>>
>> binary:split and binary:replace, unlike other functions of the
>> binary module, are normal Erlang functions. They also process a
>> list of options before doing the actual work, so there's an
>> obvious overhead compared to not doing that. In addition, as has
>> been pointed out, your code is more specialized, so that helps too.
>>
>> On 11/07/2014 03:33 AM, Stu Bailey wrote:
>>
>> I found
>>
>>     binary:replace(BinChunk,<<"\n">>,<<>>,[global]).
>>
>> significantly slower than
>>
>>     remove_pattern(BinChunk,<<>>,<<"\n">>).
>>
>> with
>>
>>     remove_pattern(<<>>,Acc,_BinPat) ->
>>         Acc;
>>     remove_pattern(Bin,Acc,BinPat) ->
>>         <<Byte:1/binary,Rest/binary>> = Bin,
>>         case Byte == BinPat of
>>             true -> remove_pattern(Rest,Acc,BinPat);
>>             false ->
>>                 remove_pattern(Rest,<<Acc/binary,Byte/binary>>,BinPat)
>>         end.
>>
>> That was surprising to me. The built-in binary:replace() was much,
>> much slower for larger BinChunk with lots of <<"\n">> sprinkled
>> through.
>>
>> Thoughts?
>>
>>
>> --
>> Loïc Hoguin
>> http://ninenines.eu
>>
>>
>>
>>
> --
> Loïc Hoguin
> http://ninenines.eu
>
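For reference, here is a minimal sketch of the byte-by-byte approach discussed in this thread, with the match moved into the function head. The module name strip_bytes and the function remove_byte/2 are made up for illustration and are not part of OTP. It only handles single-byte patterns. Whether it actually benefits from the match context optimization on a given OTP release is an assumption; compiling with the bin_opt_info option should report it.

```erlang
-module(strip_bytes).
-export([remove_byte/2]).

%% Remove every occurrence of the single byte Byte (an integer 0..255,
%% e.g. $\n) from Bin. Hypothetical helper, not part of OTP.
-spec remove_byte(binary(), byte()) -> binary().
remove_byte(Bin, Byte) when is_binary(Bin), is_integer(Byte) ->
    remove_byte(Bin, Byte, <<>>).

%% Walk the binary one byte at a time: skip bytes equal to Byte,
%% append every other byte to the accumulator binary.
remove_byte(<<B, Rest/binary>>, Byte, Acc) when B =:= Byte ->
    remove_byte(Rest, Byte, Acc);
remove_byte(<<B, Rest/binary>>, Byte, Acc) ->
    remove_byte(Rest, Byte, <<Acc/binary, B>>);
remove_byte(<<>>, _Byte, Acc) ->
    Acc.
```

Calling strip_bytes:remove_byte(<<"a\nb\nc\n">>, $\n) returns <<"abc">>, the same result as binary:replace(<<"a\nb\nc\n">>, <<"\n">>, <<>>, [global]).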