[erlang-questions] NIF vs Erlang Binary

Fri Jul 22 15:58:30 CEST 2011

On Jul 22, 2011, at 7:03 AM, Jesper Louis Andersen wrote:

> On Fri, Jul 22, 2011 at 11:45, Andy W. Song <wsongcn@REDACTED> wrote:
> 
>> I ran some unit tests on my code and felt that it's slow (it can process
>> about 24 MB/s) on a virtual machine. HiPE can double the performance but
>> that is still not quite enough. So I wrote a NIF to handle this. The speed is
>> about 10~15x faster. Not only that, I feel that the C code is easier to write.
> 
> Blindly unrolling the Key a bit gives a factor of 3 speedup:
> <snip>
> Now it is 5 times faster, same result. The NIF-advantage is now a
> factor of 2-3. That is in the ballpark I would expect it to be. You
> are doing many more reallocations with the above solution than with the C
> NIF version. What happens if we tune it some more? Let's do runs of
> 8192 bits at a time...
> 
> 9 times faster compared to the original here! I expect our speed will
> converge to that of C if we turn it up even more and get the amount of
> allocation/realloc/concatenation down.
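
The snipped code is not reproduced here, but the unrolling idea in the quote can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the module name mask_wide, the helper names, and the choice of 128 bits as the widened key size. It is not Jesper's code. It repeats the 32-bit key four times, XORs 128 bits per step, then finishes with 32-bit steps and the 0-3 byte tail.

```erlang
-module(mask_wide).
-export([mask/2]).

%% Hypothetical sketch: widen the 32-bit key to 128 bits so each
%% loop iteration masks 16 bytes instead of 4.
mask(Key, Data) ->
    <<Wide:128>> = <<Key:32, Key:32, Key:32, Key:32>>,
    wide(Wide, Key, Data, <<>>).

%% Mask 128 bits per step while at least 16 bytes remain.
wide(Wide, Key, <<X:128, Rest/binary>>, Acc) ->
    wide(Wide, Key, Rest, <<Acc/binary, (X bxor Wide):128>>);
wide(_Wide, Key, Rest, Acc) ->
    narrow(Key, Rest, Acc).

%% Mask the remaining whole 32-bit words.
narrow(Key, <<X:32, Rest/binary>>, Acc) ->
    narrow(Key, Rest, <<Acc/binary, (X bxor Key):32>>);
%% Mask the last 0-3 bytes with the leading bytes of the key.
narrow(Key, Tail, Acc) ->
    Bits = bit_size(Tail),
    <<T:Bits>> = Tail,
    <<K:Bits, _/bits>> = <<Key:32>>,
    <<Acc/binary, (T bxor K):Bits>>.
```

Fewer, larger steps mean fewer appends to the accumulator, which is the effect the quote attributes to unrolling. The widened key size is a tuning knob to measure, not a recommendation.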

Just for fun, I did a test with a version of mask based on binary comprehensions. Here's the code:

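%% XOR Data with the 32-bit integer Key, repeating the key as needed.
mask1(Key, Data) ->
    %% Length of the longest prefix that is a multiple of 4 bytes.
    S = byte_size(Data) div 4 * 4,
    <<D1:S/binary, T1/binary>> = Data,
    %% Mask the aligned prefix one 32-bit word at a time.
    D2 = << <<(X bxor Key):32>> || <<X:32>> <= D1 >>,
    %% Mask the remaining 0-3 bytes with the leading bytes of the key.
    T2 = handle_tail(T1, <<Key:32>>),
    <<D2/binary, T2/binary>>.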

handle_tail(<<A:24>>, <<K:24, _:8>>) -> <<(A bxor K):24>>;
handle_tail(<<A:16>>, <<K:16, _:16>>) -> <<(A bxor K):16>>;
handle_tail(<<A:8>>, <<K:8, _:24>>) -> <<(A bxor K):8>>;
handle_tail(<<>>, _) -> <<>>.

The speed gain is disappointingly small: it shaves roughly 40% off Andy's original times.
I suspect that's because the comprehension is just syntactic sugar on top of recursive loops.
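
To make that suspicion concrete, here is a minimal sketch of the kind of hand-written loop the comprehension is assumed to resemble. The module name mask_loop and its functions are hypothetical, and this is not the code the compiler actually generates. It only shows the word-at-a-time recursion with an appended accumulator. Like the comprehension in mask1, it expects a whole number of 32-bit words and leaves the tail to the caller.

```erlang
-module(mask_loop).
-export([mask/2]).

%% Hypothetical explicit-loop counterpart of
%%   << <<(X bxor Key):32>> || <<X:32>> <= Data >>
%% Data must be a multiple of 4 bytes long; anything else
%% fails with function_clause.
mask(Key, Data) ->
    mask(Key, Data, <<>>).

%% XOR one 32-bit word per step and append it to the accumulator.
mask(Key, <<X:32, Rest/binary>>, Acc) ->
    mask(Key, Rest, <<Acc/binary, (X bxor Key):32>>);
mask(_Key, <<>>, Acc) ->
    Acc.
```

If the two versions time about the same on the same input, that would support the syntactic-sugar explanation. If they differ noticeably, something else is going on.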

Anyway, just thought I'd share the results :)

Mihai