[erlang-questions] list vs binary performarnce, destructuring and consing

Tue Oct 23 14:00:34 CEST 2012

On Oct 23, 2012, at 11:16 AM, Motiejus Jakštys <desired.mta@REDACTED> wrote:
> 
> FYI, with "native" and some more tricks:
> BIN: Min:  124959.0, Max:  152380.0, Mean:   80449.2, stdev:  53616.5
> STR: Min:   40768.0, Max:  340067.0, Mean:  137522.3, stdev:   7299.9
> 

There is no reason to believe binaries to be faster than lists for this benchmark. The list construction can take its cells from the process heap and since it is garbage collected, doing so is an (amortized) O(1) pointer move. The reverse takes O(N). So we are looking at N*O(1) + O(N) = O(N). The binary construction has to reallocate the binary once in a while which imposes a copy on the binary to make it larger. In the worst case, and when it is badly implemented, this is an O(N^2) operation. If it is better implemented, we can shave it down to O(N lg N) or thereabout, depending on the slack amount we want to allocate. But it does not seem as cheap as the other solution, barring constants.

If you want to do better at the stats, report the median or better, the 25th, 50th, 75th, 90th, 95th and 99th percentile. I don't think the mean can be used for anything since we don't know if the thing we are looking at happens to be a normal distribution. Or you could plot the kernel density. Just taking

tiefling=; dc -e '340067 40768 + 2 / p'
190417

does suggest that something is fishy since this is way away from the mean.

Jesper Louis Andersen
  Erlang Solutions Ltd., Copenhagen