[erlang-questions] any way to speed up regex.split?

akonsu akonsu@REDACTED
Wed Dec 18 20:06:35 CET 2013


I have two benchmarks that perform a simple text split on a regular
expression, one in Ruby and the other in Erlang. The Erlang version is
about 6 times slower on my machine, and I do not know why. I have read all
the documentation I could find on using binaries in Erlang, but I cannot
make it any faster. Any help would be appreciated.



Here is the code:

Ruby:

require 'benchmark'

n = 50000
text = "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex
ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
est laborum."

puts Benchmark.measure {
  n.times { text.split(/\s+/) }
}


Erlang:

-module(test).
-export([measure/1, times/2]).

text() ->
    <<"Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex
ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
est laborum.">>.

times(0, _) ->
    ok;
times(N, F) ->
    F(),
    times(N - 1, F).

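%% Run the split N times and return the elapsed time in seconds.
measure(N) ->
    {ok, Pattern} = re:compile("\\s+"),
    B = text(),
    F = fun() -> re:split(B, Pattern) end,
    %% timer:tc/3 reports elapsed time in microseconds.
    {T, ok} = timer:tc(?MODULE, times, [N, F]),
    T / 1000000.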

Ruby outputs

  3.180000   0.000000   3.180000 (  3.182452)

Erlang outputs

Erlang R16B03 (erts-5.10.4) [source] [smp:2:2] [async-threads:10] [hipe]
[kernel-poll:false]

Eshell V5.10.4  (abort with ^G)
1> test:measure(50000).
18.261952
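
As a sanity check on the "about 6 times" figure, the two wall-clock
numbers above can be compared directly. This is only a small sketch in
Ruby; the variable names are made up for illustration, and the inputs are
the timings printed above.

```ruby
# Wall-clock timings copied from the two runs above (seconds).
ruby_seconds   = 3.182452
erlang_seconds = 18.261952
n = 50_000 # iterations used in both benchmarks

# How many times slower the Erlang run was.
ratio = erlang_seconds / ruby_seconds

# Average cost of a single split, in microseconds.
ruby_us_per_split   = ruby_seconds / n * 1_000_000
erlang_us_per_split = erlang_seconds / n * 1_000_000

puts format("ratio: %.2f", ratio)
puts format("Ruby:   %.1f us per split", ruby_us_per_split)
puts format("Erlang: %.1f us per split", erlang_us_per_split)
```

So the gap is a bit under 6x, which works out to a few hundred
microseconds per split on the Erlang side.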