[erlang-bugs] Freeze in re:run

Miroslav Urbanek mu@REDACTED
Fri May 22 00:42:03 CEST 2015


Hi,

I'm using Erlang 17.5 and I'm seeing freezes in re:run that last tens
of seconds. I've isolated the problematic part into the following
escript:

----
#!/usr/bin/env escript

log(Term) ->
    {_ , {Hour, Minute, Second}} = calendar:now_to_universal_time(os:timestamp()),
    io:format("~2..0w:~2..0w:~2..0w ~p~n", [Hour, Minute, Second, Term]).

test(Filename) ->
    {ok, IoDevice} = file:open(Filename, [binary, read]),
    {ok, Data} = file:read(IoDevice, 1048576),
    Regexp = "(?i)(?m)(?s)(?U)(?<=\n)--(?|.*^content-disposition:(?:(?!\n\\S).)*filename=\"[^\"]*\\.([^\".]*)\".*|.*^content-type:(?:(?!\n\\S).)*name=\"[^\"]*\\.([^\".]*)\").*\n\n",
    re:run(Data, Regexp, [report_errors]).

main([Filename]) ->
    spawn_link(fun Fun() ->
                       log(heartbeat),
                       timer:sleep(1000),
                       Fun()
               end),
    log(test(Filename)).
----

Processing the complex regexp above takes about 50 seconds for my test
data. Sometimes, the whole Erlang VM freezes for several seconds. The
output then looks like this:

----
$ ./test.erl test.eml
18:27:48 heartbeat
18:27:49 heartbeat
18:27:50 heartbeat
18:27:51 heartbeat
18:27:52 heartbeat
18:27:53 heartbeat
18:27:54 heartbeat
18:27:55 heartbeat
18:27:56 heartbeat
18:27:57 heartbeat
18:27:58 heartbeat
18:27:59 heartbeat
18:28:00 heartbeat
18:28:01 heartbeat
18:28:02 heartbeat
18:28:03 heartbeat
18:28:04 heartbeat
18:28:05 heartbeat
18:28:06 heartbeat
18:28:07 heartbeat
18:28:08 heartbeat
18:28:36 heartbeat
18:28:36 nomatch
----

The VM froze between 18:28:08 and 18:28:36 (28 seconds without a
heartbeat). The manual page re(3)
states that "re:run always give control back to the scheduler of
Erlang processes at intervals that ensures the real time properties of
the Erlang system". I believe this is a bug, because re:run clearly
didn't give control back to the scheduler during the interval above.

Why does this happen? Is there an option to limit the amount of time
the VM spends in re:run?
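
For reference, re(3) documents a match_limit option that caps the
number of internal match() calls PCRE may make for a single re:run.
Below is a minimal sketch of how it could be used. The helper name
bounded_run/3, the limit of 1000, and the deliberately
backtracking-heavy pattern "^(a+)+$" are illustrative assumptions, not
part of the original script. Whether a limit like this also avoids the
scheduler freeze described above is not established here.

```erlang
#!/usr/bin/env escript

%% Hypothetical helper (not in the original script): run a regexp with an
%% upper bound on the number of internal PCRE match() calls.
%% With report_errors set, hitting the bound is returned as
%% {error, match_limit} instead of being reported as nomatch.
bounded_run(Subject, Regexp, Limit) ->
    re:run(Subject, Regexp, [report_errors, {match_limit, Limit}]).

main(_) ->
    %% Illustrative subject: 30 "a" characters followed by "!", which makes
    %% the nested quantifier below backtrack heavily before failing.
    Subject = lists:duplicate(30, $a) ++ "!",
    io:format("~p~n", [bounded_run(Subject, "^(a+)+$", 1000)]).
```

The limit counts backtracking steps rather than wall-clock time, so a
suitable value for real data would have to be found by experiment.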

The program and the data file can be downloaded here:

http://miroslavurbanek.com/re.tgz

Thanks,
Miro


