[erlang-bugs] Freeze in re:run
Éric Pailleau
eric.pailleau@REDACTED
Fri May 22 17:43:18 CEST 2015
Hi,
Your regexp is compiled on every call.
You should compile it once, for example in your shell, and put the result in your code with a define.
That is probably not the cause of your problem, but it should improve performance.
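
A minimal sketch of the compile-once idea, using re:compile/2 at run time rather than pasting a compiled term into a define. The module name, the function names and the shortened pattern are hypothetical and only for illustration; the real pattern from the script would go in its place.

```erlang
-module(re_precompile).
-export([compile_once/0, match/2]).

%% Hypothetical helper: compile the pattern a single time.
%% The pattern here is a simplified stand-in, not the original regexp.
compile_once() ->
    {ok, MP} = re:compile("^content-type:", [caseless, multiline]),
    MP.

%% Reuse the compiled pattern for every subject, so re:run/3
%% does not have to recompile it on each call.
match(Data, MP) ->
    re:run(Data, MP, [report_errors]).
```

The compiled pattern can then be kept in a process state or passed around as an argument, e.g. MP = re_precompile:compile_once() followed by re_precompile:match(Data, MP) for each chunk of data.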
Regards
On 22 May 2015 00:42, Miroslav Urbanek <mu@REDACTED> wrote:
>
> Hi,
>
> I'm using Erlang 17.5 and I'm experiencing freezes in re:run that take
> tens of seconds. I've isolated the problematic part into the following
> code:
>
> ----
> #!/usr/bin/env escript
>
> log(Term) ->
>     {_ , {Hour, Minute, Second}} = calendar:now_to_universal_time(os:timestamp()),
>     io:format("~2..0w:~2..0w:~2..0w ~p~n", [Hour, Minute, Second, Term]).
>
> test(Filename) ->
>     {ok, IoDevice} = file:open(Filename, [binary, read]),
>     {ok, Data} = file:read(IoDevice, 1048576),
>     Regexp = "(?i)(?m)(?s)(?U)(?<=\n)--(?|.*^content-disposition:(?:(?!\n\\S).)*filename=\"[^\"]*\\.([^\".]*)\".*|.*^content-type:(?:(?!\n\\S).)*name=\"[^\"]*\\.([^\".]*)\").*\n\n",
>     re:run(Data, Regexp, [report_errors]).
>
> main([Filename]) ->
>     spawn_link(fun Fun() ->
>                        log(heartbeat),
>                        timer:sleep(1000),
>                        Fun()
>                end),
>     log(test(Filename)).
> ----
>
> Processing the complex regexp above takes about 50 seconds for my test
> data. Sometimes, the whole Erlang VM freezes for several seconds. The
> output then looks like this:
>
> ----
> $ ./test.erl test.eml
> 18:27:48 heartbeat
> 18:27:49 heartbeat
> 18:27:50 heartbeat
> 18:27:51 heartbeat
> 18:27:52 heartbeat
> 18:27:53 heartbeat
> 18:27:54 heartbeat
> 18:27:55 heartbeat
> 18:27:56 heartbeat
> 18:27:57 heartbeat
> 18:27:58 heartbeat
> 18:27:59 heartbeat
> 18:28:00 heartbeat
> 18:28:01 heartbeat
> 18:28:02 heartbeat
> 18:28:03 heartbeat
> 18:28:04 heartbeat
> 18:28:05 heartbeat
> 18:28:06 heartbeat
> 18:28:07 heartbeat
> 18:28:08 heartbeat
> 18:28:36 heartbeat
> 18:28:36 nomatch
> ----
>
> The VM froze between 18:28:08 and 18:28:36. The manual page re(3)
> states that "re:run always give control back to the scheduler of
> Erlang processes at intervals that ensures the real time properties of
> the Erlang system". I believe this is a bug, because re:run clearly
> didn't give control back to the scheduler during the interval above.
>
> Why does it happen? Is there any option to limit the amount of time
> the VM spends in re:run?
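
One possible way to bound the work done by re:run is the match_limit option, which caps the number of internal match calls PCRE makes before giving up. It limits backtracking rather than wall-clock time. The sketch below assumes that behaviour; the module name, the wrapper function and the too_expensive atom are hypothetical.

```erlang
-module(re_limit).
-export([bounded_run/3]).

%% Hypothetical wrapper: Limit caps PCRE's internal match calls.
%% With report_errors set, hitting the cap is reported as
%% {error, match_limit} instead of a plain nomatch.
bounded_run(Data, Regexp, Limit) ->
    case re:run(Data, Regexp, [report_errors, {match_limit, Limit}]) of
        {error, match_limit} -> {error, too_expensive};
        Other -> Other
    end.
```

A suitable value for Limit depends on the pattern and the data and would have to be found by experiment. The related match_limit_recursion option bounds recursion depth in the same way.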
>
> The program and the data file can be downloaded here:
>
> http://miroslavurbanek.com/re.tgz
>
> Thanks,
> Miro
> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://erlang.org/mailman/listinfo/erlang-bugs