[erlang-bugs] Freeze in re:run

PAILLEAU Eric eric.pailleau@REDACTED
Sat May 23 00:03:06 CEST 2015


Hello,
Using your code, I got same behavior, ie. 22 secondes then freeze.

Using a pre-compiled regexp , I got freeze only 2/3 secondes after start .

-define(MP,{re_pattern,1,0,0,<<69,82,67,80,73,1,0,0,7,2,0,0,81,8,0,0,255,255,255,255,255,255,255,255,45,0,10,0,1,0,1,0,0,0,56,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,125,1,13,121,0,8,118,0,1,30,10,114,0,8,30,45,30,45,125,0,133,86,13,26,30,99,30,111,30,110,30,116,30,101,30,110,30,116,30,45,30,100,30,105,30,115,30,112,30,111,30,115,30,105,30,116,30,105,30,111,30,110,30,58,141,125,0,13,120,0,6,30,10,8,114,0,6,13,116,0,13,30,102,30,105,30,108,30,101,30,110,30,97,30,109,30,101,30,61,30,34,73,34,30,46,127,0,39,0,1,107,255,255,255,255,251,191,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,99,114,0,39,30,34,86,13,113,0,109,86,13,26,30,99,30,111,30,110,30,116,30,101,30,110,30,116,30,45,30,116,30,121,30,112,30,101,30,58,141,125,0,13,120,0,6,30,10,8,114,0,6,13,116,0,13,30,110,30,97,30,109,30,101,30,61,30,34,73,34,30,46,127,0,39,0,1,107,255,255,255,255,251,191,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255
,255,255,255,255,255,255,99,114,0,39,30,34,114,0,242,86,13,30,10,30,10,114,1,13,0>>}).

and
re:run(Data, ?MP, [report_errors]).

$ ./test2.erl test.eml
21:53:26 heartbeat
21:53:27 heartbeat
21:53:28 heartbeat
21:54:17 heartbeat
21:54:17 nomatch

Using strace :

_____________________________________________________
[...]
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
futex(0xb6840088, FUTEX_WAIT_PRIVATE, -1, NULL) = 0
rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
write(7, "\0", 1)                       = 1
select(0, NULL, NULL, NULL, NULL21:59:14 heartbeat
21:59:15 heartbeat
21:59:16 heartbeat
22:00:05 nomatch
22:00:05 heartbeat
  <unfinished ...>
+++ exited with 0 +++
_____________________________________________________

sched_yield() is not called during the 50 secondes processing.

Indeed strange.
Regards



Le 22/05/2015 17:43, Éric Pailleau a écrit :
> Hi,
>
> Your regexp is compiled at each call.
> Your should compile it in your shell and set it in your code by a define.
> It is probably not your problem, but it should increase perfs...
> Regards
>
> Le 22 mai 2015 00:42, Miroslav Urbanek <mu@REDACTED> a écrit :
>>
>> Hi,
>>
>> I'm using Erlang 17.5 and I'm experiencing freezes in re:run that take
>> tens of seconds. I've isolated the problematic part into the following
>> code:
>>
>> ----
>> #!/usr/bin/env escript
>>
>> log(Term) ->
>>      {_ , {Hour, Minute, Second}} = calendar:now_to_universal_time(os:timestamp()),
>>      io:format("~2..0w:~2..0w:~2..0w ~p~n", [Hour, Minute, Second, Term]).
>>
>> test(Filename) ->
>>      {ok, IoDevice} = file:open(Filename, [binary, read]),
>>      {ok, Data} = file:read(IoDevice, 1048576),
>>      Regexp = "(?i)(?m)(?s)(?U)(?<=\n)--(?|.*^content-disposition:(?:(?!\n\\S).)*filename=\"[^\"]*\\.([^\".]*)\".*|.*^content-type:(?:(?!\n\\S).)*name=\"[^\"]*\\.([^\".]*)\").*\n\n",
>>      re:run(Data, Regexp, [report_errors]).
>>
>> main([Filename]) ->
>>      spawn_link(fun Fun() ->
>>                         log(heartbeat),
>>                         timer:sleep(1000),
>>                         Fun()
>>                 end),
>>      log(test(Filename)).
>> ----
>>
>> Processing the complex regexp above takes about 50 seconds for my test
>> data. Sometimes, the whole Erlang VM freezes for several seconds. The
>> output then looks like this:
>>
>> ----
>> $ ./test.erl test.eml
>> 18:27:48 heartbeat
>> 18:27:49 heartbeat
>> 18:27:50 heartbeat
>> 18:27:51 heartbeat
>> 18:27:52 heartbeat
>> 18:27:53 heartbeat
>> 18:27:54 heartbeat
>> 18:27:55 heartbeat
>> 18:27:56 heartbeat
>> 18:27:57 heartbeat
>> 18:27:58 heartbeat
>> 18:27:59 heartbeat
>> 18:28:00 heartbeat
>> 18:28:01 heartbeat
>> 18:28:02 heartbeat
>> 18:28:03 heartbeat
>> 18:28:04 heartbeat
>> 18:28:05 heartbeat
>> 18:28:06 heartbeat
>> 18:28:07 heartbeat
>> 18:28:08 heartbeat
>> 18:28:36 heartbeat
>> 18:28:36 nomatch
>> ----
>>
>> The VM froze between 18:28:08 and 18:28:36. The manual page re(3)
>> states that "re:run always give control back to the scheduler of
>> Erlang processes at intervals that ensures the real time properties of
>> the Erlang system". I believe this is a bug, because re:run clearly
>> didn't give control back to the scheduler during the interval above.
>>
>> Why does it happen? Is there any option to limit the number of time VM
>> spends in re:run?
>>
>> The program and the data file can be downloaded here:
>>
>> http://miroslavurbanek.com/re.tgz
>>
>> Thanks,
>> Miro
>> _______________________________________________
>> erlang-bugs mailing list
>> erlang-bugs@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-bugs
> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://erlang.org/mailman/listinfo/erlang-bugs
>




More information about the erlang-bugs mailing list