[erlang-questions] clarify: why bit syntax so slow + benchmark code

Anders Nygren anders.nygren@REDACTED
Sat Nov 17 15:45:57 CET 2007


On Nov 17, 2007 5:08 AM, Mateusz Berezecki <mateuszb@REDACTED> wrote:
> On Nov 17, 2007, at 11:22 AM, Per Gustafsson wrote:
>
> > The way to write that would be:
> >
> >
> > parse_stream(<<1,some parsing pattern,Rest/binary>>) ->
> > ...
> > parse_stream(Rest);
> > parse_stream(<<2,some other parsing pattern,Rest/binary>>) ->
> > ...
> > parse_stream(Rest);
> > ...
> > parse_stream(<<>>) -> ok.
> >
> > This will minimize the construction of unnecessary sub-binaries and
> > in R12B you should be able to get good performance for this kind of
> > approach (It still wouldn't be as fast as C, but a lot closer).
> >
> > If you want some more information about the implementation of the
> > bit syntax you should read the paper I presented at this year's EUC:
> >
> > http://www.erlang.se/euc/07/papers/1700Gustafsson.pdf
>
> Per, thanks for the URL. I will read it this weekend.
>
> Thomas, I've read the widefinder discussion, and it is not
> applicable to the kind of problem I am having.
>
> widefinder is about disk I/O on log files. I'm talking about
> variable-length control structures extracted on the fly
> from a high-volume network stream. I can't "parallelize"
> a stream about which I know nothing except that
> the first byte describes some small excerpt of it.
>

True, but there were also a lot of discussions about how to handle
binaries efficiently.

For a good summary, see Caoyuan's blog:

http://blogtrader.net/page/dcaoyuan/entry/learning_coding_binary_was_tim

That applies to R11.
Things will change in R12, as described in Per Gustafsson's paper.
For more hints on efficient use of binaries in R12, see the efficiency
guide in the R12 documentation; preliminary versions can be found at
http://erlang.org/download/snapshots/
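
As a rough, unbenchmarked sketch of the function-head style Per describes,
applied to the header layout in the snippet quoted below: the fixed header
byte is matched in the function head, and the computed sizes are bound in
the body before the second match, since a size in a binary pattern has to
be a literal or an already-bound variable. The module name, the function
names, and the `more` / `{error, bad_header}` return values are my own
assumptions, not something from the original code.

```erlang
-module(extract_sketch).
-export([extract/1, extract_all/1]).

%% Hypothetical names throughout; only the field layout comes from the
%% quoted snippet. The header byte is matched directly in the function head.
extract(<<LenLen:2, NameLen:3, _CB:1, 0:1, _:1, Rest/binary>>) ->
    LenBits = LenLen * 8,      % width of the payload-length field, in bits
    NameBytes = NameLen + 1,   % length of the name, in bytes
    case Rest of
        <<PayLen:LenBits, Name:NameBytes/binary,
          Payload:PayLen/binary, Rest2/binary>> ->
            {ok, Name, Payload, Rest2};
        _ ->
            more               % not enough bytes yet for a whole record
    end;
extract(<<>>) ->
    more;
extract(<<_, _/binary>>) ->
    {error, bad_header}.       % the fixed 0 bit in the header was set

%% Pull out every complete record; return them with the unparsed tail.
extract_all(Bin) ->
    extract_all(Bin, []).

extract_all(Bin, Acc) ->
    case extract(Bin) of
        {ok, Name, Payload, Rest} ->
            extract_all(Rest, [{Name, Payload} | Acc]);
        more ->
            {lists:reverse(Acc), Bin};
        {error, _} = Error ->
            Error
    end.
```

For example, extract_sketch:extract(<<16#50, 3, "abc", "xyz", 9>>) should
give {ok, <<"abc">>, <<"xyz">>, <<9>>}. Whether this is fast enough is
something you would have to measure on R12 yourself.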

/Anders

> To be explicit, I'm talking about this kind of code:
>
> <<LenLen:2/unsigned-integer, NameLen:3/unsigned-integer, _CB:1, 0:1,_:1,
> Rest/binary>> = Bin,
>
> LenLen1 = LenLen * 8,
> NameLen1 = NameLen + 1,
>
> << LengthPay:LenLen1/unsigned-integer,
> NameBin:NameLen1/binary,
> Payload:LengthPay/binary,
> Rest2/binary >> = Rest,
>
>
> Is there any way to put this in one line, preferably
> in the function header, so as to avoid allocating stuff?
> Why are arithmetic expressions not allowed in bit syntax?
>
> The function returns
>
> {NameBin, Payload, Rest}
>
> but it is recursive, doing extract(Stream) -> extract(Rest) ->
> extract(RestOfRest)
> until it parses out a complete excerpt which is usually less than 300
> bytes.
>
> After parsing it, it then proceeds to extract another fragment
> of data from the stream.
>
> Is this kind of problem suitable for Erlang, or should I go
> with a linked-in C driver?
>
>
> regards,
> Mateusz Berezecki
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-questions
>


