[erlang-questions] Erlang basic doubts about String, message passing and context switching overhead

Jesper Louis Andersen jesper.louis.andersen@REDACTED
Thu Jan 12 20:56:38 CET 2017


Strings are really just streams of bytes.

Streams of bytes are the ubiquitous data representation format. They can
represent anything you like. The problem is that with the vast generality
comes a lack of precision. In order to handle the string, you need to
interpret its contents. You do that with a spectrum of string manipulation
techniques: splitting, converting to other types, regular expression
matching, LALR(1) parsing and so on. The goal is to turn the stream of
bytes into some deeper structure which you can manipulate in the program.
Squint your eyes enough, and all programs are string processing.
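
As a minimal sketch of turning bytes into a deeper structure, assume a
made-up "key=value;key=value" format. The module name kv_parse and the
format itself are hypothetical, chosen only for illustration. The flat
binary is split into fields, and each value is converted to an integer
when it looks like one:

```erlang
-module(kv_parse).
-export([parse/1]).

%% Turn a flat binary such as <<"host=example.org;port=8080">> into a map.
%% The "key=value;key=value" format is an assumption made for this sketch.
parse(Bin) when is_binary(Bin) ->
    %% trim_all drops empty fields, e.g. from a trailing ";".
    Fields = binary:split(Bin, <<";">>, [global, trim_all]),
    maps:from_list([field(F) || F <- Fields]).

%% Split one field at the first "=" and interpret the value.
%% A field without "=" fails the match and crashes, which is deliberate.
field(Field) ->
    [Key, Value] = binary:split(Field, <<"=">>),
    {Key, value(Value)}.

%% Values that parse as integers become integers; the rest stay binaries.
value(V) ->
    try binary_to_integer(V)
    catch error:badarg -> V
    end.
```

After kv_parse:parse/1 the rest of the program works on a map with typed
values and never has to look at the raw bytes again.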

Once you start realizing everything is basically a variant of string
processing, you see why it must be hard to build one good general solution.
Any given solution sits at some point on the spectrum, which makes it
excellent for handling some problems, but makes it hard to handle others.
Very general solutions to the processing problem (LALR(1) parsers, etc) are
not the easiest to get going on smaller problems.

Hence most processing is complicated because it requires you to pick the
right kind of data processing abstraction in addition to solving the
problem itself.

Natural text is hard because there is very little formal structure in it.
So you have to use approximate methods, machine learning, etc. Context-free
text is easier to manage, but few people handle it correctly by writing a
parser for it. Rather, they tend to work directly on the string
(jokingly called "stringly typed programming").
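
To illustrate what writing a parser can look like for a small context-free
language, here is a hedged sketch of a hand-written recursive descent
parser over a binary. The grammar, the module name expr_parse and the tree
shape are assumptions made for this example. It handles only digits, "+",
"*" and parentheses, accepts no whitespace, and simply crashes on malformed
input:

```erlang
-module(expr_parse).
-export([parse/1, eval/1]).

%% Hypothetical grammar, for illustration only:
%%   Expr   ::= Term ('+' Term)*
%%   Term   ::= Factor ('*' Factor)*
%%   Factor ::= Digit+ | '(' Expr ')'

%% Parse a whole binary into a tree; crash unless all input is consumed.
parse(Bin) when is_binary(Bin) ->
    {Tree, <<>>} = p_expr(Bin),
    Tree.

p_expr(Bin) ->
    {Left, Rest} = p_term(Bin),
    p_expr_rest(Left, Rest).

%% Left-associative chain of additions.
p_expr_rest(Left, <<$+, Rest0/binary>>) ->
    {Right, Rest} = p_term(Rest0),
    p_expr_rest({add, Left, Right}, Rest);
p_expr_rest(Left, Rest) ->
    {Left, Rest}.

p_term(Bin) ->
    {Left, Rest} = p_factor(Bin),
    p_term_rest(Left, Rest).

%% Left-associative chain of multiplications, binding tighter than "+".
p_term_rest(Left, <<$*, Rest0/binary>>) ->
    {Right, Rest} = p_factor(Rest0),
    p_term_rest({mul, Left, Right}, Rest);
p_term_rest(Left, Rest) ->
    {Left, Rest}.

%% A factor is a parenthesized expression or a run of decimal digits.
p_factor(<<$(, Rest0/binary>>) ->
    {Tree, <<$), Rest/binary>>} = p_expr(Rest0),
    {Tree, Rest};
p_factor(<<D, _/binary>> = Bin) when D >= $0, D =< $9 ->
    p_digits(Bin, 0).

p_digits(<<D, Rest/binary>>, Acc) when D >= $0, D =< $9 ->
    p_digits(Rest, Acc * 10 + (D - $0));
p_digits(Rest, Acc) ->
    {{num, Acc}, Rest}.

%% Once the input is a tree, evaluation never touches the string again.
eval({num, N}) -> N;
eval({add, L, R}) -> eval(L) + eval(R);
eval({mul, L, R}) -> eval(L) * eval(R).
```

The point of the sketch is the separation: parse/1 is the only place that
knows about bytes, while eval/1 and any other consumer work on tagged
tuples.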

Is Erlang bad at string processing? Yes and no. OCaml processes strings
much faster than Erlang in the common case because it compiles to native
machine code and runs directly on the CPU. OTOH, if you have to look at
every byte in a data stream, chances are you are bound not by the CPU time,
but the time it takes to get data close to the CPU from DRAM. As for the
API availability, it all depends on how good you are at using functional
programming I guess. I rarely feel I lack certain API functions in Erlang
when handling strings. I'd be keen to look into woes you've had in the past
with things that were unwieldy to handle.

On Thu, Jan 12, 2017 at 2:50 AM Steve Davis <steven.charles.davis@REDACTED>
wrote:

>
> You would think that, by now, computer science would have a decent answer
> to how to deal with text.
>
> All answers (from anyplace anyblog anylist) I have heard so far basically…
> suck.
>
> Why is that? What are the missing ingredients to a real solution?
>
> The word “string” for me has long been a dirty word.
>
> /s
>
> (Political aside: *spits on all that “emoji” crap* :-) )