bug or feature?
Richard A. O'Keefe
Mon Apr 24 00:21:37 CEST 2006
> Of course, if "extensive prefix analysis" means analysing many strings
> using a small number of prefixes, it's easy enough to generate an
> Erlang function (checking for those particular prefixes) *as a data structure*
> and cause it to be compiled, so that even in that case the main result from
> the generalised "feature" would be to prevent you thinking of a better
Serge Aleynikov <serge@REDACTED> replied:
Actually this is exactly the problem and the approach we took that
triggered this email thread. Given a relatively small number of
prefixes (unknown at compile time) and analysing many strings, we tried
to generate a function doing a pattern match of a string against each
prefix. It looked like having a fun(Prefix ++ Suffix, Prefix) notation
would be a convenient representation for this task, when we realised it
was an illegal expression.
No, from the sound of it, the approach you tried is NOT "exactly the
approach" that I'm talking about but very different indeed. The approach
I am talking about would NOT use fun(Prefix++Suffix, Prefix) -- in which
I for one would expect the first occurrence of Prefix to count as NOT BOUND.
Let's take an example. Suppose the prefixes are
Here's the function:
[C1|S1] = S0,
of $h ->
[C2|S2] = S1,
of $t ->
[C3|S3] = S2,
of $t ->
[C4|S4] = S3,
of $p ->
[C5|S5] = S4,
of $: -> /* Suffix is */ S5
; $s ->
[C6|S6] = S5,
of $: -> /* Suffix is */ S6
; _ -> S0
; _ -> S0
oh the heck with it, you get the idea. It's a trie. It's not the kind of
code you want to write by hand (which is why I didn't want to finish it),
but it's the kind of code that is very easy to generate by computer.
It's fast: each character of the input is examined at most once.
It generates no intermediate data structures.
And it has absolutely NO use for Prefix++Suffix in any form anywhere.
Ideally, if you wrote
f("http:" ++ S) -> S;
f("https:" ++ S) -> S;
f("ftp:" ++ S) -> S;
f("tftp:" ++ S) -> S;
f( S) -> S.
the compiler would generate essentially the same trie. Certainly that's
what I would expect a Haskell or SML compiler to do. I don't know what
the current Erlang compiler will do with it, which is why I mention the
possibility of generating your own trie code. It's really easy to do,
once you have got your head around how to represent Erlang code as data
More information about the erlang-questions