bug or feature?
Richard A. O'Keefe
ok@REDACTED
Thu Apr 20 05:29:34 CEST 2006
I wrote a short clear solution to the "remove a prefix if present otherwise
do nothing" problem that uses only existing features and functions.
Serge Aleynikov <serge@REDACTED> wrote:
In a general case you are absolutely right as the complexity of the
proposed solution is still linear. Would anyone care that the other
approach would work 2 to 3 times faster?
Which other approach? The message I was replying to had two.
When it comes to soft-real-time applications with extensive
prefix analysis, I would.
If you have EXTENSIVE prefix analysis, ALL the solutions on the table
must count as inefficient. This is actually a wonderful example of what
I have come to call "the PERL effect". The language offers you a feature
that makes it simple, straightforward, incredibly easy ... to do the wrong
thing.
What do I mean in this case? All three of the solutions (the two that work
and the one that isn't (yet) legal syntax) require O(|Prefix|) time to check
for a prefix. If we have N prefixes, an average of P characters long, the
time to check them all is O(P.N). Using a binary search tree takes this
down to O(P.lg N). Using a trie instead takes the time down to O(P). And
the key trick is to see the problem as doing MANY prefix checks, not as
doing one. The difference between a built-in reversible ++ and one you
program hardly matters compared with an O(N) speedup.
Of course, if "extensive prefix analysis" means analysing many strings
using a small number of prefixes, it's easy enough to generate an
Erlang function (checking for those particular prefixes) *as a data structure*
and cause it to be compiled, so that even in that case the main result from
the generalised "feature" would be to prevent you thinking of a better
solution.
A more rhetorical question would be that if '++' is a "syntactic
sugar" why wouldn't that syntactic sugar apply for bound Prefix
variables as well to make it more generic?
The term "syntactic sugar" refers to a construct which can be simply and
mechanically replaced by some other construct(s) in the language and
exists only to make the language more convenient.
The pattern
"c1...cn" ++ Pat
is justly called "syntactic sugar" because it is syntactic sugar *FOR*
something, namely
[$c1|[$c2|...[$cn|Pat]...]]
with a KNOWN number of pairs. The pattern
Var ++ Pat
should it ever exist, would NOT be syntactic sugar at all; there is nothing
that could replace it, no combination of existing pattern notation could be
made to do the same thing.
To say that "Var ++ Pat" would not be syntactic sugar is not to say that it
would not be a useful extension but only to say that it WOULD be a non-trivial
extension of the language. You couldn't hack it just be a small change to
the front end of the compiler, as literal++Pat was. Personally, I believe
that it would NOT be a useful extension to the language; it is just too
specialised, and the problems it is good for have better solutions.
More information about the erlang-questions
mailing list