[erlang-questions] Which is best? string:concat or ++?

Richard O'Keefe ok@REDACTED
Tue May 8 00:25:50 CEST 2012


On 8/05/2012, at 1:31 AM, Paul Barry wrote:

> Hi folks.
> 
> This might be a case of a dumb question, but here goes anyway.  :-)
> 
> I have two strings, A and B.  Is it better to use:
> 
>   string:concat(A, B)
> 
> or
> 
>    A ++ B
> 
> when combining them to create a new string?

Erlang is open source.  You can read the source code of library modules,
and it's often instructive to do so.  In string.erl you find

concat(S1, S2) -> S1 ++ S2.

There is probably no measurable difference.

>  I'm writing some code
> that generates a chunk of HTML.  I know that using ++ on big lists is
> regarded as a "no-no", but is it acceptable here?

I'm not aware of any guideline that says using ++ on big lists is
a no-no.  If for some reason you *need* the concatenation of two
big lists as a list, ++ is the very best way to do it.

The thing that people are warned against is that
   (((A ++ B) ++ C) ++ D) ++ E
is more expensive than
   A ++ (B ++ (C ++ (D ++ E)))
because the former copies A 4 times, B 3 times, C 2 times, and D
once, while the latter copies each of A, B, C, and D only once;
E is never copied either way.  (This is true in
Lisp, Prolog, Erlang, Haskell, Clean, SML, F#, ... or even in
C if you do your own one-way linked lists there.)
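
To make the difference concrete, here is a minimal sketch of the two
nesting orders as folds over a list of lists.  The module name
concat_demo and both function names are made up for illustration.
The standard function lists:append/1 gives the same result as
right_nested/1 here, so in real code you would probably just call that.

```erlang
-module(concat_demo).
-export([left_nested/1, right_nested/1]).

%% Builds ((([] ++ A) ++ B) ++ C) ++ ...
%% Every step copies the whole accumulator built so far, so the
%% total work grows quadratically with the number of pieces.
left_nested(Lists) ->
    lists:foldl(fun(L, Acc) -> Acc ++ L end, [], Lists).

%% Builds A ++ (B ++ (C ++ ... ++ []))
%% ++ copies only its left operand, so each input list is copied
%% exactly once.
right_nested(Lists) ->
    lists:foldr(fun(L, Acc) -> L ++ Acc end, [], Lists).
```

Both functions return the same list; only the amount of copying differs.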

You may be asking the wrong question.
The right question is probably "Is it a good idea to generate
HTML as a string in any programming language?" to which the
answer is "only in a language that does not let you build
trees."

One way to represent SGML data, including HTML, in Erlang
looks like
	{Element_Name, [{Attribute_Name, Value}, ...], [Child, ...]}
or	<<text as a binary>>
or	"text as a list"

in which while generating the tree you never ever have to worry
about escaping any data.  You write a function that walks over
a tree like this, perhaps sending it to a port, or perhaps
creating an IO list, and *that* function does whatever escaping
is necessary.
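
As a rough sketch of such a function, the module below walks a tree of
this shape and builds an IO list, escaping text and attribute values on
the way out.  The module name sgml_out and the function names are
hypothetical, and it assumes that element and attribute names are
atoms, that attribute values are strings, and that all characters are
byte-valued (Latin-1 or already-encoded UTF-8 bytes).

```erlang
-module(sgml_out).
-export([to_iolist/1]).

%% An element: {Name, Attributes, Children}.
to_iolist({Name, Attrs, Children})
  when is_atom(Name), is_list(Attrs), is_list(Children) ->
    Tag = atom_to_list(Name),
    [$<, Tag, [attr(A) || A <- Attrs], $>,
     [to_iolist(C) || C <- Children],
     "</", Tag, $>];
%% Text as a binary.
to_iolist(Text) when is_binary(Text) ->
    escape(binary_to_list(Text));
%% Text as a list of characters.
to_iolist(Text) when is_list(Text) ->
    escape(Text).

%% One attribute, rendered as: space, name, ="escaped value".
attr({Name, Value}) when is_atom(Name), is_list(Value) ->
    [$\s, atom_to_list(Name), "=\"", escape(Value), $"].

%% All escaping happens here, at output time, and nowhere else.
escape(Chars) ->
    [escape_char(C) || C <- Chars].

escape_char($<) -> "&lt;";
escape_char($>) -> "&gt;";
escape_char($&) -> "&amp;";
escape_char($") -> "&quot;";
escape_char(C)  -> C.
```

The result can be handed to a port or to io:put_chars/1 as it is, or
flattened with iolist_to_binary/1 if you really need one binary.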

	{p,[],["This <is> safe &so; is <this>!"]}

is perfectly safe, as long as your output function knows what it
is doing.  Also, if you know you are generating HTML, you could
have your output function take care of omitting end tags for
empty-by-definition elements.
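
A small sketch of that last point: an output function can ask a helper
like the one below for the end tag and get nothing back for elements
that are empty by definition.  The module and function names are
hypothetical, element names are assumed to be atoms, and the list is
the set of elements declared EMPTY in HTML 4.01, so adjust it for the
HTML version you are targeting.

```erlang
-module(html_tags).
-export([end_tag/1]).

%% Elements declared EMPTY in HTML 4.01 (assumed target version).
void_elements() ->
    [area, base, basefont, br, col, frame, hr, img,
     input, isindex, link, meta, param].

%% Returns the end tag as an IO list, or [] when the element
%% must not have one.
end_tag(Name) when is_atom(Name) ->
    case lists:member(Name, void_elements()) of
        true  -> [];
        false -> ["</", atom_to_list(Name), $>]
    end.
```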

You wouldn't believe how easy it is to manipulate SGML data this
way until you've tried it.

(I _could_ have pointed to xmerl, but that's rather more complicated.)


