[erlang-questions] Erlang and bbcode
Sat Jul 14 15:41:13 CEST 2012
It is far from perfect, but Jerome's bbcode scanner is 92 lines and the grammar is 74 lines. The intermediate format can then be used to generate HTML, RTF, or Textile.
On Jul 12, 2012, at 1:37, "Richard O'Keefe" wrote:
> On reading the slides about "Erlang sucks" I thought,
> "what is bbcode and how hard can it be to write an
> Erlang parser for it?"
> Since having that thought, I've checked half a dozen
> definitions of bbcode and looked at a parser or two
> and am little the wiser.
> BBcode strikes me as truly bizarre. What is the point
> of entering something that looks pretty much like HTML
> except for using square brackets instead of angle brackets?
> But there is worse.
> - I cannot discover whether any particular character set
> or encoding is presumed and if so which one. (I'd
> *guess* Unicode/UTF-8, but a guess is all it would be.)
> - I cannot discover how you get a plain [ into text.
> Could it be [[? Could it be [?
> - I cannot discover exactly what is a well-formed tag and
> what is not.
> - I cannot discover whether [/*] is legal or not.
> - I cannot discover whether markup is legal inside
> a [url]...[/url] or not (it could be stripped out)
> - Same for [email] and [img] and [youtube] and [gvideo]
> - I cannot discover whether [size=n] takes n in points,
> pixels, percentage of default, or anything else (it
> seems that different systems do different things)
> - I cannot discover whether [youtube] and [gvideo]
> allow width/height like [img] or not.
> - Some descriptions say that :-) is processed as a
> smiley, and that other emoticons may be processed
> too, but I cannot find a list; others say [:-)] is
> a smiley; others say nothing about this.
> - It is not clear how the author of [quote-author]...
> should be rendered; I have a strong suspicion it
> should be locale-dependent.
> - It appears that different instances of bbcode support
> different tag sets out of the box and most of them
> allow some sort of customisation.
> - It appears to be _expected_ that different bbcode
> implementations will translate things differently
> (so [b]xxx[/b] might yield <b> or <strong> or
> <span style="font-weight: bolder;"> or something else),
> which means that it would be hard to make a test suite.
> Indeed, I can find no guarantee that [b] [i] and so on
> won't just be stripped out.
> If the lexical issues could be sorted out, one could easily
> enough write a BBcode -> XML value tree parser, and an
> XML -> XML translator to do things like
> <url default=X>Y</url> -> <a href=X>Y</a>
> <url>Y</url> -> <a href=string_value(Y)>Y</a>
> and then use an existing XML -> text unparser.
> A non-validating XML parser in Erlang took me 275 lines,
> so I doubt bbcode would be much harder.
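
The pipeline described in the quoted message (bbcode -> value tree -> translated markup -> text) can be sketched roughly as follows. This is only an illustration in Python, not Jerome's scanner and not an Erlang implementation. It assumes answers to several of the open questions above: tags look like [name], [name=value] and [/name]; anything else in square brackets (including [*] and [/*]) is literal text; unmatched closing tags are kept as text; unclosed tags run to the end of input; markup inside [url] is allowed; tags other than [b] and [url] have their markup dropped and their content kept. The names parse, string_value and to_html are made up for this sketch.

```python
import re
from html import escape

# Assumed tag syntax: [name], [name=value], [/name]. Real bbcode dialects differ.
TAG = re.compile(r"\[(/?)([a-z]+)(?:=([^\]]*))?\]", re.IGNORECASE)


def parse(text):
    """Parse bbcode into a list of strings and (name, default, children) nodes."""
    root = ("root", None, [])
    stack = [root]
    pos = 0
    for m in TAG.finditer(text):
        if m.start() > pos:
            stack[-1][2].append(text[pos:m.start()])
        closing, name, default = m.group(1), m.group(2).lower(), m.group(3)
        if not closing:
            node = (name, default, [])
            stack[-1][2].append(node)
            stack.append(node)
        elif len(stack) > 1 and stack[-1][0] == name:
            stack.pop()
        else:
            # Assumption: an unmatched closing tag is kept as literal text.
            stack[-1][2].append(m.group(0))
        pos = m.end()
    if pos < len(text):
        stack[-1][2].append(text[pos:])
    return root[2]


def string_value(children):
    """Concatenate the text content of a subtree, ignoring markup."""
    return "".join(
        c if isinstance(c, str) else string_value(c[2]) for c in children
    )


def to_html(children):
    """Translate the tree to HTML, following the two url rules quoted above."""
    out = []
    for c in children:
        if isinstance(c, str):
            out.append(escape(c))
            continue
        name, default, kids = c
        inner = to_html(kids)
        if name == "url":
            # <url default=X>Y</url> -> <a href=X>Y</a>
            # <url>Y</url>           -> <a href=string_value(Y)>Y</a>
            href = default if default is not None else string_value(kids)
            out.append('<a href="%s">%s</a>' % (escape(href), inner))
        elif name == "b":
            out.append("<b>%s</b>" % inner)
        else:
            # Assumption: unknown tags are dropped, their content is kept.
            out.append(inner)
    return "".join(out)
```

For example, to_html(parse("[url=http://x.org]X[/url]")) gives an anchor whose href is the tag's default value, while to_html(parse("[url]http://x.org/[/url]")) uses the string value of the content as the href. Each of the assumptions listed above is a place where a real bbcode dialect might behave differently.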