Build an erlang computer (was:Computers are fast)

Joe Armstrong (AL/EAB) joe.armstrong@REDACTED
Thu Jan 26 09:34:25 CET 2006


This is pure lunacy - design goal 10 in
http://www.w3.org/TR/2003/PER-xml-20031030/
says:

     " 10. Terseness in XML markup is of minimal importance. "
 
  But terseness of expression *is* important if you have lots of data.
This implies that you should not use XML when there is lots of data.

  Using XML for voluminous data is a sure sign of bad design.

  << in another project I bumped into, XML was being used to represent
     a quantity that had three discrete states. 

     THREE STATES CAN BE REPRESENTED IN TWO BITS

     But they chose XML - the declaration of a single state took about
     190 bytes - and they had *lots* of records, which they stored in a
     big data base.

     Now the data base was slow, so they bought more memory, it was
     still slow, so they wanted to go distributed - so they asked me
     since "Joe knows something about distributed programming" >>
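
To put rough numbers on the two-bit remark above, here is a minimal Python sketch (Python rather than Erlang, purely for illustration). The state names and the XML record layout are invented for the example and are not taken from the project described; the invented record is also much shorter than the 190 bytes quoted.

```python
# Hypothetical three-state field; the names are made up for illustration.
STATES = ("off", "on", "unknown")

def pack_states(states):
    """Pack state names into bytes, four 2-bit codes per byte."""
    out = bytearray((len(states) + 3) // 4)
    for i, name in enumerate(states):
        code = STATES.index(name)              # 0, 1 or 2: fits in two bits
        out[i // 4] |= code << (2 * (i % 4))
    return bytes(out)

def unpack_states(data, count):
    """Recover `count` state names from bytes produced by pack_states."""
    return [STATES[(data[i // 4] >> (2 * (i % 4))) & 0b11]
            for i in range(count)]

def xml_record(name):
    """An invented XML form of one record (assumption, not the real schema)."""
    return '<record><state value="%s"/></record>' % name

records = ["on", "off", "unknown", "on"] * 250   # one thousand records
packed = pack_states(records)
xml_bytes = sum(len(xml_record(r)) for r in records)
```

Four records share one byte in `packed`, while even this small invented XML form spends tens of bytes on each record.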

   Mindless use of XML is a sure sign of excruciatingly bad design.

   Idea - grade moderately difficult - XML should compress very nicely -
since the same tags get repeated over and over again, thus in LZSS
compression duplicated tags will appear as pointers.
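
As a rough illustration of tags turning into pointers, here is a small LZSS-style tokenizer in Python (not Erlang, and not a solution to the challenge in this mail). The window size, minimum and maximum match lengths, and the sample document are assumptions chosen for the sketch. A real LZSS stream also packs flag bits and fixed-width fields, which are left out here.

```python
def lzss_tokens(data, window=4096, min_match=3, max_match=18):
    """Greedy LZSS-style pass: int literals or (distance, length) pointers."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Brute-force search of the window for the longest earlier match.
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_match and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append((best_dist, best_len))   # back-reference
            i += best_len
        else:
            tokens.append(data[i])                 # literal byte
            i += 1
    return tokens

def lzss_expand(tokens):
    """Inverse of lzss_tokens; copies byte by byte so overlaps work."""
    out = bytearray()
    for t in tokens:
        if isinstance(t, tuple):
            dist, length = t
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(t)
    return bytes(out)

# Invented sample document with heavily repeated tags.
doc = b"<row><state>on</state></row>" * 4
tokens = lzss_tokens(doc)
pointers = [t for t in tokens if isinstance(t, tuple)]
```

Every entry in `pointers` marks a place where already-seen markup recurs, which is where a parser working on the compressed stream could hope to reuse an earlier result.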

   How about writing an XML parser that works directly from an LZSS
compressed XML stream *without* doing the decompression first. If you're
clever and cache the result of the parses of the things that the LZSS
pointers point to you might be able to write a very fast and compact
parser.

   I will give a bottle of whisky to the first person to send me a
correct Erlang program that does this.

 Cheers

/Joe


> -----Original Message-----
> From: James Hague [mailto:james.hague@REDACTED] 
> Sent: den 25 januari 2006 19:02
> To: Joe Armstrong (AL/EAB); erlang-questions@REDACTED
> Subject: Re: Build an erlang computer (was:Computers are fast)
> 
> > In my never ending quest for inefficiency I have seldom met a problem
> > which could not be solved in the twinkling of an eye (apart, that is,
> > from our friend that wanted to do O(10^19) computations to force some
> > crypto system :-)
> 
> How about parsing a 300+ megabyte XML file?  I have a lean 
> and mean XML parser in Erlang, one that only deals with a 
> strict subset of XML, and operates entirely on binaries.  On 
> 8 or 10 megabyte files, it's great, but on this monster--heh. 
>  After about 30 minutes the emulator dies with "Abnormal 
> Termination."  I suspect it's running out of memory.
> 
> It's interesting when these kinds of crazy problems come along :)
> 


