Dynamic languages are the future

ke han ke.han@REDACTED
Thu Aug 31 08:03:07 CEST 2006


From 1998 to 2002 my company built an aggressive app framework in
Java and a vertical app to leverage it.  We ended up with about
500,000 lines of code.  Over half of this replaced what should
have been good enough in the core Java libraries.
By the time we were done, we had rewritten just about every core Java
class, as the JDK equivalent was poorly designed or implemented.  We
had to rewrite serialization, make thread-safe collections (and
thread-safe just about everything), and roll our own non-blocking I/O.
Even after Sun came out with nio, we found ours was still much more
time and space efficient.  We ended up with something that showed Java
as a language can be very efficient and scalable... but the
libraries... not so much ;-)
ke han


On Aug 31, 2006, at 12:11 PM, Richard A. O'Keefe wrote:

> fbg111 <fbg111@REDACTED> wrote:
> 	Same problem
> 	http://www.joelonsoftware.com/articles/ThePerilsofJavaSchools.html
> 	in America.  More
> 	http://paulgraham.com/popular.html
> 	good writings  on that topic at Paul Graham's site.
>
> It must be synchronicity.
> We've had a thread about strings that looks like reviving.
> I'm in the middle of writing some Java classes for a 4th year  
> student to use.
> And now here's mention of Java.
>
> The 4th year student is supposed to be investigating a topic in
> information retrieval; I wanted to see if multicore systems could do
> IR faster, and Andrew Trotman came up with the key concept for how to
> structure an index so that this might actually work.
>
> The student was given a tiny (< 500 line) IR engine in C that reads
> an XML document collection and builds an index (~ 300 lines) and that
> reads an index and queries (~ 300 lines; the two programs share some
> code).  It's small, dead simple, and reasonably fast.  It can index
> the test collection in under 3 minutes.  The student basically got
> nowhere modifying it because despite having had C in 3rd year, she
> only really knows Java.  Her Java rewrite of my C code (without the
> XML stuff; the document collection had to be reformatted) takes
> eight hours.
>
> Eight hours!  That's 160 times slower than C!
>
> Just for grins, I rewrote the index builder in AWK.  44 lines of AWK.
> (It works on the same reformatted document collection as the student's
> code, uses built-in hash tables, and writes numbers in ASCII, not
> binary.)
> Her Java program was more than 50 times slower than AWK.
>
> Profiling to the rescue:  "java -Xprof BuildIndex wsj.data".
> It turned out that practically all the time was going in
> RandomAccessFile.  Guess what:  RandomAccessFile doesn't do any
> buffering, so each
>
>     f.writeInt(x)
>
> turned into 4 calls to f.write((byte)(x >> ...)), and each of *those*
> is a call to a native method, involving a switch from Java to C and
> back.
>
> Just adding a few lines of code to buffer stuff into a byte array and
> flushing that every so often (just like using fwrite() in C would...)
> speeded the program up by a factor of 10.
>
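
A minimal sketch of the buffering fix described above, not the student's actual code.  BufferedIntWriter is a hypothetical helper name, and the 64 KB buffer size is an arbitrary assumption.  It packs each int into a byte array in the same big-endian order that RandomAccessFile.writeInt uses, and calls write(byte[], int, int) only when the array fills up, so the Java-to-native crossing happens once per buffer instead of four times per int.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper: collects ints in a byte array and writes them in bulk. */
final class BufferedIntWriter implements AutoCloseable {
    private final RandomAccessFile file;
    private final byte[] buf;
    private int pos = 0;              // number of bytes currently held in buf

    BufferedIntWriter(RandomAccessFile file, int capacityBytes) {
        if (capacityBytes < 4) {
            throw new IllegalArgumentException("capacity must be at least 4 bytes");
        }
        this.file = file;
        this.buf = new byte[capacityBytes];
    }

    /** Appends x high byte first, the layout RandomAccessFile.writeInt produces. */
    void writeInt(int x) throws IOException {
        if (buf.length - pos < 4) {
            flush();                  // no room for 4 more bytes: empty the buffer
        }
        buf[pos++] = (byte) (x >>> 24);
        buf[pos++] = (byte) (x >>> 16);
        buf[pos++] = (byte) (x >>> 8);
        buf[pos++] = (byte) x;
    }

    /** One bulk write for everything buffered so far. */
    void flush() throws IOException {
        file.write(buf, 0, pos);
        pos = 0;
    }

    @Override
    public void close() throws IOException {
        try {
            flush();
        } finally {
            file.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("index", ".bin");
        tmp.deleteOnExit();
        try (BufferedIntWriter out =
                 new BufferedIntWriter(new RandomAccessFile(tmp, "rw"), 64 * 1024)) {
            for (int i = 0; i < 1_000_000; i++) {
                out.writeInt(i);
            }
        }
        System.out.println(tmp.length() + " bytes written");
    }
}
```

Because the byte order matches writeInt, a file written this way can still be read back with RandomAccessFile.readInt.  Any seek() on the underlying file would need a flush() first; the sketch leaves that to the caller.
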
> The mawk version is still sniggering at the Java version, but not as
> loudly.
>
> Of course this says nothing about Java AS A LANGUAGE.
> It's a library issue.  But in real Java, practically _everything_ is
> a library issue.  There's another student doing a GA+IR project whose
> program was speeded up by a large factor by another lecturer.  Same
> thing:  run the program with -Xprof, spot that the time is going in a
> library class (ArrayList, as it happens), rewrite to use plain arrays,
> time goes way down.
>
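
A hedged illustration of the ArrayList-to-array rewrite mentioned above.  The message does not say what the GA+IR program stored, so this assumes a list of boxed Integer values; the class and method names are hypothetical.  The list version goes through get() and unboxes an Integer per element, while the array version reads contiguous ints, at the cost of redoing the grow-on-append logic by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical comparison of a boxed list with a hand-grown int array. */
final class ArrayVsList {
    /** List version: each element is an Integer object fetched via get(). */
    static long sumList(List<Integer> values) {
        long total = 0;
        for (int i = 0; i < values.size(); i++) {
            total += values.get(i);   // method call plus unboxing per element
        }
        return total;
    }

    /** Array version: only the first count slots of values are in use. */
    static long sumArray(int[] values, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += values[i];
        }
        return total;
    }

    /** Stores x at index count, doubling the array first if it is full. */
    static int[] append(int[] values, int count, int x) {
        if (count == values.length) {
            values = Arrays.copyOf(values, Math.max(4, count * 2));
        }
        values[count] = x;
        return values;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        int[] array = new int[0];
        int count = 0;
        for (int i = 0; i < 100_000; i++) {
            list.add(i);
            array = append(array, count++, i);
        }
        System.out.println(sumList(list) == sumArray(array, count));
    }
}
```

Whether this pays off in a given program is something only a profile can say, which is the point the message makes about -Xprof.
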
> One of the things that makes Erlang a practical language for real
> applications is the tools for working with Erlang, like 'eprof' and
> 'cover'.
>



