(Prolog + LISP + Erlang) with integration issues versus C++

Richard A. O'Keefe ok@REDACTED
Wed Sep 7 02:15:14 CEST 2005

Let me share with you something I discovered last night.
It had me roaring with laughter rather sadly.

Apparently one of the managers' arguments against Erlang &c was
"if C/C++ are good enough for Google, they're good enough for us."

Someone mentioned that Rob Pike is now at Google, and provided his URL:

I wondered, if Google hired an expert on operating system design (Plan 9)
and programming languages, what is he doing for them?  Answer:
    "He works on distributed systems, data mining, programming languages,
    and software development tools."               ^^^^^^^^^^^^^^^^^^^^^

So I started reading his papers.  The relevant one is

	Interpreting the Data: Parallel Analysis with Sawzall
	by Pike, Dorward, Griesemer, and Quinlan.

The paper describes an *interpreted* "typed-AWK" language called
Sawzall.  The context goes like this:

    Google have a Data Definition Language (not described in that
    paper, but apparently roughly the same kind of thing as ASN.1).
    "Although originally conceived for defining the messages
    communicated between servers, Google's <i>protocol buffers</i>
    are also used to describe the format of permanent records
    stored on disk.  They serve a purpose similar to XML, but have
    a much denser, binary, represesentation, and an external data
    description language (DDL) that a <i>protocol compiler</i> can
    translate into support code for a wide variety of languages."

This is the approach I recommended for gluing Lisp/Prolog/Erlang
components together:  invent and implement, or acquire, a common
data language (such as UBF or the rather Prolog-like one from CWI
or even ASN.1) and use an interface generator (= protocol compiler)
to generate the reading/writing code, so that if you want to add a
C or C++ component, you just add an interface generator for them.

To start with, them, this means that the bulk of the I/O code in
Google programs (other than actual page serving) is *not* written
in C or C++, nor is it *hand* written in "a wide variety of
languages", it is written once in DDL and shared by "a wide variety
of languages".

Now this fact alone tells us that Google *use* "a wide variety of languages".

Continuing with the context for Sawzall:

    "The data sets are often stored in ... the Google File System [which]
    provides a reliable distributed storage system that can grow to
    petabyte scale.  ...

    MapReduce is a library for applications that run on the [dsitributed
    load balanced cluster platform] [providing parallelism, distribution,
    fault tolerance, moving computations close to the data, &c]. ..."

This is the kind of "put the good stuff in a library" which OTP basically
is, and which the managers were suggesting should be written.
Clearly, you can do that, and if you want to know how much work it is,
look at OTP, count the lines, and do the COCOMO II or FPA calculations
for yourself.  GFS, Workqueue, and MapReduce provide a native C/C++
interface.  It is possible to write C++ code to do massively parallel
calculations on data, and sometimes Google do that.

    The basic kind of calculation that Google do with this involves
    *independently* processing each record (the kind of computation
    that's called "embarassingly parallel") followed by combining
    the results with commutative associative operations (like sum,
    filter, maximum, top N, &c).

These calculations are mostly straightforward, but they operate on
*amazing* volumes of data and may run for days even on a thousand CPUs.
They have got to be right, and they have got to be fast.

    "The query language, Sawzall, operates at about the level of a
    type-safe scripting language.  FOR PROBLEMS THAT CAN BE SOLVED
    a factor of ten or more - than the corresponding C++ code in MapReduce.
    Overall, Sawzall programs tend to be around 10 to 20 times shorter
    than the equivalent MapReduce programs in C++ and significantly
    easier to write."

Sawzall programs pick up the type of the data they are processing from
the DDL that describes it.

Here is a sample Sawzall program from the paper.
Its task is to find "for each web domain, which page in that domain
has the highest page rank?"

    proto "document.proto"
    max_pagerank_url: table maximum (1) [domain: string]
                         of url: string weight pagerank: int;
    doc: Document = input;
    url: string = doc.url;
    emit max_pagerank_url[domain(url)] <- url weight doc.pagerank;

That's it, complete.

What about performance?

    "Sawzall is about 1.6 times slower than interpreted Java,
    21 times slower than compiled Jva, and 51 times slower than
    compiled C++."

Ooh dear, 51 times slower than C++?  Surely such dreadful performance
must mean that it's completely useless.  After all, that's basically
what the managers were telling Dev Functional, not that
they actually had any _evidence_ that the performance would be worse.

Well, no, actually.

	"Although it has been deployed only for about 18 months,

You know, Google sounds like a really exciting place to be.  It sounds
as though they are not in the least stuffy, not afraid to try anything
new.  How shall I put it?  They sound like ENGINEERS!

Yes, but what about performance?

	"One measure of Sawzall's utility is how much data processing
	it does.  We monitored its use during the month of March 2005.
	During that time, on one dedicated ... cluster with 1500 Xeon
	CPUs, there were 32,580 Sawzall jobs launched, using an average
	of 220 machines each.  While running those jobs, 18,636 failures
	occurred (application failure, network outage, system crash, &c)
	that triggered rerunning some portion of the job.  The jobs
	read a total of 3.2x10**15 bytes of data (2.8PB) and wrote
	9.9x10**12 bytes (9.3TB) (demonstrating that the term "data
	reduction" has some resonance).  The average job therefore
	processed about 100GB.  The jobs collectively consumed almost
	exactly one machine-century."
So Google are shoving an almost unimaginable amount of data through
this language.  What's the secret?  There is so *much* data that the
jobs are I/O bound, making the Sawzall code 51 times faster would make
no difference whatever to over-all system performance.

So the Google lesson is the direct OPPOSITE of what the managers said:

    Google use "a wide variety of languages", not just C/C++.
    (In addition to Sawzall, Python is repeatedly mentioned.)

    Google have no hesitation in not only adopting a problem-specific
    language but inventing one.

    Using a mix of languages (C++ in the support libraries, DDL for
    the on-disc data structures, Sawzall for data reduction, sometimes
    SQL for final analysis of the reduced data) not only doesn't
    increase system or maintenance costs, it actually reduces them.

    In an I/O bound application, fifty times slower than C++ for the
    computational part may be more than fast enough.

    In a very large system, failures are common (1 for every 2 jobs,
    or 600+ per day) and automatic restart of affected computations
    can be a big help (think Erlang supervision trees).

More information about the erlang-questions mailing list