[erlang-questions] rpc is bad? (was Re: facebook chat server)

Steve Vinoski vinoski@REDACTED
Sat May 24 00:24:48 CEST 2008


On 5/23/08, Raoul Duke <raould@REDACTED> wrote:
> > from it a language-independent distributed system that will avoid all
>  > the impedance mismatch problems and distributed computing fallacies.
>  > The people who push such tools and approaches ought to know better as
>  > well.
>
> Steve, your experience and comments are very interesting - you are
>  doing us all a service with your articles, thank you :)

You're welcome, but thanks to you guys too -- this overall
conversation will likely serve as the basis for my next column. :-)

>  Presumably some people manage to mostly get away with using web
>  services, or WSDL (i know i've been required to set up such a system
>  in previous jobs) and i guess apparently don't hit some eventual "holy
>  cow the universe is imploding how did this ever work" kind of
>  enlightenment moment. Why is that, or when will it happen, or what is
>  the prerequisite?

That's an interesting question. The situation is, I think, similar to
Paul Graham's "Blub Paradox" <http://www.paulgraham.com/avg.html>
where someone using a hypothetical average programming language named
Blub never even realizes there are better languages to use instead
because they think only in Blub. If all you know is generating WSDL
from a heavily-annotated Java class in Eclipse, you might not ever
consider there are better ways of getting the job done.

>  I guess in other words, if I were to try to convince somebody else
>  that the tricks above won't really work, what sorta-concrete-ish use
>  cases would be good examples?

WSDL, IDL and all that sort of stuff is made for cross-language,
cross-system integration. For me, therefore, the concrete cases
involve business changes requiring new integrations, such as mergers,
acquisitions, reorganizations, partnerships, new 3rd-party systems or
even just new versions of 3rd-party systems, etc. When faced with such
situations, which, let's face it, aren't all that uncommon, someone who
"designed" their system in a vacuum and generated WSDL and such from
their Java or C# code will be unhappy when they realize their
supposedly standard Web services stuff doesn't actually work with the
Web services stuff the other guy is using.

One of the most effective forms of enterprise integration I've seen
over the years is publish/subscribe messaging. I worked many years for
CORBA vendors, and we'd often lose potential deals to messaging
systems. Message queuing systems work well because (in no particular
order):

* they don't pretend to be programming language procedure or method
calls, so they avoid the associated impedance mismatch problems
* they don't try to hide distributed systems issues
* coupling is low -- drop a message into a queue here, pick up a
message from a queue there
* queues can be persistent, or more generally, delivery guarantees can
be varied as needed
* asynchrony
* payloads need not conform to some made-up IDL type system
* getting two different messaging systems to interoperate is easier
than getting two different RPC or distributed object systems to
interoperate
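
To make a few of those points concrete, here is a minimal sketch in
Python. It assumes an in-process queue.Queue standing in for a real
broker and JSON text as the payload; the function names and the order
fields are made up purely for illustration. The producer and consumer
never call each other, they share only the queue and the message
format, and the consumer ignores fields it doesn't know about.

```python
import json
import queue
import threading


def producer(q, orders):
    """Drop self-describing messages into the queue; never call the consumer."""
    for order in orders:
        q.put(json.dumps(order))  # payload is plain JSON text, no IDL types
    q.put(None)  # sentinel: tells the consumer there is nothing more to read


def consumer(q, results):
    """Pick messages up whenever they arrive, independently of the producer."""
    while True:
        raw = q.get()
        if raw is None:
            break
        msg = json.loads(raw)
        results.append(msg["id"])  # extra fields such as "gift" are ignored


def run_demo():
    q = queue.Queue()
    results = []
    worker = threading.Thread(target=consumer, args=(q, results))
    worker.start()
    producer(q, [{"id": 1, "item": "book"},
                 {"id": 2, "item": "pen", "gift": True}])
    worker.join()
    return results
```

Calling run_demo() returns the ids the consumer saw. A real messaging
system would add persistence and configurable delivery guarantees on
top of this, which the sketch doesn't attempt to show.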

The problem with messaging systems, though, is that traditionally
they've been quite expensive. Thankfully, I believe AMQP
(<http://amqp.org/>) solves that issue nicely, and of course Erlang is
the perfect way to implement it, which Alexis, Tony, and the rest of
the RabbitMQ guys have already done (<http://www.rabbitmq.com/>).

>  And, to tie it back to Erlang, do you think that Erlang helps in some
>  way? (Presumably because it doesn't give you a mode where you think
>  things are instantaneous+synchronous by default.)

Let's say we view an enterprise system or even the web as an
integrated set of cells, where cells are ideally as loosely coupled as
possible. Within a cell, we can hopefully use whatever language or
technology we want in order to provide or consume services to/from
other cells with no fear of our technology or implementation choices
leaking across to other cells. I used the term "cell" rather than
"host" here, BTW, because distribution may be, and in fact most likely
is, employed not only between cells but also within cells. For
example, consider Amazon.com's S3 service as a cell; it consists of
large server farms in the back end which interoperate and communicate
using whatever software, systems, and protocols individual teams
within Amazon.com might choose to use, but users access the service
from their own cells via HTTP. For intra-enterprise non-web systems,
this concept also holds, because you might have, for example, a
replicated J2EE service running on a set of clustered hosts for
availability and reliability, talking to a replicated database in
another tier. Large enterprises would consist of many such cells, most
of them visible only internally on their own intranets.

In such situations, Erlang helps in at least two ways:

1. Implementing cells using Erlang is typically much easier than
implementing them with Java or C++ or C# or whatever. Not only is the
language simpler, but because Erlang/OTP already has many of the hard
parts of distribution and concurrency effectively built in, it allows
cells to be developed and implemented in less time with less code and
at lower cost. For heterogeneous cells, Erlang ports and FFIs, such as
Jinterface for Java systems, also make it possible to join Erlang
systems with other existing systems within the same cell.

2. Cells presumably integrate across fairly standard protocols, such
as HTTP, IIOP, or AMQP. Erlang is superior here too -- it already
supports a variety of standard protocols, and it makes writing new
clients and servers for other such protocols relatively easy.
Furthermore, it makes it easier and less expensive to write the
systems that sit at the edges of cells and integrate with other cells
such that those systems are scalable, reliable, and highly available.

hope this helps,
--steve


