[erlang-questions] rpc is bad? (was Re: facebook chat server)

Steve Vinoski vinoski@REDACTED
Sat May 24 23:37:01 CEST 2008


On 5/23/08, Ben Hood <0x6e6562@REDACTED> wrote:
> Steve,
>
>  Great post!

Thanks. :-)

>  On 23 May 2008, at 23:24, Steve Vinoski wrote:
>
>  > Message queuing systems work well because (in no particular
>  > order):
> .....
>
>  > * payloads need not conform to some made-up IDL type system
>  > * getting two different messaging systems to interoperate is easier
>  > than getting two different RPC or distributed object systems to
>  > interoperate
>
> Just out of interest's sake, in your experience, is there a *right*
>  way to interpret the payload of the message?
>
>  With IDL/WSDL you know ahead of time how to decode the payload, if you
>  defer it to a *dynamic* approach, IMHO you have to have some kind of a
>  priori knowledge of the structure of the data and have to be able to
>  create instances of those data structures on the fly in your target
>  language.

Some sort of prior knowledge of what you're receiving is always
required, but such knowledge can contain many levels of indirection.
For example, you might dynamically load on demand the code necessary
to interpret the payload, perhaps based on a received key or something
like that, and so approaches can be extremely dynamic if desired. It
all depends on what you're trying to achieve, of course.
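
To make the "load the code on demand, based on a received key" idea
concrete, here is a minimal sketch in Python. The DECODERS table, the
key names, and the decode() helper are all hypothetical, and
standard-library modules stand in for application-specific decoders
only so that the sketch runs as-is.

```python
import importlib

# Hypothetical table mapping a key carried with the message to the module
# and function able to interpret the payload. Real systems would point at
# their own decoder modules; stdlib modules are used here as stand-ins.
DECODERS = {
    "json": ("json", "loads"),
    "b64": ("base64", "b64decode"),
}


def decode(key, payload):
    """Import the decoder registered for `key` on demand and apply it."""
    module_name, func_name = DECODERS[key]
    module = importlib.import_module(module_name)  # loaded only when needed
    return getattr(module, func_name)(payload)


order = decode("json", '{"item": "cat", "qty": 2}')
raw = decode("b64", "aGk=")
```

The receiver still needs prior knowledge, but only of the key-to-decoder
mapping; the decoding code itself is one level of indirection away.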

In IDL-based systems, the IDL language typically defines primitive
data types that can be combined into user-defined types like structs.
The system's protocol defines how these data types, both primitive and
user-defined, are to be written to and read from messages. As long as
both sides of the wire agree on the IDL types, the infrastructure
takes care of proper payload marshaling, and all is well.

In REST, payload types are indicated by media types. With the HTTP
flavor of REST, MIME types serve as media types. As long as sender and
receiver can agree on the MIME types for the messages they exchange,
and assuming both sides can properly encode and decode messages
containing those types, all is well.
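
As a rough illustration of decoding by media type, the following
Python sketch picks a decoder from the Content-Type value alone. The
function name and the particular set of supported types are
assumptions made for the example, not part of any specific framework.

```python
import json
from urllib.parse import parse_qs


def decode_by_media_type(content_type, body):
    """Decode `body` (bytes) according to the declared media type."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body)
    if media_type == "application/x-www-form-urlencoded":
        return parse_qs(body.decode("ascii"))
    if media_type == "text/plain":
        return body.decode("utf-8")
    # An unknown type means sender and receiver have no agreement.
    raise ValueError("unsupported media type: " + media_type)


pet = decode_by_media_type("application/json; charset=utf-8", b'{"name": "rex"}')
form = decode_by_media_type("application/x-www-form-urlencoded", b"name=rex&age=3")
```

Nothing here is generated from a shared interface definition; the only
thing the two sides share is the meaning of the media type.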

In the first example there usually isn't any type information or extra
information of any sort encoded into the messages; receivers are
assumed to know what to expect, and it's assumed both sender and
receiver contain equivalent and compatible code artifacts, typically
auto-generated from the same IDL specifications, for interpreting
message payloads.

In the second example, sender and receiver agree on the MIME type,
thereby agreeing on the payload definition. MIME types are globally
known, so you can build totally independent applications that can
later exchange messages based on those MIME types. The payload itself
might contain extra information to help with decoding or it might not;
it all depends on the MIME type definition, to which both sender and
receiver are expected to conform. Coupling here is much less than with
the first approach, as I explained in this recent article:

<http://computer.org/portal/pages/dsonline/2008/04/w2tow.xml>

These are but two examples, but many variations on these themes and
others have been employed in many different distributed systems.

As for data representation within the application, that's a whole
different issue, and it gets into the impedance mismatch problems I
talked about. Take XML or JSON, for example. Normally sent over the
wire as text, they are represented in different programming languages
differently, each representation preferably making the data as natural
as possible to use within that language. A JavaScript object on one
side of the wire already *is* JSON, for example, with zero impedance
mismatch, but when sent to Erlang, it typically becomes a mixture of
tuples, lists, atoms, and strings, with some degree of impedance
mismatch.
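
The difference in representation can be sketched in Python by decoding
the same JSON text two ways. The "tagged" form below is only loosely
modeled on the tagged tuple-and-list shape that Erlang JSON decoders
tend to produce; the "struct" tag and the lookup() helper are
illustrative assumptions, not any particular library's output.

```python
import json

text = '{"name": "pet", "tags": ["cat", "dog"]}'

# Natural representation: the decoder hands back native dicts and lists.
natural = json.loads(text)


def as_tagged_pairs(pairs):
    """Represent a JSON object as a tagged tuple holding key/value pairs."""
    return ("struct", list(pairs))


# Same text, but every object becomes ("struct", [(key, value), ...]).
tagged = json.loads(text, object_pairs_hook=as_tagged_pairs)


def lookup(term, key):
    """Find `key` in the tagged form; a helper the natural form never needs."""
    _tag, pairs = term
    for k, v in pairs:
        if k == key:
            return v
    raise KeyError(key)


name_natural = natural["name"]
name_tagged = lookup(tagged, "name")
```

Both values carry the same data; the extra helper needed for the tagged
form is a small instance of the mismatch.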

CORBA is one example of the IDL-based systems described above, so it
expects the receiver to know what to expect in each message, but it
does contain one interesting data type as far as representation goes:
the Any type. An Any is a (type, value) pair. The type is represented
by a TypeCode, which is essentially a pass-by-value CORBA object. The
value is some sort of language representation of an IDL type. When
marshaled, both type and value are included. The receiver knows only
to expect an Any because that's all the IDL specifies, but that
receiver might not have compile-time knowledge of the type the Any
contains. For example, an event service has to deal with event data
represented as Any; any event producer application can connect to the
service and deliver whatever data it wants inside its Anys,
and the event service has to be able to receive those events and then
send them back out to any interested subscribers. (Obviously, the
event service can't be recompiled every time someone invents a new IDL
type.) This gets real interesting in a language like C++ -- how do you
represent a type like a struct for which you have no compile-time
knowledge? One answer is that you represent it as a list of the
primitive IDL types that make up the struct, for which your
application will always have compile-time knowledge. The Any's
TypeCode can be traversed to determine all such primitives. Another
approach is to keep the Any data in marshaled form until it reaches a
point that knows how to unmarshal it. Implementing all that support in
a way that's reasonably efficient can be fun and challenging, but it's
an example that shows there could be many different representations
for the same data type, depending on context, even within just one
language.
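
The "list of primitives" representation can be sketched as follows.
This is a toy model in Python, not the CORBA API: the Primitive,
Struct, and Any classes and the flatten() function are hypothetical
stand-ins for TypeCode traversal, and a struct value is assumed to
arrive as a tuple of member values in declaration order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    kind: str  # e.g. "long", "string"


@dataclass(frozen=True)
class Struct:
    name: str
    members: tuple  # ((member_name, typecode), ...)


@dataclass
class Any:
    typecode: object  # a Primitive or a Struct
    value: object


def flatten(typecode, value, path=""):
    """Walk the type description and list every primitive it contains."""
    if isinstance(typecode, Primitive):
        return [(path, typecode.kind, value)]
    out = []
    for (member_name, member_tc), member_value in zip(typecode.members, value):
        member_path = path + "." + member_name if path else member_name
        out.extend(flatten(member_tc, member_value, member_path))
    return out


point = Struct("Point", (("x", Primitive("long")), ("y", Primitive("long"))))
event = Struct("Event", (("label", Primitive("string")), ("where", point)))
received = Any(event, ("click", (3, 4)))
fields = flatten(received.typecode, received.value)
```

The receiver never needed a compiled Event or Point type; the type
description traveling with the value was enough to take it apart.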

>  I ask this question because I took the following approach ( http://hopper.squarespace.com/blog/2008/5/22/pet-store-part-1.html
>   ) to the problem you are talking about, but am just questioning
>  whether my own approach has any merit at all.

I glanced over it and it looks fine to me, though I did see the word
"RPC" used there a few times ;-). You're using AMQP and RabbitMQ,
which is good. Ultimately, you have to choose some form for your
message payloads, and you chose Cotton; while I've never heard of
Cotton before and so don't know any details about it, it seems to fit
your application well.

--steve


