p2p summary (kind of)

Joe Armstrong joe@REDACTED
Thu Feb 14 11:00:29 CET 2002

> Hello everyone!
> After turning inside out the peer2peer concept for the last days, I reached
> some conclusions that might (some of them at least) be of interest for you.
> The ones that are just random greek ;-) just ignore them. The text is also
> available on the Wiki.
> - there are many protocols in the making, but none was really an "aha!"
> experience, except maybe Chord, recommended by Joe.
> - I don't have the time to build/implement a low-level protocol, even if
> that would give immediate reward by being able to connect to an already
> existing network

<ranting mode on>

No, the existing ones are crap :-) - Gnutella is like wandering into a random
room in a random town in a random country and shouting at a random
group of passers-by "has anybody got a fish?"

I've been thinking about the "discovery protocol" for some time - and
the answer is ..... "Dewey" 

Assume we have producers and consumers of information
we need to discover who knows what.

If I have a car for sale the best way to sell it is to place an ad in the 
local paper in the "Pets section" - the ad might read

	Banana for sale
	one careful owner

This is how the internet works - crap - then we invent Google (a microscopic 

If we have a  Key -> Value namespace (which is what the producer/consumer 
paradigm is) then we must agree on the Keys and what they mean.

"car" means that tin thing that costs a lot, spews out poisonous gases,
despoils the environment and slows down personal transportation between
A and B.

"banana" means yellow thing whose skin you slip on.

So how do we agree on Keys? (Use the Dewey Decimal system, or the Library of
Congress cataloguing system - librarians have thought about this for yonks.)

How do you find info in the p2p system - by *carefully* probing the peer 
group and asking the right questions - by building catalogues and maintaining 
trust networks (do I believe this information?)

JXTA says "Info is keyed by ten zillion bit (random) globally unique 

Now I can't even say "banana for sale" when I want to sell my car

Now I must say

	328fgfjghsfio37465o8ybamfo82347b3245325kjf741hkht for sale
             apply to 3985728356hfjahecuyrileuayct97kfkurfyselihfilaerh for   
             more details

And to make it technically respectable I'd do it in XML and program it in Java.
But that *basically* is what JXTA is (at the most primitive level) - with not
even a hint as to HOW you go about finding the meaning of the GUIDs.
(I might even specify it in UML and hire fifty consultants for good measure 

	No, my infrastructure will use the Dewey Decimal system - and I will
keep databases of who knows what and how reliable the information is
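
A minimal sketch (Python, not part of the original post) of what such a
"who knows what, and how reliable is it" catalogue might look like. The
class name, the method names, the 0..1 trust score and the prefix-based
lookup are all assumptions made for illustration; the class numbers are
placeholders and have not been checked against the real Dewey schedules.

```python
from collections import defaultdict


class Catalogue:
    """Maps agreed classification keys to the peers that claim to know them."""

    def __init__(self):
        # key (a Dewey-style class number as a string) -> {peer: trust score}
        self._entries = defaultdict(dict)

    def advertise(self, key, peer, trust=0.5):
        """Record that a peer claims knowledge of a key (hypothetical default trust)."""
        self._entries[key][peer] = trust

    def rate(self, key, peer, delta):
        """Nudge trust up or down after checking an answer, clamped to [0, 1]."""
        current = self._entries[key].get(peer, 0.5)
        self._entries[key][peer] = min(1.0, max(0.0, current + delta))

    def who_knows(self, key):
        """Peers for a key or any of its sub-classes, most trusted first."""
        matches = {}
        for known_key, peers in self._entries.items():
            # Treat a shorter class number as covering everything below it.
            if known_key.startswith(key):
                for peer, trust in peers.items():
                    matches[peer] = max(trust, matches.get(peer, 0.0))
        return sorted(matches, key=lambda p: (-matches[p], p))


cat = Catalogue()
cat.advertise("629.222", "peer_a")   # placeholder class number for "car"
cat.advertise("629.222", "peer_b")
cat.advertise("634.772", "peer_c")   # placeholder class number for "banana"
cat.rate("629.222", "peer_b", 0.3)   # peer_b gave a good answer
print(cat.who_knows("629"))
```

The point of the agreed key is that a query for the broad class "629" can be
narrowed or widened without anyone having to guess a random identifier.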

</ranting mode on>

That feels better :-)

> - Erlang offers "for free" much of the basic functionality that is needed
> to create a p2p network, through the distribution mechanism. However,
> Erlang's actual design probably doesn't scale up and if trying that it
> might end up just as Gnutella did...

Erlang is a programming language - PLs don't scale up!

> - What I am mostly interested in is a general framework that will permit p2p
> applications to be built upon. This means that the basic services are to be
> at least: connectivity, routing and gatewaying, security, search (of any
> kind: for other peers, for data, for services/applications). The
> applications should be just plugins that use the connection provided, using
> their own protocol.

I've covered search (or resource discovery)

I am building such an infrastructure NOW - hope to post by end of next week

> - The protocols in use tend to begin using XML. This is just because they
> must write
> <searchresult id="3">
> 	      <item id="4" name="file1" node=""/>
> 	      <item id="7" name="file3" node=""/>
> </searchresult>
> instead of
> {searchresult, [{id, 3}],
> 	       [
> 		{item, [{id, 4},{name, "file1"},
> 			{node, ""}], []},
> 		{item, [{id, 7},{name, "file3"},
> 			{node, ""}], []}
> 	       ]
> }
> This is really a matter of taste. Converting between the two is
> straightforward.
> - Let us see how Erlang works for the 4 areas outlined above (I only guess
> some of this stuff, please fill in the right situation if you know I'm
> wrong):
> -- connectivity: it is automatic using the underlying distribution
> mechanism; but will it scale? I doubt it strongly. A fully connected net is
> not manageable (not with thousands of nodes), so the connections should be
> kept limited. This creates the need of relaying messages between nodes that
> aren't directly connected, because it would be very elegant if the present
> location transparency would be kept. I.E. are Pids enough for identifying
> processes on nodes that aren't connected?

No - need IPs, ports etc. and STATE
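
On the XML point quoted further up: the claim that converting between the
XML form and the tuple form is straightforward can be sketched as below.
This is an illustration in Python rather than Erlang; the function names are
made up here, attribute values stay strings (so id is "3", not 3), and any
text between tags is ignored.

```python
import xml.etree.ElementTree as ET


def xml_to_term(element):
    """Convert an element to a (tag, attributes, children) tuple."""
    attrs = [(name, value) for name, value in element.attrib.items()]
    children = [xml_to_term(child) for child in element]
    return (element.tag, attrs, children)


def term_to_xml(term):
    """Convert a (tag, attributes, children) tuple back to an element."""
    tag, attrs, children = term
    element = ET.Element(tag, dict(attrs))
    for child in children:
        element.append(term_to_xml(child))
    return element


doc = (
    '<searchresult id="3">'
    '<item id="4" name="file1" node=""/>'
    '<item id="7" name="file3" node=""/>'
    '</searchresult>'
)
term = xml_to_term(ET.fromstring(doc))
print(term)
print(ET.tostring(term_to_xml(term), encoding="unicode"))
```

The tuple shape mirrors the {Tag, Attributes, Children} layout of the quoted
Erlang term, which is why the round trip loses nothing for attribute-only
documents like this one.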

> -- routing and gatewaying: nothing exists now that will help in this case,
> as far as I know. This is functionality that must be built in. One of the
> most important things is how to be able to bridge through firewalls, or
> over different kind of networks.

Smile :-) - we could use steganography and communicate with gifs of yellow
bananas on port 80 :-)

> -- security: here we have a big can of worms... as Erlang works now it is
> fully open for anyone knowing the cookie. Some studies have been made, but
> since we are only talking about exchanging messages, not code, we probably
> don't need SafeErlang yet. Probably it would be enough with a node that has
> a modified net_kernel AND it doesn't allow for more than message passing
> (no remote spawns). I'm not sure if the latter can be achieved only via
> net_kernel. There is also another problem: how to get all nodes have the
> same cookie? That might be possible to get around with a new net_kernel (if
> this control isn't buried deeper), and allow nodes with different cookies
> to connect, and possibly have different security policies for different
> cookies. This way a node can be a full node on the intranet, while being
> connected to the outside world too.

RSA with 2K byte keys + Diffie-Hellman. My RSA is on the Erlang web site.
Diffie-Hellman is trivial (almost :-)
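
To show why the exchange itself is "trivial", here is a minimal sketch in
Python (not the Erlang code mentioned above). The prime and base are toy
values chosen for illustration only, and the function names are made up
here. The "almost" is real: a bare exchange like this is not authenticated,
so it needs something like a signature on the public values to stop a
man-in-the-middle.

```python
import secrets

# Toy parameters, for illustration only: 2**127 - 1 is prime, but it is far
# too small for real use. Real systems use standardised, much larger groups.
P = 2**127 - 1
G = 3  # arbitrary base chosen for the sketch


def make_keypair():
    """Pick a random private exponent and derive the public value G^x mod P."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private, peer_public):
    """Both sides compute G^(xy) mod P from their own secret and the peer's public value."""
    return pow(peer_public, own_private, P)


alice_priv, alice_pub = make_keypair()
bob_priv, bob_pub = make_keypair()

# Only alice_pub and bob_pub travel over the wire.
assert shared_secret(alice_priv, bob_pub) == shared_secret(bob_priv, alice_pub)
```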

> -- search: this is mostly a p2p issue, so it isn't addressed in today's
> Erlang. A protocol needs to be defined and implemented, that will also rule
> the routing and gatewaying behaviour.
> - The big problem here is that there might be security issues that won't be
> noticed until it's too late. Because of that it is wiser to have a separate
> connection management, where we can more easily decide what's okay and
> what's not. This might ease up the task of bridging with different
> networks (instead of an IP socket we use another transport, or we go over
> HTTP). It won't be as elegant as Pid ! Msg, but I for one can live with
> send(Pid, Msg)
> :-)

> - Of course, one can write p2p applications without any such platform
> underneath. But it's kind of a waste to address the same issues for every
> application, and the one that each one of them will necessarily meet is how
> to access nodes behind firewalls.
> What do you people think? Am I babbling, or is there a trace of rational
> thinking?

No, not at all - I babble with you - I plan to register a new domain name and
host a myp2p.org (or whatever) at SICS with an Erlang P2P infrastructure
real soon -

Then you can all help write the apps.

The first real app (on top of IRC/chat/file sharing - which is easy) is

> best regards,
> Vlad

Have a nice one (as Ann-Louise used to say)


<<to work - now I have to write a project proposal 
to keep /// happy - c'est la vie>>
