p2p summary (kind of)

Joe Armstrong joe@REDACTED
Thu Feb 14 11:00:29 CET 2002


> Hello everyone!
>
> After turning the peer2peer concept inside out over the last few days, I
> reached some conclusions that might (some of them at least) be of interest
> to you. The ones that are just random Greek ;-) just ignore them. The text
> is also available on the Wiki.
>
> - there are many protocols in the making, but none was really an "aha!"
> experience, except maybe Chord, recommended by Joe.
>
> - I don't have the time to build/implement a low-level protocol, even if
> that would give immediate reward by being able to connect to an already
> existing network

<ranting mode on>

No, the existing ones are crap :-) - Gnutella is like wandering into a random
room in a random town in a random country and shouting at a random
group of passers-by "has anybody got a fish?"

I've been thinking about the "discovery protocol" for some time - and
the answer is ..... "Dewey" 

Assume we have producers and consumers of information:
we need to discover who knows what.

If I have a car for sale the best way to sell it is to place an ad in the 
local paper in the "Pets section" - the ad might read

	Banana for sale
	one careful owner
	10,000$

This is how the internet works - crap - then we invent Google (a microscopic
improvement)

If we have a Key -> Value namespace (which is what the producer/consumer
paradigm is) then we must agree on the Keys and what they mean.

"car" means that tin thing that costs a lot, spews out poisonous gases,
despoils the environment and slows down personal transportation between
A and B.

"banana" means yellow thing whose skin you slip on.

So how do we agree on Keys? Use the Dewey decimal system, or the Library of
Congress cataloguing system - librarians have thought about this for yonks.

How do you find info in the p2p system? By *carefully* probing the peer
group and asking the right questions - by building catalogues and maintaining
trust networks (do I believe this information?)

JXTA says "Info is keyed by ten zillion bit (random) globally unique
identifiers."

Now I can't even say "banana for sale" when I want to sell my car.

Now I must say

	328fgfjghsfio37465o8ybamfo82347b3245325kjf741hkht for sale
	apply to 3985728356hfjahecuyrileuayct97kfkurfyselihfilaerh for
	more details

And to make it technically respectable I'd do it in XML and program it in Java.
But that *basically* is what JXTA is (at the most primitive level) - with not
even a hint as to HOW you go about finding the meaning of the GUIDs.
(I might even specify it in UML and hire fifty consultants for good measure
:-)

	No, my infrastructure will use the Dewey decimal system - and I will
keep databases of who knows what and how reliable the information is.
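
A minimal sketch of the sort of catalogue meant here, assuming keys are
Dewey-style class numbers written as lists of digits and that every entry
carries a trust score. The module name, the function names, the peers and
the class numbers are all made up for illustration (the numbers have not been
checked against the real Dewey schedule), and nothing here is taken from an
existing infrastructure.

```erlang
-module(dewey_catalogue).
-export([new/0, advertise/4, lookup/2, demo/0]).

%% Hypothetical sketch: the catalogue is a list of {Class, Peer, Trust},
%% where Class is a list of digits (e.g. [6,2,9]) and Trust is a float
%% between 0.0 and 1.0 saying how much we believe that peer.

new() -> [].

%% Record that Peer claims to know about Class, with our current Trust in it.
advertise(Class, Peer, Trust, Cat) when is_list(Class), is_float(Trust) ->
    [{Class, Peer, Trust} | Cat].

%% Return the peers whose advertised class is a prefix of the query
%% (i.e. the same subject or a more general one), most trusted first.
lookup(Query, Cat) ->
    Hits = [{Trust, Peer} || {Class, Peer, Trust} <- Cat,
                             lists:prefix(Class, Query)],
    [Peer || {_Trust, Peer} <- lists:reverse(lists:sort(Hits))].

%% Illustrative data only: three invented peers and invented class numbers.
demo() ->
    C0 = new(),
    C1 = advertise([6,2,9], car_dealer, 0.9, C0),
    C2 = advertise([6,3,4], fruit_stall, 0.8, C1),
    C3 = advertise([6,2], engineering_library, 0.6, C2),
    lookup([6,2,9,2], C3).
```

The point of the prefix match is that a question about a narrow subject can
still be answered by somebody who advertised the broader one, and the trust
score decides whom to ask first.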

</ranting mode on>

That feels better :-)


>
> - Erlang offers "for free" much of the basic functionality that is needed
> to create a p2p network, through the distribution mechanism. However,
> Erlang's actual design probably doesn't scale up and if trying that it
> might end up just as Gnutella did...

Erlang is a programming language - PLs don't scale up!

>
> - What I am mostly interested in is a general framework that will permit p2p
> applications to be built upon. This means that the basic services are to be
> at least: connectivity, routing and gatewaying, security, search (of any
> kind: for other peers, for data, for services/applications). The
> applications should be just plugins that use the connection provided, using
> their own protocol.

I've covered search (or resource discovery).

I am building such an infrastructure NOW - hope to post by end of next week.

>
> - The protocols in use tend to begin using XML. This is just because they
> must write
>
> <searchresult id="3">
> 	      <item id="4" name="file1" node="12.12.12.12:3333"/>
> 	      <item id="7" name="file3" node="12.132.12.13:2424"/>
> </searchresult>
>
> instead of
>
> {searchresult, [{id, 3}],
> 	       [
> 		{item, [{id, 4},{name, "file1"},
> 			{node, "12.12.12.12:3333"}], []},
> 		{item, [{id, 7},{name, "file3"},
> 			{node, "12.132.12.13:2424"}], []}
> 	       ]
> }
>
> This is really a matter of taste. Converting between the two is
> straightforward.
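
To make the "straightforward" claim concrete, here is a rough sketch of one
direction of the conversion (term to XML text). It assumes the
{Tag, Attributes, Children} shape used in the tuple quoted above, with
attribute values that are integers or strings, and it does no escaping of
special characters. The module and function names are invented for the
example.

```erlang
-module(term_xml).
-export([to_xml/1, demo/0]).

%% Assumed input shape: {Tag, Attrs, Children}, where Tag is an atom,
%% Attrs is a list of {Name, Value} and Children is a list of the same
%% kind of tuple. An element without children is written as <tag .../>.
to_xml({Tag, Attrs, []}) ->
    lists:flatten(["<", atom_to_list(Tag), attrs(Attrs), "/>"]);
to_xml({Tag, Attrs, Children}) ->
    T = atom_to_list(Tag),
    lists:flatten(["<", T, attrs(Attrs), ">",
                   [to_xml(C) || C <- Children],
                   "</", T, ">"]).

%% Render each attribute as: space, name, equals sign, quoted value.
attrs(Attrs) ->
    [[" ", atom_to_list(K), "=\"", value(V), "\""] || {K, V} <- Attrs].

%% Only integers and strings are handled in this sketch.
value(V) when is_integer(V) -> integer_to_list(V);
value(V) when is_list(V) -> V.

demo() ->
    to_xml({searchresult, [{id, 3}],
            [{item, [{id, 4}, {name, "file1"},
                     {node, "12.12.12.12:3333"}], []}]}).
```

Going the other way needs an XML parser, which is more work, but the data
model on both sides is the same tree.
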
>
> - Let us see how Erlang works for the 4 areas outlined above (I only guess
> some of this stuff, please fill in the right situation if you know I'm
> wrong):
>
> -- connectivity: it is automatic using the underlying distribution
> mechanism; but will it scale? I doubt it strongly. A fully connected net is
> not manageable (not with thousands of nodes), so the connections should be
> kept limited. This creates the need of relaying messages between nodes that
> aren't directly connected, because it would be very elegant if the present
> location transparency would be kept. I.E. are Pids enough for identifying
> processes on nodes that aren't connected?

No - you need IPs, ports etc. and STATE

>
> -- routing and gatewaying: nothing exists now that will help in this case,
> as far as I know. This is functionality that must be built in. One of the
> most important things is how to be able to bridge through firewalls, or
> over different kinds of networks.
>

Smile :-) - we could use steganography and communicate with gifs of yellow
bananas on port 80 :-)

> -- security: here we have a big can of worms... as Erlang works now it is
> fully open for anyone knowing the cookie. Some studies have been made, but
> since we are only talking about exchanging messages, not code, we probably
> don't need SafeErlang yet. Probably it would be enough with a node that has
> a modified net_kernel AND it doesn't allow for more than message passing
> (no remote spawns). I'm not sure if the latter can be achieved only via
> net_kernel. There is also another problem: how to get all nodes to have the
> same cookie? That might be possible to get around with a new net_kernel (if
> this control isn't buried deeper), and allow nodes with different cookies
> to connect, and possibly have different security policies for different
> cookies. This way a node can be a full node on the intranet, while being
> connected to the outside world too.

RSA with 2K byte keys + Diffie-Hellman. My RSA is on the Erlang web site.
Diffie-Hellman is trivial (almost :-)
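
To show why the key agreement part is "trivial (almost)", here is a toy
sketch of the textbook Diffie-Hellman exchange. It is not the RSA code
mentioned above, the module and function names are invented, and the
parameters (P = 23, G = 5) are far too small for real use. A real version
would need a large safe prime, properly random secrets and authentication of
the public values, which is where the "almost" comes in.

```erlang
-module(dh_sketch).
-export([mod_pow/3, public/3, shared/3, demo/0]).

%% Square-and-multiply modular exponentiation: B^E rem M, for E >= 0.
mod_pow(_B, 0, M) -> 1 rem M;
mod_pow(B, E, M) when E rem 2 =:= 0 ->
    H = mod_pow(B, E div 2, M),
    (H * H) rem M;
mod_pow(B, E, M) ->
    (B * mod_pow(B, E - 1, M)) rem M.

%% Each side publishes G^Secret rem P ...
public(G, P, Secret) -> mod_pow(G, Secret, P).

%% ... and raises the other side's public value to its own secret.
shared(TheirPublic, P, Secret) -> mod_pow(TheirPublic, Secret, P).

demo() ->
    P = 23, G = 5,           % toy group parameters, illustration only
    A = 4, B = 3,            % the two private exponents
    PubA = public(G, P, A),
    PubB = public(G, P, B),
    KeyA = shared(PubB, P, A),
    KeyB = shared(PubA, P, B),
    {PubA, PubB, KeyA, KeyB}.
```

Both sides end up with the same key without ever sending it, which is all
the exchange promises. Who you have agreed the key with is a separate
question.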

> -- search: this is mostly a p2p issue, so it isn't addressed in today's
> Erlang. A protocol needs to be defined and implemented, that will also rule
> the routing and gatewaying behaviour.
>
> - The big problem here is that there might be security issues that won't be
> noticed until it's too late. Because of that it is wiser to have a separate
> connection management, where we can more easily decide what's okay and
> what's not. This might ease up the task of bridging with different
> networks (instead of an IP socket we use another transport, or we go over
> HTTP). It won't be as elegant as Pid ! Msg, but I for one can live with
> send(Pid, Msg)
>
> :-)
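
A rough sketch of what a send(Pid, Msg) layer of that kind could look like,
assuming only two kinds of destination: a process we can reach directly, and
one we can only reach through a relay. The {via, Relay, Dest} address form,
the {relay, Dest, Msg} message and the module name are all invented for the
example. A real version would carry IPs, ports and connection state, and the
relay is where any policy checks would live.

```erlang
-module(p2p_send).
-export([send/2, relay_loop/0, demo/0]).

%% Hypothetical destination forms:
%%   Pid                 - a process we can reach directly
%%   {via, Relay, Dest}  - hand the message to Relay, which forwards to Dest
send(Pid, Msg) when is_pid(Pid) ->
    Pid ! Msg,
    ok;
send({via, Relay, Dest}, Msg) when is_pid(Relay) ->
    Relay ! {relay, Dest, Msg},
    ok.

%% A relay forwards whatever it is handed until it is told to stop.
%% Deciding what is okay to forward would happen here.
relay_loop() ->
    receive
        {relay, Dest, Msg} ->
            send(Dest, Msg),
            relay_loop();
        stop ->
            ok
    end.

%% Send a message to ourselves through a freshly spawned relay.
demo() ->
    Relay = spawn(fun relay_loop/0),
    ok = send({via, Relay, self()}, hello),
    Got = receive hello -> hello after 1000 -> timeout end,
    Relay ! stop,
    Got.
```

Because Dest can itself be a {via, ...} address, relays chain, which is
roughly what routing between nodes that are not directly connected needs.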


>
> - Of course, one can write p2p applications without any such platform
> underneath. But it's kind of a waste to address the same issues for every
> application, and the one that each one of them will necessarily meet is how
> to access nodes behind firewalls.
>
> What do you people think? Am I babbling, or is there a trace of rational
> thinking?

No, not at all - I babble with you - I plan to register a new domain name and
host a myp2p.org (or whatever) at SICS with an Erlang P2P infrastructure
real soon -

Then you can all help write the apps.

The first real app (on top of IRC/chat/file sharing - which is easy) is
backup.

>
> best regards,
> Vlad
>

Have a nice one (as Ann-Louise used to say)

/Joe

<<to work - now I have to write a project proposal 
to keep /// happy - c'est la vie>>


