[erlang-questions] Twoorl: an open source Twitter clone

Wed Jun 4 01:21:37 CEST 2008

The stream of incoming tweets is bursty. Even if the stream were level, the
backend load for storing and retrieving messages would look nonlinear, because
each message's cost depends on how many followers its poster has. So the system needs a lot of
expensive slack capacity to provide a reasonable quality of service, and
this comes on top of a growing user base.
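
To make the follower dependence concrete, here is a minimal sketch. It assumes a fan-out-on-write design in which each tweet is copied into every follower's queue; that is one possible design, not necessarily what Twitter or Twoorl does. The user names, follower counts, and the fanout_writes helper are all made up for illustration.

```python
def fanout_writes(tweets, followers):
    """Count queue writes if each tweet is copied to every follower's queue."""
    return sum(len(followers.get(poster, ())) for poster in tweets)

# Hypothetical follower sets: one popular poster and two small ones.
followers = {
    "alice": {"u%d" % i for i in range(1000)},
    "bob": {"alice", "carol"},
    "carol": {"bob"},
}

# Two time windows with the same number of incoming tweets (3 each).
quiet_window = ["bob", "carol", "bob"]
busy_window = ["alice", "alice", "bob"]

quiet_load = fanout_writes(quiet_window, followers)  # 2 + 1 + 2 writes
busy_load = fanout_writes(busy_window, followers)    # 1000 + 1000 + 2 writes
```

The incoming rate is identical in both windows, but the backend write count differs by a factor of several hundred, which is why slack capacity has to be sized for who is posting and not only for how many tweets arrive.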

Whether the answer is to hammer a database to retrieve a user's message
queue, or to burn storage, I'm wondering if the structure of the social
graph might suggest a way of distributing the storage and workload--stuff
like the number, size and stability of connected components, and the
feasibility of splitting larger components into communities, with the goal
of maximizing the probability that a message delivered to a server will be
retrieved by another user on that server. But I suppose that could get messy
quickly as new cross-server clusters form on the graph. Anyway, I'd be
curious to see an analysis of real data on the dynamics of Twitter's message
stream and its social graph.
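
Lacking real data, a rough sketch of the kind of measurement I have in mind follows. It assumes the social graph is available as a list of (follower, poster) pairs and that each user is pinned to one server. The edge list, the server assignment, and both function names are hypothetical, and the fraction of same-server edges is only a crude proxy for the retrieval probability, since it weights every follow edge equally.

```python
from collections import defaultdict

def connected_components(edges):
    """Group users into components, treating follow edges as undirected."""
    neighbours = defaultdict(set)
    for follower, poster in edges:
        neighbours[follower].add(poster)
        neighbours[poster].add(follower)
    seen, components = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # iterative depth-first search from this user
            user = stack.pop()
            if user in component:
                continue
            component.add(user)
            stack.extend(neighbours[user] - component)
        seen |= component
        components.append(component)
    return components

def local_delivery_fraction(edges, server_of):
    """Fraction of (follower, poster) edges whose two users share a server."""
    if not edges:
        return 0.0
    local = sum(1 for f, p in edges if server_of[f] == server_of[p])
    return local / len(edges)

# Hypothetical graph with two separate groups of users.
edges = [("a", "b"), ("b", "a"), ("c", "a"), ("d", "e"), ("e", "d")]
components = connected_components(edges)

# Place each component on its own server.
server_of = {user: i for i, comp in enumerate(components) for user in comp}
before = local_delivery_fraction(edges, server_of)

# One new follow across the groups: the components merge, and the old
# placement now has a cross-server edge.
grown = edges + [("c", "d")]
after = local_delivery_fraction(grown, server_of)
merged = connected_components(grown)
```

With one server per component every delivery is local, but a single new cross-group follow merges the components and leaves the existing placement with a remote edge. That is the messiness I mean: the placement has to be revisited as the graph grows, or some cross-server traffic has to be accepted.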