[erlang-questions] Twoorl: an open source Twitter clone

Wed Jun 4 10:18:15 CEST 2008

Well once you start denormalizing and partitioning/sharding a database
you really start losing the advantages of having one.

It'd probably be easy enough to have an append-only file(s) on disk
strategy per user to represent their queue (something like disk_log
that is easy to read backwards), partitioning that would be very easy.
You could then have a second file or database that lists the incoming
and outgoing connections so the process that represents that user
knows which queues to publish and subscribe to.

On Tue, Jun 3, 2008 at 9:45 PM, Yariv Sadan <yarivsadan@REDACTED> wrote:
> I considered using a reliable queuing mechanism such as RabbitMQ or
> Amazon SQS but I don't think it would make the architecture inherently
> more scalable (more reliable maybe, but not more scalable). I think a
> Twitter like solution can be designed to scale using just Yaws, MySQL,
> and Mnesia or memcache (and maybe Ejabberd if you add an XMPP
> gateway). RabbitMQ or SQS would provide *reliable* asynchronous
> background processing, but if you don't need 100% reliability (Twoorl
> isn't a banking application after all), you can just spawn Erlang
> processes from Yaws to do background tasks after a user posts a
> message. Also, using persistent queues doesn't make the need for a
> database go away. When you pull a tweet from a queue you have to put
> it somewhere so it can be shown on rendered pages, and a database is
> the most reasonable place to put it. The main problem in scaling
> Twitter/Twoorl is how you architect your database backend --
> partitioning, denormalization, replication, load balancing, caching,
> etc, will probably make or break your ability to scale.
>
> Yariv
>
> On Sun, Jun 1, 2008 at 1:41 AM, David Mitchell <monch1962@REDACTED> wrote:
>> This is a REALLY interesting discussion, but at this point it's
>> becoming obvious that I don't know enough about Twitter...
>>
>> Are you suggesting that Twoorl should be architected as follows:
>> - when they register, every user gets assigned their own RabbitMQ
>> incoming and outgoing queues
>> - user adds a message via Web/Yaws interface (I know, this could be
>> SMS or something else later...)
>> - message goes to that user's RabbitMQ incoming queue
>> - a backend reads messages from the user's incoming queue, looks up in
>> e.g. a Mnesia table to see who should be receiving messages from that
>> user and whether they're connected or not.  If "yes" to both, RabbitMQ
>> then forwards the message to each of those users' outgoing queues
>> - either the receiving users poll their outgoing queue for the
>> forwarded message, or a COMET-type Yaws app springs to life and
>> forwards the message to their browser (again, ignoring SMS)
>>
>> This seems like a reasonable approach; I'm just curious if that's what
>> you're suggesting, or whether you've got something else in mind.
>>
>> Great thread, and thanks Yariv for getting this discussion going with Twoorl
>>
>> Regards
>>
>> Dave M.
>>
>> 2008/6/1 Steve <steven.charles.davis@REDACTED>:
>>>
>>> On May 31, 5:04 pm, "Yariv Sadan" <yarivsa...@REDACTED> wrote:
>>>> ...but it's the only way you can scale this kind of service when N is
>>>> big.
>>>
>>> Hmmm, Yariv, aren't you still thinking about this in the way that Dave
>>> Smith pointed to as the heart of the issue? i.e.
>>> Dave said: "My understanding is that the reason they have such poor
>>> uptime is due to the fact that they modeled the problem as a web-app
>>> instead of a messaging system."
>>>
>>> I'm aware that you are likely a good way away from hitting any
>>> scalability problems, but some kind of tiering would seem to be
>>> appropriate if twoorl is to be "twitter done right". Yaws at the front
>>> end, definitely - but rather /RabbitMQ/ at the back end. I believe
>>> that you'd then have the flexibility to distribute/cluster as
>>> necessary to scale to the required level (whatever that may be).
>>>
>>> For sure, Twoorl is a great demo of what can be done with Erlang in an
>>> incredibly short time. I'm a relative noob to Erlang, and have learned
>>> a great deal from your blog/code/examples.
>>>
>>> Steve
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://www.erlang.org/mailman/listinfo/erlang-questions
>>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://www.erlang.org/mailman/listinfo/erlang-questions
>>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-questions
>