[erlang-questions] Amazon S3: Now with locking, transactions and caching
Joel Reymont
joelr1@REDACTED
Fri Jul 20 02:42:47 CEST 2007
Folks,
This is a status update on what I've been working on for the past few
weeks.
You are probably well aware that Amazon S3 provides unlimited
scalability but does not have locking and transactions. There's also
a delay between the time when data is written to S3 and when it
becomes available for reading.
I tried several approaches, but the best one turned out to be
hacking Mnesia internals to add s3_copies as a table type. I started
from scratch rather than building on Ulf Wiger's RDBMS, but I
doubt I could have done it without reading the RDBMS code and asking
lots of questions, all of which Uffe was kind enough to answer.
Hacking Mnesia turned out to be a veritable pain in the rear as I had
to touch most of the modules, including today's extensive modding
session with mnesia_loader.erl. I will also need to apply the changes
to any upcoming releases of Mnesia.
I would say it was worth it, though, as I can now:
- lock S3 buckets or "records" using {Bucket, Key}
- update several S3 records in a single transaction
- set up additional s3_copies replicas using mnesia:add_table_copy/3
- ensure that data is only written to S3 once
- have a large cluster of Yaws nodes use a small cluster of "master"
Mnesia nodes with s3_copies replicas, thus keeping replication and
transaction costs down.
I also coupled the virtual S3 table with a fixed-size cache that is
built on top of a regular Mnesia table. All writes go through the
cache, which ensures that hot data is available immediately. So long
as the cache API is used, any cache misses are automatically
redirected to S3.
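
To make the cache behaviour concrete, here is a rough sketch in Python
rather than Erlang. It is not the actual implementation: the class
name, the dict standing in for S3, the least-recently-used eviction
policy and the synchronous write to the backing store are all
assumptions made for illustration.

```python
from collections import OrderedDict


class WriteThroughCache:
    """Fixed-size cache in front of a slower store (a stand-in for S3)."""

    def __init__(self, store, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.store = store            # assumption: any dict-like object
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry first

    def _remember(self, key, value):
        # Insert or refresh the entry, then evict until within capacity.
        self.entries[key] = value
        self.entries.move_to_end(key)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # assumption: LRU eviction

    def write(self, key, value):
        # Every write goes through the cache, so hot data is readable
        # immediately even if the backing store is slow to catch up.
        self.store[key] = value
        self._remember(key, value)

    def read(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: fall back to the backing store (raises KeyError
        # if the key is absent there too) and cache the result.
        value = self.store[key]
        self._remember(key, value)
        return value
```

Keys here would be {Bucket, Key} pairs, e.g. the tuple
("photos", "cat.jpg") in Python. A caller that only uses write() and
read() never needs to know whether a value came from the cache or
from the backing store.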
Thanks, Joel
--
http://topdog.cc - EasyLanguage to C# compiler
http://wagerlabs.com - Blog