[erlang-questions] large objects in dets

Steve Jenson stevej@REDACTED
Fri May 11 09:42:27 CEST 2007

Here's a start:


Just an FYI:

Hadoop's DFS differs from GFS in at least one key way: Hadoop is
(at least as of the last time I checked) write-once-read-many (WORM),
whereas GFS files were designed to be appended to frequently. GFS also
handles small files poorly. Of course, GFS has all kinds of tuning
parameters, so you can get a GFS cluster to handle lots of different
scenarios, but you have to be prepared for the trade-offs. It sounds
like you aren't planning to append to your files either.

I agree with the suggestion of sticking files on disk and using mnesia
to keep track of where you put them. You can avoid overloading your
file system by using a multilevel directory structure. Of course, that
only matters if you plan to store hundreds of thousands of files. Most
modern Unix file systems can handle tens of thousands of files in a
single directory without trouble, but your sysadmins would rather you
didn't run any commands that are O(N) in the number of files, like the
pesky 'ls'. Thousands of files in a directory should present no
trouble.
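A minimal sketch of the multilevel-directory idea, assuming files are keyed by an opaque binary: take the first two bytes of a hash of the key and use them as two nesting levels, spreading files across up to 256 x 256 subdirectories so no single directory gets too large. The module and function names (`blobstore`, `path_for/2`) are hypothetical, not anything from Hadoop or mnesia; the path returned here is what you would record in an mnesia table alongside the key.

```erlang
%% Hypothetical sketch: spread stored files over a two-level
%% directory tree derived from a hash of the key.
-module(blobstore).
-export([path_for/2]).

%% path_for(BaseDir, Key) -> flat string path like
%% BaseDir/ab/cd/Key, where "ab" and "cd" are the hex of the
%% first two bytes of the MD5 of Key.
path_for(BaseDir, Key) when is_binary(Key) ->
    <<A, B, _/binary>> = crypto:hash(md5, Key),
    filename:join([BaseDir,
                   io_lib:format("~2.16.0b", [A]),
                   io_lib:format("~2.16.0b", [B]),
                   binary_to_list(Key)]).
```

With this layout, looking a file up is just recomputing the path from the key, and the mnesia record only needs to confirm the file exists and hold any metadata.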


On 5/10/07, Joe Armstrong <erlang@REDACTED> wrote:
> Any idea where the hadoop inter-machine protocol is *defined*?
> /Joe
> On 5/10/07, yerl@REDACTED <yerl@REDACTED> wrote:
> > Hey!
> > >(ZFS in solaris is nice). Other alternatives are mogileFS, nutch file system
> >
> > You mean "hadoop", nutch is a "search engine + crawler" ;-)
> >
> > cheers
> > Younès
> >
> >
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://www.erlang.org/mailman/listinfo/erlang-questions
> >
