[erlang-questions] Storing images

Garrett Smith g@REDACTED
Mon May 19 20:34:25 CEST 2014


On Sun, May 18, 2014 at 10:52 PM, Lloyd R. Prentice
<lloyd@REDACTED> wrote:
> Hello,
>
> Suppose I anticipate unknown thousands of records

But what do you have to deal with *now*?

> each containing meta-data associated with an image. Each image averages, say, 100KB. Retrieval is by key and queries on the meta data. Retrieval predominates over insert. Reliability and speed of access are high on the requirements list.
>
> I'm considering Riak. But...

You'll learn a lot about Riak, but not necessarily the problem that
you're trying to solve *now* :)

> - Would it be best practice to store each image in its associated record or would it be better to store the images separately in, say, directories or in a separate distributed file system such as leoFS?

It's fun to learn about file systems, but what do you need right *now* :)

I realize it's dreadfully boring, but consider possibly more direct
paths to getting things to work:

- Store everything, metadata + binary payload, as an Erlang tuple in
dets (you can judge if the 2GB limit is a problem - at ~100KB per
image this would give you ~20K entries)

- If the 2GB limit makes you squeamish, store the tuple as an Erlang
encoded binary in something like LevelDB (which is solidly supported
now thanks to Basho's work)

- Store the images in a directory with the metadata stored as
formatted Erlang terms (easy to write, easy to read) in text files
alongside the images (e.g. 0000001.jpg + 0000001.meta)
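
As a rough illustration of the third option, here is a minimal sketch,
assuming a flat directory and string keys. The module name image_store
and the functions store/4 and fetch/2 are made up for this example;
only the file, filelib, filename and io_lib calls are standard OTP.

```erlang
-module(image_store).
-export([store/4, fetch/2]).

%% Hypothetical layout: Dir/Key.jpg holds the image bytes and
%% Dir/Key.meta holds the metadata as one printed Erlang term.

store(Dir, Key, Image, Meta) when is_binary(Image) ->
    %% ensure_dir/1 creates the parents of the given path, so pass
    %% a dummy file name inside Dir to get Dir itself created.
    ok = filelib:ensure_dir(filename:join(Dir, "dummy")),
    ok = file:write_file(path(Dir, Key, ".jpg"), Image),
    %% "~p.~n" prints the term followed by a period, which is the
    %% format file:consult/1 expects when reading it back.
    ok = file:write_file(path(Dir, Key, ".meta"),
                         io_lib:format("~p.~n", [Meta])).

fetch(Dir, Key) ->
    {ok, Image} = file:read_file(path(Dir, Key, ".jpg")),
    {ok, [Meta]} = file:consult(path(Dir, Key, ".meta")),
    {Image, Meta}.

path(Dir, Key, Ext) ->
    filename:join(Dir, Key ++ Ext).
```

With that, image_store:store("images", "0000001", Bin, [{title, "cat"}])
writes the pair of files and image_store:fetch("images", "0000001")
returns {Bin, [{title, "cat"}]}. The .meta file stays readable and
editable by hand.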

You can add an index on top of any of these to improve performance, but
this is an optimization. I like to run full scans just to see how fast
my computer is. For "thousands" I wouldn't worry until you see a
problem.
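
For a sense of what a full scan might look like, here is a small
sketch, assuming metadata lives in *.meta files that each hold a single
Erlang term. The module name meta_scan and the function find/2 are
invented for illustration; the predicate is whatever query you need.

```erlang
-module(meta_scan).
-export([find/2]).

%% Return the keys (file names without ".meta") of all entries in
%% Dir whose metadata satisfies Pred. No index: every file is read.
find(Dir, Pred) when is_function(Pred, 1) ->
    Files = filelib:wildcard(filename:join(Dir, "*.meta")),
    [filename:basename(F, ".meta") || F <- Files, matches(F, Pred)].

%% Files that are missing, unparsable, or hold more than one term
%% are simply skipped. Pred must return true or false.
matches(File, Pred) ->
    case file:consult(File) of
        {ok, [Meta]} -> Pred(Meta);
        _ -> false
    end.
```

For example, with proplist metadata,
meta_scan:find("images", fun(M) -> proplists:get_value(width, M, 0) > 800 end)
returns the keys of all images wider than 800. Wrapping the call in
timer:tc/1 is an easy way to see how long the scan actually takes
before deciding whether an index is worth it.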

Consider this solution temporary until you need to change it -- who
knows, the direct approach just might live longer than you think. And
if you do need to change the system, you'll have a lot more data to
help in your next steps.

I'm not trying to second guess your requirements -- you know them
best. But I find it can be helpful to look closely at the stupider
options as well as the fancier ones.

Garrett


