[erlang-questions] kv store as a service
Chris Molozian
chris.molozian@REDACTED
Mon Apr 7 01:14:09 CEST 2014
Hi,
Based on your requirements:
- unlimited disk space
- each value is an SVG document (and its history)
- minimum cost
- minimum administration
I wonder if you might be interested in a service like Orchestrate (http://orchestrate.io/docs/). I’m an engineer at Orchestrate, and we’re big fans of Erlang, as most of us worked on Riak before building this service.
I think Orchestrate could be a good fit because:
- the pricing model is based on MOps (per million requests) with a free tier of 1MOp/month.
- we do not charge for disk usage.
- all data in the service is immutable, with “ref” history (you can retrieve previous versions of a KV object)
- there is no administration overhead; we act as your infrastructure team
Normally I'm wary of promoting a company (when working for them) on a community mailing list, but in this case I really do think we could be a good fit.
Hope this helps.
Kind Regards,
Chris
--
Chris Molozian
Software Engineer
On 5 April 2014 at 18:02:55, t x (txrev319@REDACTED) wrote:
I should also add, for this particular case, HDFS as a service would
work for me too -- all I really need is just a k/v store.
On Sat, Apr 5, 2014 at 10:02 AM, t x <txrev319@REDACTED> wrote:
> Gokhan: Thanks for pointing out what I need to clarify.
>
>
> What kind of data?
>
> I'm basically building a wiki. Each "value" is an svg-document + the
> history of the svg document.
>
>
> Why do you need cache?
>
> S3 bandwidth pricing comes out to $120 / TB.
> DO bandwidth pricing is $5 / TB.
> Thus, I'd prefer DO to read from S3, cache it on DO, then serve from
> DO. This saves bandwidth cost by a factor of 24. (sorry for not
> explaining this earlier)
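
As a quick check of the figures above, here is a minimal sketch of the arithmetic. The two per-TB prices are the ones quoted in this message. The function names and the cost model are hypothetical: the model assumes every user request is served from DO, only cache misses are fetched from S3, and inbound traffic to DO is free.

```python
S3_PER_TB = 120.0  # USD per TB read out of S3 (figure quoted above)
DO_PER_TB = 5.0    # USD per TB served from DO (figure quoted above)


def savings_factor(origin_per_tb: float, edge_per_tb: float) -> float:
    """Ratio of origin bandwidth price to edge bandwidth price."""
    return origin_per_tb / edge_per_tb


def monthly_cost(traffic_tb: float, hit_rate: float) -> float:
    """Bandwidth cost under the assumed model.

    All user traffic leaves through DO; only the cache misses
    (1 - hit_rate of the traffic) are additionally read from S3.
    """
    served_from_do = traffic_tb * DO_PER_TB
    fetched_from_s3 = traffic_tb * (1.0 - hit_rate) * S3_PER_TB
    return served_from_do + fetched_from_s3
```

Under this model the factor of 24 is the best case, reached only when the hit rate approaches 100%. Every miss still pays the S3 price on top of the DO price.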
>
>
> Why do you believe SSD will suffice for cache?
>
> The site is for a class -- so the notes are "weekly" -- thus, it's
> highly likely that the most accessed entries are the most recent
> entries (i.e. user requests are not uniformly random over the
> keys, but rather heavily weighted in favor of recent keys).
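
An access pattern weighted toward recent keys is the case a least-recently-used (LRU) eviction policy is designed for. The following is only a sketch of such a cache with a byte budget. The class name, method names, and the in-memory dictionary are illustrative assumptions. A real cache on a 20GB SSD would store values in files rather than in memory.

```python
from collections import OrderedDict


class LRUCache:
    """Size-bounded cache that evicts the least recently used entry first."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> bytes, oldest first

    def get(self, key):
        """Return the cached value, or None on a miss."""
        value = self._entries.get(key)
        if value is not None:
            self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value: bytes) -> None:
        """Insert a value, evicting old entries until the budget fits."""
        if len(value) > self.max_bytes:
            return  # larger than the whole budget: do not cache
        old = self._entries.pop(key, None)
        if old is not None:
            self.used_bytes -= len(old)
        self._entries[key] = value
        self.used_bytes += len(value)
        while self.used_bytes > self.max_bytes:
            _, evicted = self._entries.popitem(last=False)  # oldest entry
            self.used_bytes -= len(evicted)
```

With weekly notes, the entries for the current week keep being touched and stay in the cache, while older entries drift to the front of the eviction order.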
>
>
> What is your retrieval pattern?
>
> I don't have hard data yet -- I'm still building this.
>
>
>
> (preempting a possible future question): Why do you group S3 and Riak
> into the same thought?
>
> Eventual consistency doesn't really matter to me here. key = sha hash
> of content, thus, "updates = new entry", and I don't worry much about
> invalidating cache.
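
The content-addressed scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the poster's implementation. SHA-256 is assumed because the message only says "sha hash". `ContentStore` and `content_key` are hypothetical names. Plain dictionaries stand in for both the remote store (S3 or similar) and the local cache.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Derive the key from the content itself (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Write-once store with a read-through cache in front of it."""

    def __init__(self, backend: dict, cache: dict) -> None:
        self.backend = backend  # stand-in for the remote k/v store
        self.cache = cache      # stand-in for the local SSD cache

    def put(self, data: bytes) -> str:
        """Store a new value; an "update" is just another put with a new key."""
        key = content_key(data)
        self.backend.setdefault(key, data)  # existing keys are never overwritten
        return key

    def get(self, key: str) -> bytes:
        """Serve from the cache, falling back to the backend on a miss."""
        if key in self.cache:
            return self.cache[key]
        data = self.backend[key]
        if content_key(data) != key:
            raise ValueError("stored content does not match its key")
        self.cache[key] = data
        return data
```

Because a key can only ever map to one value, a cached entry never goes stale. This is why cache invalidation and eventual consistency are not a concern in this design.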
>
> I was thinking purely in terms of k/v store -- and the two most
> scalable stores I know of are S3 and Riak.
>
>
> Please let me know if my thinking appears sloppy anywhere else. (I
> have a decent theoretical CS background so I can do the logic/proof --
> but this is my first time building a distributed system -- so I may be
> asking the wrong questions / not aware of what I'm jumping into).
>
>
> Thanks!
>
>
>
> On Sat, Apr 5, 2014 at 9:02 AM, Gokhan Boranalp <kunthar@REDACTED> wrote:
>> Amazon S3 and Riak are different species and are not directly
>> comparable within the K/V world.
>> The question suggests that you are not aware of how these two are
>> typically used, and that you have not examined your problem domain
>> closely enough by looking at your data.
>>
>> Please tell us more about the types of data you will be using.
>> What kind of data would you really like to store?
>> Why do you need a cache?
>> Why do you believe SSDs could be sufficient for cache operations?
>> What is your data access pattern when retrieving data?
>>
>>
>> BR
>>
>> On Fri, Apr 4, 2014 at 3:10 PM, t x <txrev319@REDACTED> wrote:
>>> Hi,
>>>
>>>
>>> This is my current setup:
>>>
>>> * a bunch of $5/month digital ocean droplets [1]
>>>
>>> * these droplets have a 20GB SSD harddrive
>>>
>>> Now, I need to have a gigantic key-value store. I don't want to deal
>>> with the error condition of "error: you ran out of disk space"
>>>
>>> In my particular design, I only have "create new value". I don't
>>> have "update value". Thus, I don't have to worry about invalidating
>>> caches, and intend to use the 20GB SSD drives as a "cache" for the
>>> real key-value store.
>>>
>>>
>>>
>>> Now, my question is: what should I use for my key-value store?
>>>
>>> I want to optimize for:
>>>
>>> * minimum cost
>>> * minimum administration
>>>
>>> Currently, the best I have is Amazon S3. (I'd prefer to not set up my
>>> own Riak cluster + deal with replication + how many servers to run +
>>> ... ). I'm okay with the 99.99% (or whatever SLA Amazon S3 provides).
>>>
>>>
>>>
>>> Question: Is S3 the right approach as a giant K-V store for my
>>> Erlang nodes to hit, or should I be using something else?
>>>
>>> Thanks!
>>>
>>>
>>> [1] This is not an advertisement for DO. I do not have any DO equity.
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>>
>>
>> --
>> BR,
>> \|/ Kunthar