[erlang-questions] Using Key/Value store with shared storage

Mahesh Paolini-Subramanya mahesh@REDACTED
Mon Dec 10 22:29:56 CET 2012


> Now this is interesting to me.  High maintenance overheads would be
> offputting.  Was your SAN configuration particularly complex?  What
> sort of issues were you having?
Once you actually get your fibre/infiniband/10G/whatever adapters, you'll be surprised at how much lower your actual throughput is than what the specs say it should be.
Once you (and your vendor) are done tuning, reconfiguring, etc., you'll almost certainly end up sharding across different zones, and other sorts of fun stuff.
The end result is complexity - complex hardware, complex maintenance/monitoring issues, and complex intellectual grappling w/ the various edge-cases that you'll be dealing with.
SANs are remarkably good for many things, but for the 'firehose'-like throughput that you want/need, they're quite probably not feasible…
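To put rough numbers on it, here's a back-of-the-envelope sketch in the Erlang shell. It uses Max's figures from the thread below (a ~3 Gbps effective link and ~30 KB values), so treat it as illustrative arithmetic, not a measurement of any particular SAN:

    %% What does a ~3 Gbps pipe buy you for 30 KB writes?
    1> BytesPerSec = 3.0e9 / 8.       %% ~3 Gbps is roughly 375 MB/s
    375000000.0
    2> BytesPerSec / (30 * 1024).     %% divide by a 30 KB value size
    12207.03125

That ~12k writes/second is a theoretical ceiling; protocol overhead, seeks, and every node contending for the same array all eat into it, which is why the spec-sheet numbers rarely survive contact with production.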

cheers


Mahesh Paolini-Subramanya
That Tall Bald Indian Guy...
Google+  | Blog   | Twitter  | LinkedIn

On Dec 10, 2012, at 12:54 PM, Sean D <seand-erlang@REDACTED> wrote:

> Thanks for the comments.  I have included a few points inline.
> 
> Cheers,
> Sean
> 
> On Mon, Dec 10, 2012 at 10:21:32AM -0800, Mahesh Paolini-Subramanya wrote:
>>     This seems flawed to me. Do you want resiliency, or shared storage?
>> 
>>   Ooooh.  Troll! :-)
>> 
>>   Seriously though - as Max Lapshin points out, w/ the exception of doing
>>   some very interesting (read: complex, and potentially destabilizing)
>>   architecting w/ infiniband, multiple links and cascade setups, you are
>>   going to be seriously bottle-necked on your shared storage.
> 
> I'm wondering if the term "Shared Storage" has caused some confusion.  When
> I was talking about shared storage, I was talking about using a high-end
> SAN.
> 
> The reason for this is that I believe they are designed to deal with
> this type of scenario.  This is obviously dependent on the quality of the
> SAN though.  Budgetary constraints are likely to determine whether or not
> this will be worth my while.
> 
> Does anyone have any views on how much overhead there is in keeping nodes in
> sync in a Riak-type solution?  Does this affect I/O performance, or does it
> simply require extra processing power?
> 
>>   And, to Garrett's point, resiliency and shared storage are (kinda)
>>   orthogonal.  An appropriately spilled can of coke can wreak havoc on
>>   your shared storage, and BigCouch/Riak/Voldemort/... can be remarkably
>>   resilient.
> 
> Again, SANs are designed to be highly resilient.  I would hope that the
> appropriate spilling of a can of coke would need to be highly malicious in
> order to bring down a SAN.
> 
> I am not doubting that any of these technologies are resilient.  I would
> just rather avoid duplicating data.
> 
>>   A few years back, we migrated out of the "eggs in one basket" SAN
>>   approach to BigCouch.  Our SAN setup was increasingly starting to look
>>   like something that would give even Rube Goldberg nightmares, and the
>>   sheer amount of hackery associated with this was starting to keep me up
>>   at night.
>> 
>>   Anyhow, just my two bits...
> 
> Now this is interesting to me.  High maintenance overheads would be
> offputting.  Was your SAN configuration particularly complex?  What
> sort of issues were you having?
> 
>>   [1]Mahesh Paolini-Subramanya
>>   That Tall Bald Indian Guy...
>>   [2]Google+  | [3]Blog   | [4]Twitter  | [5]LinkedIn
>>   On Dec 10, 2012, at 9:18 AM, Garrett Smith <[6]g@REDACTED> wrote:
>> 
>>     Aye, as well, this is curious:
>> 
>>     we would like to make the solution more resilient by using shared
>>     storage
>> 
>>     On Mon, Dec 10, 2012 at 10:59 AM, Max Lapshin
>>     <[7]max.lapshin@REDACTED> wrote:
>> 
>>     You speak about many servers but one SAN.  What for?  Real throughput
>>     is limited to about 3 Gbps.  It means that you will be limited to
>>     writing no more than 10,000 values of 30 KB per second.  There will be
>>     no cheap way to scale past this limit if you use shared storage, but
>>     if you shard and throw away your RAID, you can scale.
>>     Maybe yo
>>     On Monday, December 10, 2012, Sean D wrote:
>> 
>>     Hi all,
>> 
>>     We are currently running an application for a customer that stores a
>>     large number of key/value pairs.  Performance is important for us as
>>     we need to maintain a write rate of at least 10,000 keys/second on one
>>     server.  After evaluating various key/value stores, we found Bitcask
>>     worked extremely well for us and we went with this.
>> 
>>     The solution currently has multiple servers working independently of
>>     each other, and we would like to make the solution more resilient by
>>     using shared storage.  I.e. if one of the servers goes down, the
>>     others can pick up the work load and add to/read from the same store.
>> 
>>     I am aware that Riak seems to be the standard solution for a resilient
>>     key-value store in the Erlang world.  However, from my initial
>>     investigations, this seems to work by duplicating the data between
>>     Riak nodes, and this is something I want to avoid, as the number of
>>     keys we are storing will be in the range of 100s of GB and I would
>>     prefer that the shared storage is used rather than data needing to be
>>     duplicated.  I am also concerned that the overhead of Riak may prove a
>>     bottle-neck, however this isn't something that I have tested.
>> 
>>     If anyone here has used a key/value store with a SAN or similar in
>>     this way, I'd be very keen to hear your experiences.
>> 
>>     Many thanks in advance,
>>     Sean
>> 
>> References
>> 
>>   1. http://www.gravatar.com/avatar/204a87f81a0d9764c1f3364f53e8facf.png
>>   2. https://plus.google.com/u/0/108074935470209044442/posts
>>   3. http://dieswaytoofast.blogspot.com/
>>   4. https://twitter.com/dieswaytoofast
>>   5. http://www.linkedin.com/in/dieswaytoofast
>>   6. mailto:g@REDACTED
>>   7. mailto:max.lapshin@REDACTED
> 
