[erlang-questions] Conceptual questions on key-value databases for RDBMs users

Edmond Begumisa ebegumisa@REDACTED
Sun Nov 7 19:15:05 CET 2010


Hello Rudolph,

Small comment. I think one has to be careful not to suggest that blobs in  
typical RDBMSs and values in typical kv-stores are similar. I've heard  
that comparison being made before and forgive me, but I just don't see it  
(if that is indeed what what you were suggesting.)

In SQL-RDBMSs, blob fields are normally opaque to the db. Storing blobs is  
the exception rather than the norm so querying against the content of  
these is usually poorly catered for (if catered for at all) by the query  
engine. The result is you normally can't use blobs to store anything  
complex because querying against them is a pain (i.e you'd only store  
things in a blob that you don't intend on using in the criteria of an SQL  
statement.)

Where as for (I think most) kv-stores, the 'values' are not opaque.  
Storying complex data in the values is the norm, so most kv-stores are  
aware of the structural content of those values (e.g. JSON in CouchDB,  
Erlang-Terms in Mnesia.) The result is that kv-store query-engines have a  
mechanism for querying against the complex values being stored (views for  
Couch, qlc for Mnesia).

I would think this renders the use-cases for blobs in RDBM's and values in  
kv-stores in-comparable. For couch's case, the equivalent to a blob would  
be more like an attachment. For Mnesia, a blob would be closer to a binary  
member of a tuple/record.

> The issue is that the information in the "blob" is not useful. Neither  
> stored in the RDBMS, nor in a plain KV-store.

That said, I don't really a agree with this statement. In a typical  
kv-store the 'value' is useful because the kv-store query engine would  
understand how to parse it. A typical SQL query-engine would not.

I'm no Mnesia user, but I would think that you could easily query against  
parts of an Erlang-term parse tree stored in the database. To my  
knowledge, the only way to get the equivalent in an SQL-database would be  
to either...

a) Spread the parse tree across different tables/records (hierarchical  
data modeling difficulty with tables non-withstanding)
b) Use an atypical SQL database like Postgres/Oracle that supports  
complex/user-defined types
c) Cheat -- write some sort of plugin that you can from a stored-procedure

- Edmond -

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/


On Sun, 07 Nov 2010 23:09:04 +1100, Rudolph van Graan  
<rvg@REDACTED> wrote:

> Hi Joe,
>
> I am not sure I agree with your statement that traditional databases  
> "suffer" from the fact that columns have simple types. You can easily  
> (within reason) make them store complex types. They are at the most  
> basic level also KV stores. The most basic abstraction for a "Table" is  
> a set of tuples stored in a certain way. There is no reason why you  
> cannot define a table with an arbitrary key and an arbitrary "blob" as  
> value column and then storing your parse tree in that blob field.  
> (Obviously you need to serialise/deserialise it - as you would need to  
> do for any storage abstraction).
>
> The issue is that the information in the "blob" is not useful. Neither  
> stored in the RDBMS, nor in a plain KV-store. You have to make it useful  
> before you can do something with it or extract information from it. I  
> guess that is why you said thus:
>
>> I'm not thinking in terms of joins and normalizing things - the thought  
>> process
>> is different - so far I haven't met any problems that don't map onto  
>> key-values
>> queries.
>
>
> I agree with you. If you throw distribution, map-reduce and lots of  
> indices into the equation you can solve most issues this way. Except for  
> one thing - invariance between multiple records and record types. There  
> are very few examples where it is useful to live without invariance -  
> searching being one example.
>
> Invariance is a first order concept in any RDBMS system and essential  
> for many types of systems. However, in my experience, you can achieve  
> application-level invariance in other ways. But it requires  
> substantially more work to approximate first order invariance using  
> KV-stores.  It is sometimes simpler to have your data properly  
> normalised I think.
>
> Rudolph van Graan
>
>
> On Nov 4, 2010, at 10:01 PM, Joe Armstrong wrote:
>
>> On Tue, Nov 2, 2010 at 9:14 PM, Silas Silva <silasdb@REDACTED> wrote:
>>> This is a message I sent to the nosql-discussion@REDACTED
>>> discussion group.  I thought it might be interesting to send it
>>> erlang-questions, so here we go...
>>>
>>>
>>> Hi all!
>>>
>>> I have used SQL RDBMSs for some time.  I've never used any very  
>>> advanced
>>> feature, but I know enough of it to make database applications.
>>>
>>> Nowadays, I decided it would be interesting to learn some NoSQL
>>> databases concepts.  So I decided to pick up some Erlang and Mnesia,  
>>> its
>>> native key-value database.  More than scalability itself, the most
>>> valuable feature for me is the possibility of replication and
>>> synchronization between nodes.
>>>
>>> But all pros have cons as companion.  The lack of a relationship model
>>> is difficult for who is used to RDBMSs.  So, my question is:
>>>
>>> * Is there any guide, tutorial, book, whatever, that tries to introduce
>>>  NoSQL databases to SQL users?
>>>
>>> * Key-value databases are surprising simple.  I know you solve
>>>  relationship by denormalizing data.  What data should be normalized?
>>>  What shouldn't?  How do you update denormalized data?
>>
>> I'm no database expert so don't quote me here ...
>>
>> As far as I am concerned traditional databases like SQL suffer
>> from the fact that the data stored in an individual column is incredible
>> simple - I can store an integer/string/... in a column but these are
>> incredibly simple data structures. I want to store and retrieve  
>> incredibly
>> complicated things - how do I store an XML parse tree in a single cell
>> of a database? - How do I store a database in a database ...
>>
>> In a decent K-V database the value of the key can be *anything* - an
>> entire database for example, a compiler, ... no worries
>>
>> Then when I analyse my problem I start thinking "I can store and  
>> retrieve any
>> complex object that keys do I need to solve my problem?"
>>
>> I'm not thinking in terms of joins and normalizing things - the thought  
>> process
>> is different - so far I haven't met any problems that don't map onto  
>> key-values
>> queries.
>>
>> It seems to my that SQL provides you with the ability do to complex
>> queries on simple things. K-V dbs can do simple queries on complex
>> things.
>>
>> /Joe
>>
>>
>>
>>>
>>> Sorry for such simple and general questions.  Things were simple up to
>>> the moment that I realized that it would be easily solved with a JOIN
>>> SQL statement.  :-)
>>>
>>> Thank you very much!
>>>
>>> --
>>> Silas Silva
>>>
>>> ________________________________________________________________
>>> erlang-questions (at) erlang.org mailing list.
>>> See http://www.erlang.org/faq.html
>>> To unsubscribe; mailto:erlang-questions-unsubscribe@REDACTED
>>>
>>>
>>
>> ________________________________________________________________
>> erlang-questions (at) erlang.org mailing list.
>> See http://www.erlang.org/faq.html
>> To unsubscribe; mailto:erlang-questions-unsubscribe@REDACTED
>>
>


-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/


More information about the erlang-questions mailing list