[erlang-questions] Conceptual questions on key-value databases for RDBMs users

Mon Nov 8 08:13:04 CET 2010

I should have probably used the term "key-value database" as the original  
poster did to better home in on the category. Specifically, that class of  
key-based NoSQL databases that you would list when starting a new project  
and looking for an alternative to RDBMSs which is what I assumed the  
thread was about.

Mnesia, CouchDB, MongoDB, Cassandra, etc, all have query engines that are  
aware of how to access the structures being stored. When querying, you can  
go far beyond just "give me the value with this key". This would be too  
tedious for the kind of datasets used in RDBMSs that you're looking to  
store in NoSQL databases instead. So typically, these kv-databases allow  
you to construct queries that can access the contents of the structures  
being stored, in-fact this is the norm. You then hand the query over to  
the query engine and say "fetch me this data with this criteria".

- Edmond -

On Mon, 08 Nov 2010 08:09:44 +1100, Morten Krogh <mk@REDACTED> wrote:

> Hi,
>
> Edmond, you talk about datastores that can peek inside the values  
> instead of regarding them as opaque binaries.
> But are those datastores really kv-stores?
> I would claim that, by definition, kv-stores can only do get, put,  
> delete and maybe some batch insertions and retrievals, but not more.
> Couchdb and mnesia should probably not be named kv-stores, but rather  
> something more descriptive of their capabilities.
>
>
> Also, no-sql is not the same as key-value.
> Part of the problem is that no-sql is too loose and broad a category.
>
> Morten.
>
>
> inside th eI agree with your points about data stores that can look  
> inside the values
>
>
> On 11/7/10 7:15 PM, Edmond Begumisa wrote:
>> Hello Rudolph,
>>
>> Small comment. I think one has to be careful not to suggest that blobs  
>> in typical RDBMSs and values in typical kv-stores are similar. I've  
>> heard that comparison being made before and forgive me, but I just  
>> don't see it (if that is indeed what what you were suggesting.)
>>
>> In SQL-RDBMSs, blob fields are normally opaque to the db. Storing blobs  
>> is the exception rather than the norm so querying against the content  
>> of these is usually poorly catered for (if catered for at all) by the  
>> query engine. The result is you normally can't use blobs to store  
>> anything complex because querying against them is a pain (i.e you'd  
>> only store things in a blob that you don't intend on using in the  
>> criteria of an SQL statement.)
>>
>> Where as for (I think most) kv-stores, the 'values' are not opaque.  
>> Storying complex data in the values is the norm, so most kv-stores are  
>> aware of the structural content of those values (e.g. JSON in CouchDB,  
>> Erlang-Terms in Mnesia.) The result is that kv-store query-engines have  
>> a mechanism for querying against the complex values being stored (views  
>> for Couch, qlc for Mnesia).
>>
>> I would think this renders the use-cases for blobs in RDBM's and values  
>> in kv-stores in-comparable. For couch's case, the equivalent to a blob  
>> would be more like an attachment. For Mnesia, a blob would be closer to  
>> a binary member of a tuple/record.
>>
>>> The issue is that the information in the "blob" is not useful. Neither  
>>> stored in the RDBMS, nor in a plain KV-store.
>>
>> That said, I don't really a agree with this statement. In a typical  
>> kv-store the 'value' is useful because the kv-store query engine would  
>> understand how to parse it. A typical SQL query-engine would not.
>>
>> I'm no Mnesia user, but I would think that you could easily query  
>> against parts of an Erlang-term parse tree stored in the database. To  
>> my knowledge, the only way to get the equivalent in an SQL-database  
>> would be to either...
>>
>> a) Spread the parse tree across different tables/records (hierarchical  
>> data modeling difficulty with tables non-withstanding)
>> b) Use an atypical SQL database like Postgres/Oracle that supports  
>> complex/user-defined types
>> c) Cheat -- write some sort of plugin that you can from a  
>> stored-procedure
>>
>> - Edmond -
>>
>> -- Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
>>
>>
>> On Sun, 07 Nov 2010 23:09:04 +1100, Rudolph van Graan  
>> <rvg@REDACTED> wrote:
>>
>>> Hi Joe,
>>>
>>> I am not sure I agree with your statement that traditional databases  
>>> "suffer" from the fact that columns have simple types. You can easily  
>>> (within reason) make them store complex types. They are at the most  
>>> basic level also KV stores. The most basic abstraction for a "Table"  
>>> is a set of tuples stored in a certain way. There is no reason why you  
>>> cannot define a table with an arbitrary key and an arbitrary "blob" as  
>>> value column and then storing your parse tree in that blob field.  
>>> (Obviously you need to serialise/deserialise it - as you would need to  
>>> do for any storage abstraction).
>>>
>>> The issue is that the information in the "blob" is not useful. Neither  
>>> stored in the RDBMS, nor in a plain KV-store. You have to make it  
>>> useful before you can do something with it or extract information from  
>>> it. I guess that is why you said thus:
>>>
>>>> I'm not thinking in terms of joins and normalizing things - the  
>>>> thought process
>>>> is different - so far I haven't met any problems that don't map onto  
>>>> key-values
>>>> queries.
>>>
>>>
>>> I agree with you. If you throw distribution, map-reduce and lots of  
>>> indices into the equation you can solve most issues this way. Except  
>>> for one thing - invariance between multiple records and record types.  
>>> There are very few examples where it is useful to live without  
>>> invariance - searching being one example.
>>>
>>> Invariance is a first order concept in any RDBMS system and essential  
>>> for many types of systems. However, in my experience, you can achieve  
>>> application-level invariance in other ways. But it requires  
>>> substantially more work to approximate first order invariance using  
>>> KV-stores.  It is sometimes simpler to have your data properly  
>>> normalised I think.
>>>
>>> Rudolph van Graan
>>>
>>>
>>> On Nov 4, 2010, at 10:01 PM, Joe Armstrong wrote:
>>>
>>>> On Tue, Nov 2, 2010 at 9:14 PM, Silas Silva <silasdb@REDACTED> wrote:
>>>>> This is a message I sent to the nosql-discussion@REDACTED
>>>>> discussion group.  I thought it might be interesting to send it
>>>>> erlang-questions, so here we go...
>>>>>
>>>>>
>>>>> Hi all!
>>>>>
>>>>> I have used SQL RDBMSs for some time.  I've never used any very  
>>>>> advanced
>>>>> feature, but I know enough of it to make database applications.
>>>>>
>>>>> Nowadays, I decided it would be interesting to learn some NoSQL
>>>>> databases concepts.  So I decided to pick up some Erlang and Mnesia,  
>>>>> its
>>>>> native key-value database.  More than scalability itself, the most
>>>>> valuable feature for me is the possibility of replication and
>>>>> synchronization between nodes.
>>>>>
>>>>> But all pros have cons as companion.  The lack of a relationship  
>>>>> model
>>>>> is difficult for who is used to RDBMSs.  So, my question is:
>>>>>
>>>>> * Is there any guide, tutorial, book, whatever, that tries to  
>>>>> introduce
>>>>>  NoSQL databases to SQL users?
>>>>>
>>>>> * Key-value databases are surprising simple.  I know you solve
>>>>>  relationship by denormalizing data.  What data should be normalized?
>>>>>  What shouldn't?  How do you update denormalized data?
>>>>
>>>> I'm no database expert so don't quote me here ...
>>>>
>>>> As far as I am concerned traditional databases like SQL suffer
>>>> from the fact that the data stored in an individual column is  
>>>> incredible
>>>> simple - I can store an integer/string/... in a column but these are
>>>> incredibly simple data structures. I want to store and retrieve  
>>>> incredibly
>>>> complicated things - how do I store an XML parse tree in a single cell
>>>> of a database? - How do I store a database in a database ...
>>>>
>>>> In a decent K-V database the value of the key can be *anything* - an
>>>> entire database for example, a compiler, ... no worries
>>>>
>>>> Then when I analyse my problem I start thinking "I can store and  
>>>> retrieve any
>>>> complex object that keys do I need to solve my problem?"
>>>>
>>>> I'm not thinking in terms of joins and normalizing things - the  
>>>> thought process
>>>> is different - so far I haven't met any problems that don't map onto  
>>>> key-values
>>>> queries.
>>>>
>>>> It seems to my that SQL provides you with the ability do to complex
>>>> queries on simple things. K-V dbs can do simple queries on complex
>>>> things.
>>>>
>>>> /Joe
>>>>
>>>>
>>>>
>>>>>
>>>>> Sorry for such simple and general questions.  Things were simple up  
>>>>> to
>>>>> the moment that I realized that it would be easily solved with a JOIN
>>>>> SQL statement.  :-)
>>>>>
>>>>> Thank you very much!
>>>>>
>>>>> -- Silas Silva
>>>>>
>>>>> ________________________________________________________________
>>>>> erlang-questions (at) erlang.org mailing list.
>>>>> See http://www.erlang.org/faq.html
>>>>> To unsubscribe; mailto:erlang-questions-unsubscribe@REDACTED
>>>>>
>>>>>
>>>>
>>>> ________________________________________________________________
>>>> erlang-questions (at) erlang.org mailing list.
>>>> See http://www.erlang.org/faq.html
>>>> To unsubscribe; mailto:erlang-questions-unsubscribe@REDACTED
>>>>
>>>
>>
>>
>
>
> ________________________________________________________________
> erlang-questions (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-questions-unsubscribe@REDACTED
>

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/