[erlang-questions] Conceptual questions on key-value databases for RDBMs users
Edmond Begumisa
ebegumisa@REDACTED
Mon Nov 8 14:49:35 CET 2010
Alceste,
Very true. Hierarchical data can be expressed even in simple SQL databases
that don't support the self-reference/common-table constructs you
describe. You can do interesting things with indices and almost-circular
SQL-statements. Some DBs even have a whole API devoted to this sort of
thing -- I used Microsoft's "Data Shaping" extension to SQL for quite some
time. Nevertheless, it never feels as natural and is never as easy as it
is in NoSQL/kv-databases.
Your historical note is interesting. It's almost as though we are going
backwards with NoSQL, like in a lot of other areas of development that are
proclaimed as "new" (e.g. I always looked at Ajax/Web 2.0 and thought -
"wait a minute, that's just rich/smart client development that's been
around since the 90s!")
Your observation of data creeping into a database that's not consistent
with the application model is one of the arguments used for RDBMs against
NoSQL. The idea that relationships and constraints can be used to
safeguard bad data. I guess this is one of the things that you
weight/prioritise when choosing either to go NoSQL or SQL.
My problem with relationships is that they tend to be used to implement
too many things and introduce excessive complexity in an attempt to guard
against bad data. From my experience, SQL is uses relationships/joins for
two different things...
1) Containment: Orders contain items --- create a relationship.
2) Actual Relationships: Orders have corresponding invoices -- create a
relationship.
The problem is containment is such a common thing that a good chunk of
your relationships aren't really relationships. This is where relationship
diagrams and SQL queries start to bloat. To the extent that I sometimes
question whether the benefits gained from guarding against bad data are
negated by the complexity introduced.
NoSQL/kv-dbs on the on-the-other-hand tend to give you real containment
when you want it (even deep hierarchies if you need them), and
relationships when you actually want relationships. The cost is less
sophistication in auto-guarding against bad data.
The "SQL-way" is: don't trust the programmer, make sure everything fits
the per-defined data model (even if it is inconvenient/doesn't map nicely
against the application's model.)
The "NoSQL-way" is: trust the programmer, make it easier to model data
that is convenient for/maps nicely against the application's model.
For me, the NoSQL-way makes more sense. The database should work for the
programmer, not the other way around.
- Edmond -
On Mon, 08 Nov 2010 21:06:19 +1100, Alceste Scalas <alceste@REDACTED>
wrote:
>
> On Mon, 08 Nov 2010 04:16:43 +1100, "Edmond Begumisa"
> <ebegumisa@REDACTED> wrote:
>> The main reason I've switched to NoSQL (CouchDB) after many years of
>> using SQL-RDBMSs is for modeling hierarchical data. RDBMSs force you
>> to do this with relationships across different tables even when the
>> data belongs to the same domain/object -- this feels unnatural and
>> results in the DB equivalent of spaghetti code with hours staring at
>> relationship diagrams.
>
> The relational model can express hierarchical relations. SQL cannot
> express the whole relational algebra, but it does allow to handle
> hierarchical relations. You can use self-referencing tables:
>
> CREATE TABLE hierarchy (
> id INTEGER PRIMARY KEY,
> parent_id INTEGER
> REFERENCES hierarchy(id)
> ON UPDATE CASCADE
> ON DELETE RESTRICT,
> ...
> );
>
> And then you could use common table expressions in queries:
>
> http://en.wikipedia.org/wiki/Common_table_expressions
>
> Funny note: the relational model was developed 40 years ago in
> order to provide a declarative and formal method for structuring
> data, and overcome the limitations of the DBMSs of the time ---
> which were, in fact, mostly hierarchical and document-oriented,
> and required a procedural interface in order to perform queries.
>
> As a result, the data in hierarchical DBMSs was structured without
> formal analysis, and only considering the immediate requirements of
> the application that was going to use it. When other applications
> needed to access the DB, or the application itself required updates
> which broke the mapping between DB objects and application objects,
> then *huge* pains would arise...
>
> Regards,
--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
More information about the erlang-questions
mailing list