[erlang-questions] What's the best way to select articles by page in a forum?

Mon Jan 12 11:21:02 CET 2009

I'm neither an SQL nor mnesia expert (yet...), but I'd ask a couple of
things first:

* Is the memory use of storing twenty items at once really actually a
big problem?  If they're just forum posts, wouldn't each record mostly
just be a smallish blob of text (unless you're routinely dealing with
large attachments).  I suppose if you were running hundreds or thousands
of such queries at once it could start to mount up, but I see so many
cases where people assume that something is going to be a performance or
memory issue without actually looking at any solid data that it's
usually a good question to ask.

* If you're using transactions in SQL, which I assume you are, then it
seems to me that the results must end up being stored in memory (or
somewhere) anyway, even if you only access them one at a time after
issuing the query.  Otherwise a subsequent transaction could alter or
delete them before you're done iterating them and where would you pull
the results from then?

And the second question kind of leads into what I suspect is the answer.
 I don't think there's a non-dirty way to do it (possibly not even a
dirty one - I can't immediately see one), and certainly not without
holding open a transaction lock for the whole period you need the results.

I guess if you don't care about transactional integrity, there may be
ways around it, but it seems like a web forum would care very much about
such things :)

Any mnesia experts should of course feel free to point out why I'm an
idiot :)

Cheers,

Bernard

Liu Yubao wrote:
> Hi,
> 
> I'm trying to implement a web forum with Erlang + Mnesia and meet a big difficult
> problem. This is the record to represent an article:
> 
>   -record(article, {id, parent_id, topic_id, author, title, content, timestamp}).
> 
> For a new article:
>     id = md5(author + title + timestamp).
>     parent_id = <<0>>.
>     topic_id = id.
> 
> For a reply:
>     parent_id = parent's id.
>     topic_id = parent's topic_id.
>     id = md5(parent_id + topic_id + author + title + timestamp).
> 
> I use md5 digests instead of monotone increasing integers as id because I hope
> there is no centralized counter that is bad for concurrency and distribution.
> 
> Now the problem is how I select [N, N + 20) topics efficiently and how I select
> [N, N + 20) articles in a topic efficiently.
> 
> In SQL I can do like this:
> 
> 	SELECT * FROM article WHERE id = topic_id ORDER BY timestamp
>         SELECT * FROM article WHERE topic_id = ? ORDER BY timestamp
> 
> and then read the result *one by one*, but it seems mnesia doesn't support this
> traverse, it only support two kinds of traverse:
> 
>    * traverse whole table
>      (not efficient)
>    * get the whole result set immediately and traverse records in it
>     (consume many memory)
> 
> Any suggestion is appreciated!
> 
> 
> Best regards,
> 
> Liu Yubao
> 
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-questions