sext and Tokyo Tyrant (Re: [erlang-questions] sortable serialization format)

Ulf Wiger <>
Sat Oct 31 12:17:15 CET 2009


Ulf Wiger wrote:
> 
> A while ago I started hacking on a serialization format that
> would have the same sorting properties as Erlang terms.
> 
> I didn't quite get it to work (negative floats was the most
> difficult part), but when I returned to it today, I realized
> that it was only a very small problem. Once fixed, all my
> QuickCheck suites passed.

I just had to try this on Tokyo Tyrant, so I wrote a small
prototype for connecting to TT and encoding a few requests,
using the sext library to encode terms before sending them.

I realized that a new function was needed in sext: prefix(Term),
which encodes a 'prefix' that matches similar terms and allows
some wildcarding. In the examples below, the encoding simply stops
at the first '_'. A prefix can't be decoded (at least, I didn't
write any code for doing so).

Some examples:

Eshell V5.7.1  (abort with ^G)
1> sext:encode({1,2,3}).
<<16,0,0,0,3,10,0,0,0,2,10,0,0,0,4,10,0,0,0,6>>
2> sext:prefix({1,'_','_'}).
<<16,0,0,0,3,10,0,0,0,2>>
3> sext:encode([1,2,3]).
<<17,10,0,0,0,2,10,0,0,0,4,10,0,0,0,6,0>>
4> sext:prefix([1,2|'_']).
<<17,10,0,0,0,2,10,0,0,0,4>>
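
To make the relationship in the transcript above explicit: each prefix is a leading byte sequence of the corresponding full encoding. The following is a minimal sketch that checks this on the literal binaries shown above. The module name prefix_sketch and the function is_prefix/2 are made up for illustration and are not part of sext.

```erlang
-module(prefix_sketch).
-export([is_prefix/2, demo/0]).

%% Hypothetical helper (not part of sext): true if Prefix is a
%% leading byte sequence of Bin.
is_prefix(Prefix, Bin) when is_binary(Prefix), is_binary(Bin) ->
    Size = byte_size(Prefix),
    case Bin of
        <<Prefix:Size/binary, _/binary>> -> true;
        _ -> false
    end.

%% The binaries below are copied from the shell session above.
demo() ->
    Tuple       = <<16,0,0,0,3,10,0,0,0,2,10,0,0,0,4,10,0,0,0,6>>,
    TuplePrefix = <<16,0,0,0,3,10,0,0,0,2>>,
    List        = <<17,10,0,0,0,2,10,0,0,0,4,10,0,0,0,6,0>>,
    ListPrefix  = <<17,10,0,0,0,2,10,0,0,0,4>>,
    {is_prefix(TuplePrefix, Tuple),   %% tuple prefix vs. tuple encoding
     is_prefix(ListPrefix, List),     %% list prefix vs. list encoding
     is_prefix(TuplePrefix, List)}.   %% tuple prefix vs. list encoding
```

Calling prefix_sketch:demo() should give {true,true,false}: each prefix matches its own encoding, while the tuple prefix does not match the list encoding because the first byte differs.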


Armed with this, I opened a B-tree table in Tokyo Tyrant,
and connected to it with my prototype module.

Eshell V5.7.1  (abort with ^G)
1> {ok,TT} = tt_proto:open(tt,[]).
{ok,<0.35.0>}
2> tt_proto:put(TT,{1,a}, 1).
ok
3> tt_proto:get(TT,{1,a}).
{ok,1}
4> tt_proto:put(TT,{1,b}, 2).
ok
5> tt_proto:put(TT,{1,c}, 3).
ok
6> tt_proto:put(TT,{2,a}, 4).
ok

Now, for some prefix matching:

7> tt_proto:keys(TT,{1,'_'}).
{ok,[{1,a},{1,b},{1,c}]}
8> tt_proto:keys(TT,{2,'_'}).
{ok,[{2,a}]}
9> timer:tc(tt_proto,keys,[TT,{1,'_'}]).
{279,{ok,[{1,a},{1,b},{1,c}]}}
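
For readers who want to see why a binary prefix is enough for this kind of query on an ordered store, here is a rough in-memory analogue using an ets ordered_set with binary keys. It is only a sketch of the general idea (seek to the prefix, then walk forward while keys still start with it) and says nothing about how ttserver or tt_proto do it internally. The module scan_sketch and the two-byte toy keys are invented for the example; they are stand-ins for sext-encoded keys.

```erlang
-module(scan_sketch).
-export([keys_with_prefix/2, demo/0]).

%% Return all binary keys in an ordered_set table that start with Prefix.
%% Binaries compare bytewise, so matching keys are contiguous.
keys_with_prefix(Tab, Prefix) when is_binary(Prefix) ->
    First = case ets:member(Tab, Prefix) of
                true  -> Prefix;
                %% For ordered_set, next/2 works even if the key is absent.
                false -> ets:next(Tab, Prefix)
            end,
    collect(Tab, Prefix, First, []).

collect(_Tab, _Prefix, '$end_of_table', Acc) ->
    lists:reverse(Acc);
collect(Tab, Prefix, Key, Acc) ->
    Size = byte_size(Prefix),
    case Key of
        <<Prefix:Size/binary, _/binary>> ->
            collect(Tab, Prefix, ets:next(Tab, Key), [Key | Acc]);
        _ ->
            %% First key past the prefix range: stop scanning.
            lists:reverse(Acc)
    end.

%% Toy keys mirroring {1,a}, {1,b}, {1,c}, {2,a} (not real sext output).
demo() ->
    Tab = ets:new(scan_demo, [ordered_set]),
    ets:insert(Tab, [{<<1,$a>>, 1}, {<<1,$b>>, 2},
                     {<<1,$c>>, 3}, {<<2,$a>>, 4}]),
    Result = {keys_with_prefix(Tab, <<1>>), keys_with_prefix(Tab, <<2>>)},
    ets:delete(Tab),
    Result.
```

The scan touches only the matching keys plus one more, which is the property that makes ordered storage attractive for this kind of lookup.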

I made no real effort to optimize anything (the 279 above is in
microseconds, as reported by timer:tc). The module starts
a gen_server which keeps a connection open to ttserver. It handles
only one query at a time, but looking at the TCP protocol, it's
hard to see how it could do otherwise, as there is no tagging of
requests. The round trip times are going to be fairly high for simple
requests (compared to dets and mnesia on small data sets), but the
main benefit of using TT in the first place ought to be either that
the data set is uncomfortably large for mnesia and dets, or that one
wants ordered_set semantics on disk-based storage.
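
The one-query-at-a-time behaviour falls out of the gen_server model: calls are handled sequentially, so each reply on the connection can only belong to the request just sent. Below is a stripped-down sketch of that shape. It is not the actual tt_proto code; the module name conn_sketch is hypothetical, and the socket round trip is replaced by a fun passed in at start so that the example stays self-contained.

```erlang
-module(conn_sketch).
-behaviour(gen_server).
-export([start_link/1, request/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% RoundTrip stands in for "send request on the socket, wait for reply".
%% It is an assumption of this sketch, not something tt_proto exposes.
start_link(RoundTrip) when is_function(RoundTrip, 1) ->
    gen_server:start_link(?MODULE, RoundTrip, []).

%% Callers block here until the server has finished their request.
request(Pid, Req) ->
    gen_server:call(Pid, {request, Req}).

init(RoundTrip) ->
    {ok, RoundTrip}.

%% The server does not look at the next call until this one returns,
%% so requests and replies never interleave on the connection.
handle_call({request, Req}, _From, RoundTrip) ->
    {reply, RoundTrip(Req), RoundTrip}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

A protocol with request tags would allow several outstanding requests per connection; without tags, strict request/reply ordering like this seems to be the simplest safe choice.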

I put the tt_proto module in sext/examples/.
There is some edoc for it too.

http://svn.ulf.wiger.net/sext/trunk/sext/doc/index.html

BR,
Ulf W

