Billion-triple store
Andrae Muys
andrae@REDACTED
Tue Apr 25 11:01:33 CEST 2006
On 24/04/2006, at 9:53 PM, Leif Johansson wrote:
> Joel Reymont wrote:
>> Folks,
>>
>> How would I store a billion triples with Erlang?
>>
>> I don't necessarily need the full power of RDF as storing triples in the
>> form of {"Joel", has_a, daughter} would suffice. I would not mind
>> complying with RDF of course but it seems that would be an extra burden
>> due to the necessity of storing everything as strings, the need to
>> implement tries for those and the way Erlang stores strings.
>>
>> I'm not sure how to go about storing a billion of such triples in
>> Mnesia. I suppose I would need to use a 64-bit machine and a
>> disc_only_copies table.
>>
>> Any suggestions?
>
> I'd also like an answer to that question. I did some experiments but
> don't understand how to get Mnesia to play nice. I assume you have
> looked at the way triplestores are typically built on an RDBMS? Some
> of the schemes used in things like 3store, Sesame, Kowari (?) might
> be translatable...
>
> I am interested in working on this.
Well, as the lead maintainer of Kowari, I would be very happy to
discuss any requirements you might have, and see if we can't help you.

Currently the largest scalability test I am aware of for Kowari was
500 million triples, and those results indicated that we hadn't reached
our limit yet. One of the store layer's designers did some calculations
that indicate we should be able to scale to 1-2 billion without
difficulty. However, as one of the primary developers of the query
layer, I am aware of some bottlenecks that are likely to interfere
with any queries requiring extremely large intermediate results (~1e6
tuples).

At the same time, there are plans to address these issues, and to
break the scalability bottlenecks that are preventing us from reaching
1e10 and 1e11 at the moment. These include promising prototypes of a
new store design to improve locality and throughput, which should
let us scale comfortably to 1e10.

As far as interfacing with Erlang is concerned, we currently support
RMI and SOAP, as well as in-process Java function calls. I am currently
working on XML-RPC support, and I am aware of plans to introduce a
REST interface as well.

Please let me know if there is anything I can do to help.

Andrae
--
Andrae Muys
andrae@REDACTED
Principal Kowari Consultant
Netymon Pty Ltd