[erlang-questions] Needed: Great big ordered int set.
Alex Arnon
alex.arnon@REDACTED
Mon Jul 8 20:14:26 CEST 2013
I'll probably end up doing just that, but was hoping I could resolve the thing in-process.
On 8 Jul 2013, at 20:36, Sergej Jurecko <sergej.jurecko@REDACTED> wrote:
> Why not just use mongodb, mysql or postgresql?
>
>
> Sergej
>
> On Jul 8, 2013, at 7:29 PM, Alex Arnon wrote:
>
>> - A single set of integers - 500M of them.
>> - This is a throwaway piece of data - once I've added all the values and iterated over them a couple of times, it is of no further use.
>> - Mutation (addition of an integer) speed is not very critical; however, due to the size of the dataset, it should be "reasonable", i.e. less than a millisecond per insertion on average.
>>
>>
>> On Mon, Jul 8, 2013 at 8:22 PM, Sergej Jurecko <sergej.jurecko@REDACTED> wrote:
>>> The data structure is a sorted list of integers. Is that 500M figure for a single list of integers, or is it the sum over all lists of integers?
>>> What are the reliability requirements? Do you need redundancy and/or backups? It is a very different problem if a single-server solution is enough than if it requires a network of computers.
>>>
>>>
>>> Sergej
>>>
>>> On Jul 8, 2013, at 7:11 PM, Alex Arnon wrote:
>>>
>>> > Hi All,
>>> >
>>> > I need to implement a very large set of data, with the following requirements:
>>> > - It will be populated EXCLUSIVELY by 64-bit integers.
>>> > - The only operations will be:
>>> >   - add element,
>>> >   - get number of elements, and
>>> >   - fold/foreach over the SORTED dataset.
>>> > - The invocation order will be strictly:
>>> >   - create data structure,
>>> >   - add elements sequentially,
>>> >   - run one or more iteration operations,
>>> >   - discard data structure.
>>> > - The size of the dataset MUST scale to 500M elements; preferably, billions should be possible too.
>>> > - The data does not have to reside in memory; however, 32 to 64 GB of RAM may be allocated. (Of course, this will be used by the OS buffer cache if a file-based solution is chosen.)
>>> >
>>> > In summary: Performance is not a must, but volume and the ability to iterate over the ordered values are.
>>> >
>>> > Thanks in advance!!!
>>> >
>
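The requirements quoted above (add sequentially, count, fold over the sorted values, then discard) resemble an external merge sort, so one possible in-process approach can be sketched as follows. This is only an illustration under stated assumptions, not something proposed in the thread: it is written in Python rather than Erlang, the names SortedIntSet and run_size are hypothetical, duplicates are assumed to collapse because the thread asks for a "set", and nothing here has been measured against the 500M-element target.

```python
import heapq
import os
import tempfile
from array import array

ITEM_SIZE = array("q").itemsize  # bytes per signed 64-bit integer


class SortedIntSet:
    """Add-then-iterate set of signed 64-bit ints, spilled to sorted runs on disk."""

    def __init__(self, run_size=1_000_000):
        self._run_size = run_size      # ints buffered in memory per run (hypothetical knob)
        self._buf = array("q")         # compact in-memory buffer
        self._dir = tempfile.TemporaryDirectory()
        self._runs = []                # paths of sorted run files
        self._count = None             # cached number of distinct elements

    def add(self, value):
        self._buf.append(value)
        self._count = None
        if len(self._buf) >= self._run_size:
            self._spill()

    def _spill(self):
        """Sort the buffer and write it out as one sorted run file."""
        if not self._buf:
            return
        path = os.path.join(self._dir.name, f"run{len(self._runs)}.bin")
        with open(path, "wb") as f:
            array("q", sorted(self._buf)).tofile(f)
        self._runs.append(path)
        self._buf = array("q")

    @staticmethod
    def _read_run(path, chunk_items=65536):
        """Stream one run file back in fixed-size chunks."""
        with open(path, "rb") as f:
            while True:
                data = f.read(chunk_items * ITEM_SIZE)
                if not data:
                    break
                chunk = array("q")
                chunk.frombytes(data)
                yield from chunk

    def __iter__(self):
        """Yield distinct values in ascending order via a k-way merge of the runs."""
        self._spill()
        previous = None
        for value in heapq.merge(*(self._read_run(p) for p in self._runs)):
            if value != previous:
                yield value
                previous = value

    def __len__(self):
        # Counting distinct elements needs one full merge pass; the result is cached.
        if self._count is None:
            self._count = sum(1 for _ in self)
        return self._count

    def close(self):
        """Discard the structure and its temporary files."""
        self._dir.cleanup()
```

A small usage example: create the set, call add for each value, iterate with an ordinary for loop, and call close when done. The merge keeps one file open per run, so a larger run_size (fewer, bigger runs) may be preferable when the dataset is very large.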