[erlang-questions] Speed of CSV parsing: how to read 1M of lines in 1 second
Dmitry Kolesnikov
dmkolesnikov@REDACTED
Mon Mar 26 22:25:13 CEST 2012
Oh, my bad... I had completely forgotten about these aspects of the VM.
I boosted performance: Max's original file is now parsed at 2 us per line (down from 3.15 us), and a full ETL cycle (see my previous mail) takes 7.8 us per line (down from 8.39 us).
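To put those per-line figures in terms of the thread subject (1M lines), here is a rough back-of-the-envelope sketch. It assumes the cost scales linearly with the number of lines, which may not hold exactly; the names total_seconds and report are made up for illustration, and only the four timings come from the numbers above.

```python
LINES = 1_000_000  # line count taken from the thread subject

# Per-line timings in microseconds as quoted in this mail: (before, after).
timings_us = {
    "parse": (3.15, 2.0),
    "full ETL": (8.39, 7.8),
}

def total_seconds(us_per_line, lines=LINES):
    """Total time in seconds for `lines` lines, assuming linear scaling."""
    return us_per_line * lines / 1_000_000

def report(timings=timings_us):
    """Return (name, seconds before, seconds after, speedup ratio) per stage."""
    rows = []
    for name, (before, after) in timings.items():
        rows.append((name, total_seconds(before), total_seconds(after), before / after))
    return rows

for name, before_s, after_s, speedup in report():
    print(f"{name}: {before_s:.2f} s -> {after_s:.2f} s per {LINES} lines ({speedup:.2f}x)")
```

Under that linear assumption, 2 us per line works out to about 2 seconds of parsing per million lines, so this is still above the 1 second target in the subject.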
And a very good hint on Boyer-Moore searching...
- dmitry
On Mar 26, 2012, at 11:05 PM, Tim Watson wrote:
> Max, have you seen
> http://blogtrader.net/blog/tim_bray_s_erlang_exercise2. This states
> "0.93 sec on 1 million lines file on my 4-core linux box" which sounds
> pretty impressive and is based on pure Erlang (with some ets thrown
> into the mix by the looks of things). Might be worth looking at
> whether this can potentially out-perform the NIF!
>
> On 26 March 2012 12:40, Max Lapshin <max.lapshin@REDACTED> wrote:
>>
>>
>>>
>>> And what do these numbers look like? Do they repeat? Are they short?
>>
>> Just as in the example CSV. It is trading data.
>>
>>
>>> Or are they high-precision and varying wildly in order of magnitude,
>>> and widely distributed statistically?
>>
>>
>> They are very close to each other and vary by no more than a few percent.
>> Do you think it is a good place for optimization?
>>
>>
>> In fact, I have achieved good enough results: less than a second, and thanks
>> to the whole community for it.
>>
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>>