[erlang-questions] Speed of CSV parsing: how to read 1M of lines in 1 second
james
james@REDACTED
Mon Mar 26 00:48:27 CEST 2012
> mmap is the fastest way to read lines if you don't much care about
> portability.
While I think mmap is useful (and might see it as a way to avoid a split
being sequential, since you only really need to divide the file roughly and
can probe for EoL), I think it's worth qualifying your statement.
mmap is NOT necessarily the fastest way to read lines.
The issue is whether the operating system will perform read-ahead in its
disk system and, if not, how many times you fault and wait for real disk
I/O. So it's rather important to know whether the file is already in the
operating system's VM cache or is actually on a disk drive.
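
For what it's worth, the rough split with EoL probing mentioned above could
look something like the sketch below. It is only an illustration: the names
chunk_bounds and line_aligned_chunks are made up here, it assumes "\n" line
endings and a non-empty file, and it says nothing about whether the pages
are already cached.

```python
import mmap


def chunk_bounds(buf, n_chunks):
    """Return up to n_chunks (start, end) byte ranges, each ending on a line boundary.

    buf is any object with len() and find(), e.g. bytes or an mmap.
    """
    size = len(buf)
    bounds = []
    start = 0
    for i in range(1, n_chunks + 1):
        if start >= size:
            break  # earlier chunks already swallowed the rest of the data
        if i == n_chunks:
            end = size
        else:
            # Rough cut point, then probe forward for the next end of line.
            guess = max(start, size * i // n_chunks)
            nl = buf.find(b"\n", guess)
            end = size if nl == -1 else nl + 1
        bounds.append((start, end))
        start = end
    return bounds


def line_aligned_chunks(path, n_chunks):
    """Map the file read-only and compute line-aligned chunk ranges."""
    # Note: mmap refuses to map an empty file (raises ValueError).
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        return chunk_bounds(m, n_chunks)
```

Each range can then be handed to a separate worker, so only the probe for
the newline touches data outside a worker's own chunk.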
As a first cut it would be handy to know how fast the OP's hardware can
do atof, and how that compares with the number of numeric fields in the file.
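
A rough way to get that kind of number is a micro-benchmark along these
lines. This is a sketch only: it uses Python's float() as a stand-in for
atof, so the absolute figure will not match C or Erlang, and parse_rate and
the sample data are invented for the illustration.

```python
import time


def parse_rate(fields, repeats=5):
    """Return the best observed text-to-float conversions per second."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        for s in fields:
            float(s)  # stand-in for atof
        best = min(best, time.perf_counter() - t0)
    return len(fields) / best if best > 0 else float("inf")


if __name__ == "__main__":
    # Synthetic fields; real data from the file would be a better sample.
    sample = ["%.6f" % (i * 0.001) for i in range(100_000)]
    print("conversions per second: %.0f" % parse_rate(sample))
```

Dividing the number of numeric fields in the file by that rate gives a
lower bound on the parse time, independent of how the bytes are read.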