Parsing big files

Robert Virding rv@REDACTED
Wed Dec 6 10:49:47 CET 2000


"James Hague" <jamesh@REDACTED> writes:
>Ulf's example is the way to go--processing a line at a time--but I thought
>I'd mention that with R7 it can be memory efficient to load and parse text
>files as *binaries*, not as text.  This prevents the 8x blow-up you get when
>a file is turned into raw text.  You can deal with much larger files this
>way.

That depends on whether you need to deal with whole files or can 
process a chunk at a time.  With Ulf's method you only take in small 
pieces at a time, so there is no real blow-up.
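
A minimal sketch of the piece-at-a-time approach, not Ulf's actual
code.  The module name line_fold and the function fold_lines/3 are
made up for illustration, and file:read_line/1 is assumed to be
available, which is only true of OTP releases later than the R7
discussed here.

```erlang
-module(line_fold).
-export([fold_lines/3]).

%% Hypothetical helper: fold Fun over the lines of the file Path.
%% Only the current line and the accumulator are live at any moment,
%% so memory use does not grow with the size of the file.
fold_lines(Fun, Acc0, Path) ->
    %% read_ahead buffers reads so read_line is efficient in raw mode;
    %% binary makes each line arrive as a binary rather than a list.
    {ok, Fd} = file:open(Path, [read, raw, binary, read_ahead]),
    try
        loop(Fd, Fun, Acc0)
    after
        file:close(Fd)
    end.

loop(Fd, Fun, Acc) ->
    case file:read_line(Fd) of
        {ok, Line}      -> loop(Fd, Fun, Fun(Line, Acc));
        eof             -> Acc;
        {error, Reason} -> erlang:error({read_failed, Reason})
    end.
```

For example, line_fold:fold_lines(fun(_Line, N) -> N + 1 end, 0, Path)
counts the lines in a file without ever holding more than one of them.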

The original message noted that reading in the whole file as a binary 
caused the runtime system to run out of memory.  Anyway, if you are 
going to parse the contents you may have to convert them into lines, or 
whatever, which are lists, and then you have not really won that much.

So far the built-in scanning modules cannot handle binaries.
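
To illustrate what that means in practice, here is a small sketch
assuming erl_scan is one of the scanning modules meant.  The module
name scan_bin and the function scan/1 are hypothetical.
erl_scan:string/1 takes a string, that is a list of characters, so the
binary has to be converted first.

```erlang
-module(scan_bin).
-export([scan/1]).

%% Hypothetical wrapper: tokenise Erlang source held in a binary.
%% binary_to_list/1 builds the full list of characters, which is
%% exactly where the memory saved by using a binary is spent again.
scan(Bin) when is_binary(Bin) ->
    erl_scan:string(binary_to_list(Bin)).
```

Converting one line or chunk at a time, rather than the whole file,
keeps that intermediate list small.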

	Robert




