[erlang-questions] External sorting for large files in Erlang

Zabrane Mickael zabrane3@REDACTED
Wed Aug 1 18:47:50 CEST 2012


Hey Joe,

On Aug 1, 2012, at 4:12 PM, Joe Armstrong wrote:

> On Tue, Jul 31, 2012 at 11:32 PM, Zabrane Mickael <zabrane3@REDACTED> wrote:
>> Hi,
>> 
>> I'm looking for something similar to this, but in Erlang:
>> http://code.google.com/p/externalsortinginjava/
>> 
>> I found an old post suggesting file_sorter:
>> http://www.erlang.org/doc/man/file_sorter.html
>> But file_sorter seems to only work on binary files.
> 
> This is one of my favorite modules - it is very fast.
> 
> file_sorter sorts binary encoded terms.
> Each entry is a 4 byte length header followed by term_to_binary(Term)
> 
> Here's an example of how to encode some terms, write them to a file,
> sort the file, and read them back.

Thanks for sharing this code. Very useful.
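
For reference, the layout described above (a 4-byte length header followed by term_to_binary(Term)) can be written and read back roughly as follows. This is only a sketch of that layout, not the code from Joe's mail. The module name term_sort and the file names are made up for the example, and it assumes file_sorter's default 4-byte header and binary_term format.

```erlang
-module(term_sort).
-export([demo/0]).

%% Hypothetical demo: encode a few terms, sort the file on disk with
%% file_sorter, and decode the result again.
demo() ->
    In = "terms.bin",          % illustrative file names
    Out = "terms.sorted",
    Terms = [{3, c}, {1, a}, {2, b}],
    ok = file:write_file(In, [frame(T) || T <- Terms]),
    ok = file_sorter:sort([In], Out),
    {ok, Bin} = file:read_file(Out),
    unframe(Bin).

%% One entry = 32-bit big-endian size, then the external term format.
frame(Term) ->
    B = term_to_binary(Term),
    [<<(byte_size(B)):32>>, B].

%% Walk the sorted file and decode each framed entry.
unframe(<<Size:32, B:Size/binary, Rest/binary>>) ->
    [binary_to_term(B) | unframe(Rest)];
unframe(<<>>) ->
    [].
```

Reading the whole output with file:read_file/1 is only reasonable for a small demo; a really large file would have to be decoded in chunks.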

> It happily sorts extremely large files .... well worth using

It seems to work fine, but it is not very flexible yet.

Can we imagine a module on top of file_sorter that mimics the Unix sort command
and works on plain text files (not binary ones)?

In the end, I want to be able to sort any disk file, as with Unix sort.
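
One possible shape for such a module, as a rough sketch: file_sorter accepts an input fun and an output fun in place of file names, and with {format, binary} the objects are compared as plain binaries. The module name text_sort, the function sort_lines/2 and the batch size of 1000 lines are assumptions made for the example. It sorts lines in byte order only, with none of the key or locale options of Unix sort.

```erlang
-module(text_sort).
-export([sort_lines/2]).

%% Hypothetical helper: sort the lines of InFile into OutFile (byte order).
sort_lines(InFile, OutFile) ->
    {ok, In} = file:open(InFile, [read, raw, binary, {read_ahead, 65536}]),
    {ok, Out} = file:open(OutFile, [write, raw, binary]),
    try
        file_sorter:sort(in_fun(In), out_fun(Out), [{format, binary}])
    after
        file:close(In),
        file:close(Out)
    end.

%% Input fun: hands file_sorter batches of lines as binaries.
in_fun(In) ->
    fun(close) -> ok;
       (read) ->
            case read_lines(In, 1000, []) of   % 1000 is an arbitrary batch size
                [] -> end_of_input;
                Lines -> {Lines, in_fun(In)}
            end
    end.

%% Order inside a batch does not matter, file_sorter sorts it anyway.
read_lines(_In, 0, Acc) ->
    Acc;
read_lines(In, N, Acc) ->
    case file:read_line(In) of
        {ok, Line} -> read_lines(In, N - 1, [chomp(Line) | Acc]);
        eof -> Acc;
        {error, Reason} -> erlang:error({read_failed, Reason})
    end.

%% Drop the trailing newline, if any (the last line may lack one).
chomp(Line) ->
    case binary:last(Line) of
        $\n -> binary:part(Line, 0, byte_size(Line) - 1);
        _ -> Line
    end.

%% Output fun: receives sorted batches and writes one line per binary.
out_fun(Out) ->
    fun(close) -> ok;
       (Lines) ->
            ok = file:write(Out, [[L, $\n] || L <- Lines]),
            out_fun(Out)
    end.
```

Something like text_sort:sort_lines("in.txt", "out.txt") would then return ok on success. file_sorter spills to temporary files on its own, so the input does not have to fit in memory.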

>> In my case, I need something more flexible.
>> What about controlling the Unix sort command from Erlang?
> 
>   os:cmd("sort <in >out").


Yep. That's what I had in mind (or using a port).
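
A minimal sketch of the port variant, which reports the exit status (os:cmd/1 only returns the command's output). The module name unix_sort and the function sort_file/2 are hypothetical. It assumes a POSIX sort on the PATH, and sets LC_ALL=C so that lines are compared byte by byte.

```erlang
-module(unix_sort).
-export([sort_file/2]).

%% Hypothetical wrapper: run the external `sort` on InFile, writing OutFile.
sort_file(InFile, OutFile) ->
    case os:find_executable("sort") of
        false ->
            {error, sort_not_found};
        Sort ->
            Port = open_port({spawn_executable, Sort},
                             [{args, ["-o", OutFile, InFile]},
                              {env, [{"LC_ALL", "C"}]},   % byte-order comparison
                              exit_status, stderr_to_stdout, binary]),
            wait(Port, [])
    end.

%% Collect anything sort prints until the port reports its exit status.
wait(Port, Acc) ->
    receive
        {Port, {data, Data}} ->
            wait(Port, [Data | Acc]);
        {Port, {exit_status, 0}} ->
            ok;
        {Port, {exit_status, Status}} ->
            {error, {Status, iolist_to_binary(lists:reverse(Acc))}}
    end.
```

Passing the file names through args avoids shell quoting problems, which a string built for os:cmd/1 would have to handle itself.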

Regards,
Zabrane
