GZ-Sort

A utility for sorting really big files.


The memory use patterns are also dramatically different. GNU sort holds as much memory as it can get for the entire run. Gz-sort starts out the same way, but releases all of that memory back to the OS once the first pass is complete. While sorting the 425GB dataset, it used all of the RAM for the first hour and then dropped to 10MB for the remaining eight hours. There is also zero IPC between the threads, which saves further resources. The disk access patterns are all linear reads and writes, averaging 25MB/sec of IO. (The SSD had a random-write performance of 300MB/sec.) After the initial RAM-heavy presort, a niced gz-sort could process hundreds of gigabytes of data without you even noticing. Gz-sort scales smoothly and predictably, and it is very good at not hogging the computer.
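To make that memory profile concrete, here is a minimal Python sketch of a generic two-phase external sort over gzip files. It is not gz-sort's actual code: the function names (`presort_runs`, `merge_runs`, `sort_gz`), the `max_lines` limit, and the single k-way merge are all assumptions made for illustration. The first phase is the only one that holds a large buffer. The merge phase keeps roughly one line per run in memory and reads and writes each file sequentially.

```python
import gzip
import heapq
import itertools
import os
import tempfile


def presort_runs(src_path, tmp_dir, max_lines):
    """Phase 1 (RAM-heavy): sort chunks in memory, write each as a gzip run."""
    run_paths = []
    with gzip.open(src_path, "rb") as src:
        while True:
            # Hold at most max_lines lines in memory at once.
            chunk = list(itertools.islice(src, max_lines))
            if not chunk:
                break
            # Make sure the final line of the file is newline-terminated.
            chunk = [ln if ln.endswith(b"\n") else ln + b"\n" for ln in chunk]
            chunk.sort()  # plain byte-order comparison
            run_path = os.path.join(tmp_dir, f"run{len(run_paths)}.gz")
            with gzip.open(run_path, "wb") as out:
                out.writelines(chunk)
            run_paths.append(run_path)
    return run_paths


def merge_runs(run_paths, dst_path):
    """Phase 2 (low memory): stream-merge the sorted runs with linear IO."""
    files = [gzip.open(p, "rb") for p in run_paths]
    try:
        with gzip.open(dst_path, "wb") as out:
            # heapq.merge is lazy: it buffers only one line per input run.
            out.writelines(heapq.merge(*files))
    finally:
        for f in files:
            f.close()


def sort_gz(src_path, dst_path, max_lines=100_000):
    """Sort the lines of a gzip file into a new gzip file (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        runs = presort_runs(src_path, tmp_dir, max_lines)
        merge_runs(runs, dst_path)
```

Calling `sort_gz("input.gz", "sorted.gz")` with placeholder file names would run both phases. The chunk list is dropped after each run is written, so in this sketch the peak memory is set by `max_lines` during the presort and stays small during the merge.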
