GZ-Sort

A utility for sorting really big files.

  • Profile!
  • Parallelize the final n-way merge. This will require adding IPC.
  • Filter unique lines during the earlier passes.
  • Try out zlib-ng; about half of the CPU time is spent on (un)gzipping.
  • Improve memory estimation; it lowballs, and that hurts the presort.
  • Use byte-based seeking instead of line counting.
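
To make the merge and unique-filter items concrete, here is a minimal Python sketch of a final n-way merge over pre-sorted gzip chunks. It is not GZ-Sort's own code: the function name `merge_sorted_gz`, its parameters, and the `unique` flag are hypothetical, and it assumes every line ends with a newline and that plain byte order is the desired sort order.

```python
import gzip
import heapq
from itertools import groupby


def merge_sorted_gz(chunk_paths, out_path, unique=False):
    """Merge already-sorted gzip chunks into one sorted gzip file.

    Hypothetical helper for illustration only. Lines are compared as raw
    bytes, and each input chunk must already be sorted in that order.
    """
    files = [gzip.open(path, "rb") for path in chunk_paths]
    try:
        # heapq.merge is lazy: it pulls one line at a time from each chunk.
        merged = heapq.merge(*files)
        if unique:
            # Equal lines are adjacent in sorted output, so keeping one
            # line per group drops the duplicates.
            merged = (line for line, _ in groupby(merged))
        with gzip.open(out_path, "wb") as out:
            out.writelines(merged)
    finally:
        for f in files:
            f.close()
```

This version runs in a single process, so compression and decompression still happen on one core; parallelizing it would need the IPC mentioned above. It also filters duplicates only at the final merge, whereas filtering during the earlier passes would shrink the intermediate chunks as well.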