A utility for sorting really big files.

  • Profile!
  • Parallelize the final n-way merge. This will require adding IPC.
  • Filter out duplicate lines during the earlier passes.
  • Try out zlib-ng; about half of the CPU time is spent on (un)gzipping.
  • Improve memory estimation; it currently underestimates, which hurts the presort.
  • Use byte-based seeking instead of line counting.
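
As a rough illustration of two of the items above (the n-way merge and filtering unique lines early), here is a minimal single-process sketch in Python. It is not this utility's code: the function names are hypothetical, it assumes "filter unique lines" means dropping duplicates, and it leaves out gzip handling, parallelism, and IPC entirely.

```python
import heapq
from typing import Iterable, Iterator


def dedupe_sorted(lines: Iterable[str]) -> Iterator[str]:
    """Yield each line of an already-sorted stream once, dropping adjacent repeats."""
    previous = None
    for line in lines:
        if line != previous:
            yield line
            previous = line


def presort_chunk(lines: Iterable[str]) -> list[str]:
    """Sort one in-memory chunk and drop its duplicates before it is written out."""
    # Hypothetical helper: a set removes duplicates inside the chunk, so every
    # later pass has fewer lines to compress, decompress, and compare.
    return sorted(set(lines))


def merge_unique(runs: Iterable[Iterable[str]]) -> Iterator[str]:
    """Lazily n-way merge sorted runs, dropping duplicates that span runs."""
    # heapq.merge keeps only one pending line per run in memory at a time.
    return dedupe_sorted(heapq.merge(*runs))


if __name__ == "__main__":
    runs = [
        presort_chunk(["b\n", "a\n", "b\n"]),
        presort_chunk(["c\n", "a\n"]),
    ]
    for line in merge_unique(runs):
        print(line, end="")
```

Deduplicating inside each chunk does not remove the need for the check during the merge, since the same line can still appear in several runs. It only shrinks the intermediate files.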
