A utility for sorting files far larger than the available RAM and free disk space.

  • The input: 3 billion lines, 30GB compressed, 425GB uncompressed.
  • The rig: quad-core A8-7600, 16GB RAM, 256GB 850 Pro SSD with 90GB available.
  • The algorithm: a simple merge sort, predicted to finish in 10.2 hours. (Actual time: 9.5 hours.)
  • The output: 25.2GB compressed (16% smaller) with 5 million duplicate lines removed.
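
For readers unfamiliar with the approach, here is a minimal Python sketch of an external merge sort that keeps its intermediate runs compressed and drops duplicate lines during the merge. It is an illustration only, not this utility's actual code: the function names, the choice of gzip, and the run_size parameter are all assumptions made for the example.

```python
import gzip
import heapq
import os
import tempfile
from itertools import islice


def write_sorted_runs(lines, run_size, tmp_dir):
    """Split the input into sorted, gzip-compressed runs of at most run_size lines."""
    paths = []
    lines = iter(lines)
    while True:
        # Sorting a set also removes duplicates that fall inside one run.
        chunk = sorted(set(islice(lines, run_size)))
        if not chunk:
            break
        path = os.path.join(tmp_dir, f"run{len(paths):06d}.gz")  # hypothetical naming
        with gzip.open(path, "wt", encoding="utf-8") as f:
            f.writelines(line + "\n" for line in chunk)
        paths.append(path)
    return paths


def read_run(path):
    """Stream one compressed run back, one line at a time."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")


def merge_runs(paths):
    """k-way merge of the sorted runs, skipping lines equal to the previous one."""
    previous = None
    for line in heapq.merge(*(read_run(p) for p in paths)):
        if line != previous:
            yield line
            previous = line


def external_sort(lines, run_size=1_000_000):
    """Yield the unique input lines in sorted order using bounded memory."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        paths = write_sorted_runs(lines, run_size, tmp_dir)
        yield from merge_runs(paths)
```

For example, list(external_sort(["b", "a", "c", "a", "b"], run_size=2)) produces three runs and merges them into ["a", "b", "c"]. Keeping the runs compressed is what would let an input much larger than the free disk space fit; run_size would be tuned to the available RAM.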
