Sunday, July 20, 2008

Compression and decompression shoot out

New benchmark!

I decided to do another test suite, this time including decompression. The data is a concatenation of two tar archives. The first tar contains 3+ GB of C:\Program Files from a Windows XP installation. The second tar is a 3+ GB Ubuntu installation, sans /usr which is on a separate partition. In total this amounts to 6.53 GB.

The test host is my girlfriend's 3 GHz P4 with 1 GB RAM and 2 MB L2 cache. It is rated at ~6000 bogomips. The computer is running Ubuntu 8.04 with a custom 2.6.25.4 kernel. All files were on a FUSE-mounted NTFS partition.

In addition to processing time, I tried to measure RAM usage with GNU time, but I didn't get any meaningful results. I did manage to record page faults, and I may add those statistics later.
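For anyone wanting to repeat this kind of measurement, here is a minimal sketch of one way to record wall clock time, page faults and peak memory for a child process. It uses Python's resource module instead of GNU time, so it is not how the numbers in this post were gathered. The function name `measure` is my own invention, and the counters are only as reliable as what the kernel reports.

```python
import resource
import subprocess
import time


def measure(cmd):
    """Run cmd (a list of arguments) and return timing and memory counters.

    getrusage(RUSAGE_CHILDREN) accumulates over all children that have been
    waited for, so the fault counts are taken as before/after differences.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "wall_s": wall,
        "major_faults": after.ru_majflt - before.ru_majflt,
        "minor_faults": after.ru_minflt - before.ru_minflt,
        # ru_maxrss is a high-water mark over all children so far
        # (kilobytes on Linux), not a per-run difference.
        "max_rss_kib": after.ru_maxrss,
    }
```

Calling `measure(["bzip2", "-9k", "some.tar"])` (with a hypothetical file name) would return a dictionary with the four values. Because the peak memory figure is a high-water mark, it makes sense to measure each compressor from a fresh Python process.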

This time lha is out. It doesn't seem to like files this big. Since I was also going to benchmark decompression I decided to include lzo, which is allegedly very fast at decompression.

Complete list of the contestants:

  • bzip2 1.0.4 (-9k)
  • gzip 1.3.12 (-9c)
  • lrzip 0.23 (-w 9 -q)
  • LZMA SDK 4.43 (-9kq)
  • LZO library 1.08 (-9 -k)
  • RAR 3.71 (-m5)

gzip was invoked in redirect mode (-c) because I didn't want it to throw away the source file. This shouldn't really affect compression ratio or processing times.

Below are the results broken down into compressed size and ratio, compression time and decompression time.

Compressed size, from worst to best:

  • uncompressed: 7,011,041,280 (6.53 GB), ratio 1.000
  • lzo_________: 4,719,645,902 (4.40 GB), ratio 1.486
  • gzip________: 4,563,292,811 (4.25 GB), ratio 1.536
  • bzip2_______: 4,428,910,323 (4.12 GB), ratio 1.583
  • rar_________: 4,125,923,141 (3.84 GB), ratio 1.699
  • lzma________: 3,840,213,621 (3.58 GB), ratio 1.826
  • lrzip_______: 3,585,069,056 (3.34 GB), ratio 1.956
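The size and ratio columns can be recomputed from the byte counts. The sketch below assumes that "GB" in the list means 2^30 bytes and that the ratio is uncompressed size divided by compressed size. Both assumptions are consistent with the uncompressed line (7,011,041,280 bytes shown as 6.53 GB). The names `sizes`, `ratio` and `gib` are just for illustration.

```python
GIB = 1024 ** 3                 # "GB" in the list is assumed to mean 2^30 bytes
ORIGINAL = 7_011_041_280        # size of the concatenated tar archives in bytes

# Compressed sizes in bytes, copied from the list above.
sizes = {
    "lzo": 4_719_645_902,
    "gzip": 4_563_292_811,
    "bzip2": 4_428_910_323,
    "rar": 4_125_923_141,
    "lzma": 3_840_213_621,
    "lrzip": 3_585_069_056,
}


def ratio(compressed, original=ORIGINAL):
    """Compression ratio: how many times smaller the output is."""
    return original / compressed


def gib(nbytes):
    """Convert a byte count to binary gigabytes."""
    return nbytes / GIB


if __name__ == "__main__":
    for name, size in sizes.items():
        print(f"{name:<6} {gib(size):.2f} GB  ratio {ratio(size):.3f}")
```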

Wall clock compression time, from slowest to fastest:

  • lzma_: 8,409 s
  • lrzip: 7,904 s
  • rar__: 5,906 s
  • lzo__: 3,487 s
  • bzip2: 3,034 s
  • gzip_: 1,598 s
  • cat__: __111 s (for reference)

Wall clock decompression time, from slowest to fastest:

  • lrzip: 2,830 s
  • bzip2: 1,491 s
  • lzma_: __981 s
  • rar__: __604 s
  • gzip_: __503 s
  • lzo__: __449 s
  • cat__: __111 s (for reference)
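To put the times in perspective they can be converted to throughput, measured against the uncompressed size. This is only a rough sketch: it assumes the whole 7,011,041,280 bytes pass through each tool exactly once, and it uses MiB (2^20 bytes) as the unit. The names are made up for the example.

```python
MIB = 1024 ** 2
ORIGINAL_BYTES = 7_011_041_280  # uncompressed size in bytes

# Wall clock times in seconds, copied from the lists above.
compress_s = {"lzma": 8409, "lrzip": 7904, "rar": 5906,
              "lzo": 3487, "bzip2": 3034, "gzip": 1598, "cat": 111}
decompress_s = {"lrzip": 2830, "bzip2": 1491, "lzma": 981,
                "rar": 604, "gzip": 503, "lzo": 449, "cat": 111}


def throughput_mib_s(seconds, nbytes=ORIGINAL_BYTES):
    """Uncompressed data handled per second of wall clock time, in MiB/s."""
    return nbytes / MIB / seconds


if __name__ == "__main__":
    for name in compress_s:
        c = throughput_mib_s(compress_s[name])
        d = throughput_mib_s(decompress_s[name])
        print(f"{name:<6} compress {c:6.1f} MiB/s  decompress {d:6.1f} MiB/s")
```

By this measure cat manages roughly 60 MiB/s, which is presumably about as fast as the disk and the NTFS driver can go on this machine, so no compressor can be expected to beat it here.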



The shell script used to gather these statistics is here.