Compressing pi
June 10, 2004
Linux at 05:59:47 PM MT | Apparently, the universe's most delicious transcendental doesn't make a good compression test. I forget where I got them, but I had downloaded 1 billion digits of pi off the Internet, split into 1,000 text files of 1 million digits each. I was looking for a good way to compare several different compression programs, mostly just to see if rzip was really as good as I thought, and compressing pi seemed like the perfect test candidate.
And in the end, they all compressed pi down to roughly the same size. That's indicative of the digits of pi having few repeating patterns (doh, I knew that) that any of today's compression algorithms can detect.
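To see the same effect without a billion digits on hand, here is a minimal Python sketch. It is not the test I ran: it uses pseudo-random digits as a stand-in for pi (an assumption, since pi's digits only appear to behave like random ones) and the compressors in Python's standard library rather than the command-line tools. The helper names random_digits and compression_ratios are made up for illustration. A uniformly random decimal digit carries log2(10) bits of information but takes 8 bits to store as text, so no compressor should get much below about 41.5% of the original size.

```python
import bz2
import lzma
import math
import random
import zlib

# Information in one uniformly random decimal digit, as a fraction of the
# 8 bits used to store it as an ASCII character (about 0.415).
ENTROPY_BOUND = math.log2(10) / 8


def random_digits(n, seed=0):
    """Return n pseudo-random ASCII digits (a stand-in for pi, not pi itself)."""
    rng = random.Random(seed)
    return "".join(rng.choices("0123456789", k=n)).encode("ascii")


def compression_ratios(data):
    """Compress data with three stdlib codecs; return compressed/original size."""
    compressors = {
        "zlib": lambda d: zlib.compress(d, 9),  # DEFLATE, as used by gzip
        "bz2": lambda d: bz2.compress(d, 9),    # same algorithm as bzip2
        "lzma": lzma.compress,                  # LZMA, as used by 7-zip
    }
    return {name: len(fn(data)) / len(data) for name, fn in compressors.items()}


if __name__ == "__main__":
    ratios = compression_ratios(random_digits(1_000_000))
    for name, ratio in ratios.items():
        print(f"{name}: ratio {ratio:.3f} (entropy bound {ENTROPY_BOUND:.3f})")
```

All three ratios should land a little above the bound and fairly close to one another, because the only redundancy left to squeeze out is that each character is one of ten digits rather than one of 256 byte values.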
So, onto the tests. I took all the text files containing the pi digits and put them all into one file using tar (pi.tar shown below). I used gzip, bzip2, rzip, 7-zip, and rar (just to have a popular proprietary solution in the mix). Results (in KiB if it isn't obvious):
- pi.tar: 987012
- pi.7z: 438332
- pi.rar: 434792
- pi.tar.bz2: 434012
- pi.tar.gz: 470208
- pi.tar.rz: 433976
So, basically, all of them compressed pi.tar down to about the same size, roughly 424 to 428 MiB, with gzip the outlier at about 459 MiB. I suppose it's comforting that rzip did produce the smallest file, though the margin isn't of any consequence.
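For anyone who wants to check the arithmetic, the following Python sketch converts the listed KiB figures to MiB, computes each archive's size relative to pi.tar, and compares them with a rough lower bound. The bound assumes the digits behave like uniformly random ones, which is an assumption on my part and not something this test establishes; the function names are made up for illustration.

```python
import math

# Sizes in KiB, copied from the results list above.
SIZES_KIB = {
    "pi.tar": 987012,
    "pi.7z": 438332,
    "pi.rar": 434792,
    "pi.tar.bz2": 434012,
    "pi.tar.gz": 470208,
    "pi.tar.rz": 433976,
}

DIGITS = 1_000_000_000  # number of digits of pi in the test set


def entropy_bound_mib(digits=DIGITS):
    """Smallest size in MiB if every digit carried exactly log2(10) bits."""
    total_bits = digits * math.log2(10)
    return total_bits / 8 / 1024 / 1024


def summarize(sizes=SIZES_KIB, original="pi.tar"):
    """Map each archive name to (size in MiB, fraction of the original size)."""
    base = sizes[original]
    return {
        name: (kib / 1024, kib / base)
        for name, kib in sizes.items()
        if name != original
    }


if __name__ == "__main__":
    for name, (mib, ratio) in sorted(summarize().items(), key=lambda kv: kv[1]):
        print(f"{name}: {mib:.1f} MiB, {ratio:.1%} of pi.tar")
    print(f"entropy bound: {entropy_bound_mib():.1f} MiB")
```

Under that assumption the bound comes out near 396 MiB, so the best result here (rzip, about 423.8 MiB) is within roughly 7% of what is achievable at all, which leaves very little room for any one program to stand out.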
I used several different machines to carry out the compression. The 7-zip and rar archives were created on Microsoft Windows XP (on an Athlon 3000+), while the rest were done on Linux machines: an Athlon 2200+ for rzip, and an Athlon 1.3 GHz for gzip and bzip2. I don't have any timing numbers, but taking the speed of the machines into account, it seemed as if gzip was the fastest, followed by bzip2, rzip, and 7-zip (I can't place rar). 7-zip definitely took the longest: I started it first, on the fastest machine I had, and it was the last to finish.
So, what was the point of this? Absolutely nothing!