The Great CD Ripping Project

After nine and a half months (!) of work, two hard drives with a total of three-quarters of a terabyte of usable space, and over a thousand CDs, my project to rip all my CDs to losslessly compressed digital files is finished. There are other projects ahead: metadata updates (I have over two thousand tracks in my library with no year, for instance) and ripping obscure vinyl, to name two. But the heavy lifting is over.

How heavy was the lifting? Well, here’s the data:

  • Tracks: 13,978
  • Total time: 42 days, 2 hours, 40 minutes, 51 seconds
  • Disk space: 312.81 GB
  • Artists: 1,081
  • Albums: 1,029

Below are some charts and graphs that show my progress and highlight some interesting data points along the way:

progress chart for ripping the library losslessly

Statistics for Apple Lossless files

This table compares some aggregate statistics, including average track size in megabytes and albums and tracks per artist, for the CD part of my digital music collection, ripped using the Apple Lossless codec, with the rest of my music, which includes 128 and 192 kbps MP3s and 128 kbps protected AACs as well as various formats from podcasts. One interesting point is how stable the average file size is for Apple Lossless across multiple genres. Personally interesting to me is how different the tracks-per-artist numbers are for the Project, where I was ripping whole CDs, and for the other tracks, where I have the option to download a la carte. Buying a la carte roughly halved my tracks per artist (6.80 versus 12.93), but I am still downloading almost 7 tracks per artist, even counting things like multi-artist compilations, which dilute these numbers considerably.

Type     Avg Track Size (MB)  Albums/Artist  Tracks/Artist  Time/Track (m:ss)
Project  22.92                0.9519         12.93          4:20
Other     4.72                1.2998          6.80          4:10
All      16.30                1.1330          9.74          4:16
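The Project row can be reproduced from the totals in the summary list near the top of the post. The following is only a rough sketch of that arithmetic, not necessarily how the numbers were originally calculated; the variable names are mine, and it assumes 1 GB = 1024 MB, which is the convention that matches the 22.92 MB figure.

```python
# Totals quoted in the summary list for the ripping project.
tracks = 13_978
artists = 1_081
albums = 1_029
disk_gb = 312.81

# Total playing time: 42 days, 2 hours, 40 minutes, 51 seconds.
total_seconds = ((42 * 24 + 2) * 60 + 40) * 60 + 51

# Assumption: binary gigabytes (1 GB = 1024 MB).
MB_PER_GB = 1024

avg_track_mb = disk_gb * MB_PER_GB / tracks
albums_per_artist = albums / artists
tracks_per_artist = tracks / artists

# Average track length, rounded to the nearest second, as (minutes, seconds).
minutes, seconds = divmod(round(total_seconds / tracks), 60)

print(f"Avg track size: {avg_track_mb:.2f} MB")
print(f"Albums/artist:  {albums_per_artist:.4f}")
print(f"Tracks/artist:  {tracks_per_artist:.2f}")
print(f"Time/track:     {minutes}:{seconds:02d}")
```

With decimal gigabytes (1 GB = 1000 MB) the average track size would come out lower than the table value, which is why the binary convention is assumed here.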

Comparative statistics, lossless tracks vs. entire library

These charts pull out two of the top-level stats and put them in perspective. Almost 90% of the volume of my library is in the losslessly ripped tracks, but about a third of the tracks are not losslessly ripped, i.e. did not come from CDs. This is interesting because it roughly corresponds to the amount of time that I have been acquiring music as downloads rather than on physical discs. One question I have often wondered about the online music market is whether consumers will purchase more music if the frictions are lower. My example suggests that this is not necessarily the case; it seems as though, on average, I am buying as much music as I ever did, just distributing my purchases differently.
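Both headline percentages can be sanity-checked from the table alone. The sketch below is a back-of-the-envelope check, not the method used to produce the charts: it treats the "All" average track size as a track-weighted mix of the "Project" and "Other" averages and solves for the weights. The variable names are mine, and the inputs are the rounded table values, so the results are approximate.

```python
# Average track sizes in MB, copied from the table (rounded values).
avg_project = 22.92
avg_other = 4.72
avg_all = 16.30

# If a fraction w of all tracks belongs to the Project, then
#   avg_all = w * avg_project + (1 - w) * avg_other
# Solving for w gives the Project's share of the track count.
project_track_share = (avg_all - avg_other) / (avg_project - avg_other)
other_track_share = 1 - project_track_share

# Share of the total library size held by Project tracks.
project_size_share = project_track_share * avg_project / avg_all

print(f"Project share of tracks: {project_track_share:.1%}")
print(f"Other share of tracks:   {other_track_share:.1%}")
print(f"Project share of size:   {project_size_share:.1%}")
```

This gives a little over a third of the tracks outside the Project and just under 90% of the library size inside it, in line with the figures quoted above.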

% of library size vs. % of library tracks