Ripping your CD collection part 1 – audio encoding 101

Sixteen years after the birth of MP3 and nine years after the launch of the iPod, many people I know still persist in using audio CDs to play music in their home or car. Physical stores selling physical CDs still exist even though they are hopelessly uncompetitive compared to their online counterparts. Given the compelling advantages of modern audio encoding formats over the humble CD, I find the persistence of the CD audio format more than a little surprising.

Perhaps people have not ripped their CDs and put them in storage due to the perceived complexity or diversity of audio formats and the labour-intensive ripping process. The process of converting a large CD collection can be daunting and the risk of getting it wrong and having to redo it all again looms large. A Google search reveals millions of pages on encoding formats, acronyms, complex technical terms, various tools claiming to be “the best” and lots of silly arguments between so-called audiophiles. Perhaps that’s why most put it in the too-hard basket and continue to buy and use CDs?

In this first of a two-part series I describe the history and current state of audio encoding techniques and dispel some of the myths and common arguments. In the second part I will explain in detail the actual process of ripping a large CD library and sharing it with others on your home network.

Although digital audio recording and engineering pre-dates the advent of the Compact Disc in the 1980s, it was this media format that popularized it to a mass market outside of recording studios. The designers of the CD Audio standard (known as the “Red Book” standard after the red cover of the specification document) came up with a specification that neatly fits up to 74 minutes of digitised high quality stereo music on a single laser-etched disk. This was over 20 minutes more than the combined length of a two-sided vinyl LP album. As a bonus, the digital format gave much higher dynamic range (the difference between peak music volume and background noise) than vinyl and was not as prone to dust and scratches.

History shows that the CD audio format quickly became the preferred medium for music distribution and the whole era of digital media was born.

Through the 1980s and early 1990s the recording industry flourished under the new format and many people (including me) “upgraded” their old vinyl LP album collection to CD. I eventually sold my complete LP collection in 1992 at a garage sale. Some audio engineers and even pop groups specialised in getting the most from the new digital format.

Even with the rise of the PC, the sheer volume of data supported by the CD audio format – 800MB per disk – ensured it reigned supreme and impervious to all challengers. In 1994, PCs cost over $2,000 ($3,000 in today’s terms) and few had more than 1GB of total storage. Copying a CD would take almost an entire expensive hard disk – and as you couldn’t move it anywhere, what would be the point?

In 1991 a group of experts were trying to come up with an efficient way to digitally encode video and audio to get the best quality results using the smallest amount of data. It was important work as it was to set the standards for what was to eventually become the format used by VCDs, digital video cameras, DVDs and digital TV. International standard ISO/IEC 11172 (MPEG-1) was the result. The full formal name for the audio format defined in part 3 of this standard is “Moving Picture Experts Group 1 Audio Layer 3” but of course it is much better known simply as “MP3”.

In March 1994 a reference implementation (a sample computer program) was included on the document CD of the standard. In July of that same year the first audio encoder program hit the bulletin boards and fledgling internet. The encoder magically converted 800 MB audio CDs into files with a “.mp3” extension (mp3 files) 1/10th of the original size. Given the size and cost of digital storage and bandwidth at the time, this was close to miraculous and a crucial factor in the new format’s explosion.

All of a sudden, the CD format was vulnerable to large scale piracy – tracks and even whole albums could be exchanged and, unlike audio tape, mp3 files could be copied perfectly and reproduced without any loss in quality. Napster and other illicit services came and went. Lawsuits were filed and the format became synonymous with piracy. But through it all, the MP3 format prevailed. Now you can legitimately download MP3 tracks from numerous online stores at a fraction of the cost of the physical version.

Compression vs quality
As a music listener, you don’t need to know how MP3 works but you do need to know that this magic compression comes at a price, and that price is quality. One of the methods the MP3 algorithm uses to make audio files so small is to literally ignore some information (i.e. music) during the conversion process. What’s so cool about the standard is that it can work out which information can be discarded with little apparent change in quality as detected by the human ear.

The problem is that this coolness has its limits: the more information you throw away, the more likely it is your ear will notice. So the more you compress audio (the smaller you make the file), the more likely it is you will notice a degradation in quality.

Bit rate is measured in kilobits per second (kbps) – a kilobit being 1,000 bits of information (a bit being a zero or a one). In general, the fewer bits per second, the less information is retained, so the higher the compression and the lower the resulting quality. The MP3 standard supports bit rates from 32 kbps to 320 kbps. 32 kbps makes for really small files but music encoded at this rate sounds like it’s coming over a telephone or old AM radio. Files encoded at 320 kbps are ten times larger but have been shown in blind tests to be indistinguishable from the original source. In the early days of MP3, most people settled on 128 kbps as the standard bit rate for MP3 music files – a nice compromise between size and quality. Most files shared on early peer-to-peer services were encoded at 128 kbps.

As the quality of computer audio components improved and storage costs decreased, music files at 160, 192 and 256 kbps became more common. Most commercial services now offer MP3 files at the highest standard bit rate of 320 kbps to ensure the highest quality audio.
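
To put rough numbers on the size vs quality trade-off, here is a minimal Python sketch. The function name and the four-minute track length are made up for illustration, and the calculation ignores tags and file headers, so treat the results as approximations only.

```python
def cbr_size_bytes(bitrate_kbps: float, duration_seconds: float) -> float:
    """Approximate size of a constant bit rate file (tags and headers ignored)."""
    total_bits = bitrate_kbps * 1000 * duration_seconds  # 1 kbps = 1,000 bits per second
    return total_bits / 8  # 8 bits per byte


# Hypothetical four-minute (240 second) track at a few common bit rates.
for rate in (32, 128, 192, 320):
    size_mb = cbr_size_bytes(rate, 240) / 1_000_000
    print(f"{rate:>3} kbps: about {size_mb:.2f} MB")
```

At 128 kbps the four-minute track works out to about 3.84 MB, and at 320 kbps to about 9.6 MB.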

Constant vs variable bitrate
So far I have talked about files that have a constant bit rate (CBR) – the file uses the same bit rate for the entire contents. However, the standard allows for a single file to contain multiple bit rates. This is known as variable bit rate encoding or VBR.

The advantage of VBR encoding is that the amount of information used for storing the compressed music can change depending on the complexity or detail of the music source. So for complex parts, 320 kbps can be used and for less complex parts, lower rates can be used. Remember that lower bit rates mean less space, so VBR encoding allows for the ultimate compromise in size vs quality – very high quality, very small audio files can be produced.
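
The saving can be sketched in a few lines of Python. This is only an illustration: a real encoder chooses a bit rate for every short frame of audio, whereas the example below assumes a made-up track split into three long segments with invented bit rates.

```python
def vbr_summary(segments):
    """Return (size in bytes, average kbps) for a list of (seconds, kbps) pairs."""
    total_seconds = sum(seconds for seconds, _ in segments)
    total_bits = sum(seconds * kbps * 1000 for seconds, kbps in segments)
    average_kbps = total_bits / total_seconds / 1000
    return total_bits / 8, average_kbps


# Hypothetical track: quiet intro, busy middle section, simpler outro.
track = [(60, 128), (120, 320), (60, 160)]

size_bytes, average_kbps = vbr_summary(track)
print(f"VBR file: about {size_bytes / 1_000_000:.2f} MB, averaging {average_kbps:.0f} kbps")
```

For this invented track the VBR file is about 6.96 MB with an average of 232 kbps, against about 9.6 MB for the same four minutes at a constant 320 kbps.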

One of the perennial arguments you will find on the internet is whether VBR is better than CBR, which bit rate is better than others and whether VBR still has any relevance when storage is so cheap. I do not want to rekindle the flame wars here but just for clarity, properly encoded 320 kbps MP3 has been proven in blind tests to be indistinguishable from the original source. Further, a well encoded VBR file at a high maximum bit rate will be of a similarly high quality. In short, VBR encoding remains the method of choice for high quality audio where storage or bandwidth is a consideration – think a mobile phone or iPod.

Sample rates
The sample rate defines the number of times a second the audio digitizer will sample (look at) an audio signal source. In general, the more times a digitizer samples music every second, the more accurate the resulting digital signal will be. Sample rates are quoted in Hertz (Hz, cycles per second). The most common sample rate is 44,100 Hz – the rate used for CD Audio. As it takes at least two samples per cycle to capture a given tone, this rate can represent frequencies up to 22,050 Hz – beyond the roughly 20,000 Hz limit of human hearing and better than most retail audio equipment.
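
The "two samples per tone" rule can be written down directly: the highest frequency a sample rate can capture is half that rate. The Python sketch below uses hypothetical function names and also shows, under the simplifying assumption of an ideal sampler with no filtering, what happens to a tone above that limit: it folds back and shows up as a lower, wrong frequency.

```python
def highest_frequency_hz(sample_rate_hz: float) -> float:
    """Highest tone a given sample rate can represent (half the sample rate)."""
    return sample_rate_hz / 2


def apparent_frequency_hz(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency an ideal sampler would record for a pure tone (no filtering assumed)."""
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)


for rate in (44_100, 48_000, 96_000):
    print(f"{rate} Hz sampling captures tones up to {highest_frequency_hz(rate):.0f} Hz")

# A 1,000 Hz tone is recorded correctly; a 25,000 Hz tone is not.
print(apparent_frequency_hz(1_000, 44_100))
print(apparent_frequency_hz(25_000, 44_100))
```

In practice, recording equipment filters out frequencies above the limit before sampling so that this fold-back does not occur.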

The advent of the DVD format introduced a new sampling rate – 48,000 Hz. Audiophiles sometimes use sample rates up to 96,000 Hz but you won’t find many of these types of files – or equipment that plays them.

Bit depth
Bit depth is the number of bits used to store information for each sample of music. The bit depth determines the number of gradations (levels if you like) a digital format can use to represent the relative volume of a single sample. It is measured in bits and works like bit depth in graphics – the higher the bit depth in graphics, the more subtle (and accurate) the colour reproduction. In audio, the higher the bit depth, the more accurately the relative sound levels can be recorded.

The greater the bit depth, the bigger the difference between the loudest sound and the background noise level – this is known as dynamic range. The higher the dynamic range the better, but as in most things it’s a compromise: the more bits used per sample, the bigger the overall file and the more CPU grunt needed to encode and decode.
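
The link between bit depth and dynamic range can be estimated with a common rule of thumb: each extra bit doubles the number of levels, which adds roughly 6 dB. The Python sketch below assumes ideal quantisation and ignores real-world effects such as dither and analogue noise, so the figures are theoretical ceilings rather than measured results.

```python
import math


def levels(bit_depth: int) -> int:
    """Number of distinct volume levels a sample can take."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical ratio between the largest and smallest level, in decibels."""
    return 20 * math.log10(levels(bit_depth))


for bits in (8, 16, 24):
    print(f"{bits:>2} bits: {levels(bits):>10,} levels, about {dynamic_range_db(bits):.0f} dB")
```

For 16 bits this gives 65,536 levels and a theoretical dynamic range of about 96 dB.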

Common audio bit depths are 8, 16 or 24. The CD Audio format used 16 bits so that has become the de facto standard for all digital audio encoding.
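
Putting sample rate and bit depth together gives the raw data rate of CD audio, which is where the large per-disk figure and the roughly ten-to-one saving of 128 kbps MP3 come from. The Python sketch below is a back-of-envelope calculation with hypothetical names. It counts audio samples only and ignores the error correction and other overhead stored on a physical disk.

```python
SAMPLE_RATE_HZ = 44_100  # samples per second, per channel
BIT_DEPTH = 16           # bits per sample
CHANNELS = 2             # stereo


def pcm_bit_rate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bits per second needed for uncompressed audio."""
    return sample_rate_hz * bit_depth * channels


def pcm_size_bytes(minutes: int) -> int:
    """Uncompressed size of CD-quality audio of the given length."""
    bytes_per_second = pcm_bit_rate_bps(SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS) // 8
    return bytes_per_second * minutes * 60


cd_bps = pcm_bit_rate_bps(SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS)
print(f"CD audio bit rate: {cd_bps / 1000:.1f} kbps")
print(f"74 minutes of CD audio: about {pcm_size_bytes(74) / 1_000_000:.0f} MB")
print(f"Ratio to a 128 kbps MP3: about {cd_bps / 128_000:.1f} to 1")
```

CD audio runs at 1,411.2 kbps, so a full 74-minute disk holds roughly 783 MB of audio, about eleven times the data of the same music at 128 kbps.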

Other formats
For a while, the MP3 audio format ruled as the de facto standard but with the advent of the iPod and iTunes, Apple attempted to popularise a new audio encoding standard called Advanced Audio Coding (AAC). It was formally standardised in 1997 as ISO/IEC 13818-7 and added as an extension to the MPEG-2 standard.

To complicate matters, Apple chose to use a file suffix of M4A (non-protected) and M4P (protected) to designate AAC encoded files. Even though there is no such thing as MP4 audio (MPEG-4 is a video standard that supports multiple audio formats), that’s what people started calling it, mistaking it for a successor to MP3. Apple did little to discourage this impression and made much of the fact that the new standard had slightly better quality results at lower bandwidths than MP3. This was technically true – but in reality for most people the difference is marginal.

Microsoft weighed in with their Windows Media Audio format (WMA). They also claimed better quality at lower rates than MP3.

And then came even more formats – too many to mention here – but the problem with all these formats (including WMA and AAC) is that they never became the new “standard”. AAC compatibility is limited to the Apple device family and WMA support is limited to the Microsoft ecosystem. MP3 remains king and is supported by everyone and every device – even my car supports MP3.

So MP3 remains the format of choice for portability.

Lossy vs Lossless
With the huge increases in bandwidth and data storage, the relative size of files became less of an issue and attention started turning to improved audio quality.

All the formats mentioned above are known as “lossy” formats, i.e. they deliberately lose information in the encoding process – information that is lost forever and can never be replaced. An alternate solution is a lossless format – one where the file is still compressed but no information is lost in the encoding process. When the encoded file is decoded and played back it is always identical to the original source – by definition.
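
The difference can be demonstrated with a toy Python example. The stand-ins are assumptions made purely for illustration: zlib (a general-purpose compressor from the Python standard library) plays the part of a lossless audio codec, and a crude scheme that throws away the low 8 bits of every 16-bit sample plays the part of a lossy one. Neither resembles how FLAC or MP3 actually work internally.

```python
import math
import struct
import zlib


def make_tone(sample_count: int = 1000) -> list:
    """A short 440 Hz test tone as 16-bit sample values."""
    return [int(10_000 * math.sin(2 * math.pi * 440 * i / 44_100)) for i in range(sample_count)]


def lossless_round_trip(samples: list) -> list:
    """Compress and decompress with zlib; nothing is discarded."""
    raw = struct.pack(f"<{len(samples)}h", *samples)
    restored = zlib.decompress(zlib.compress(raw))
    return list(struct.unpack(f"<{len(samples)}h", restored))


def crude_lossy_round_trip(samples: list) -> list:
    """Keep only the top 8 bits of each sample; the rest is gone for good."""
    return [(sample >> 8) << 8 for sample in samples]


tone = make_tone()
print("lossless copy identical:", lossless_round_trip(tone) == tone)
print("lossy copy identical:", crude_lossy_round_trip(tone) == tone)
```

The lossless copy compares equal to the original sample for sample, while the lossy copy does not and cannot be repaired afterwards.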

One of the first implementations of a lossless audio format in 2001 was FLAC – the Free Lossless Audio Codec. FLAC files are generally 30-50% smaller than uncompressed files but with no loss in quality. For this reason FLAC is generally used by audiophiles and for archiving CD audio collections.

Rather than support an open, free standard, Apple and Microsoft launched their own proprietary lossless audio standards (ALAC and WMA Lossless respectively) but again neither really caught on outside their ecosystems.

While it has become the de facto lossless standard, FLAC still suffers from the same lack of support as WMA and AAC. Outside of PCs, very few mainstream devices support the format. Worse, as FLAC has been around for almost 10 years, this situation is unlikely to change any time soon. So when it comes to audio formats you are left with a choice of quality vs portability.

Meta data
Meta data is information stored in the digital music file that describes the content of the file e.g. artist, file name, track number, album, year etc. The standard for storing this information in MP3 (and other file formats) is called ID3. It’s a pretty “loose” standard, meaning there are different interpretations of the standard by application and device providers, so it doesn’t work as universally well as it probably should. Complicating matters further is the existence of different versions of the standard – commonly known as v1.1, v2.3 and v2.4. In short – it’s a mess.

Data is stored in a format known as “tags” so you might hear about “meta tags” or “ID3 tags”.

Fortunately the basic information about a track (title & artist) has not changed so most devices can at least decode and display that much.

It is important to note that it’s the meta data stored in a track file that is used when it comes to cataloguing files in a music library – not the file name or folder structure. It is therefore imperative to create and maintain your meta data accurately.
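
To give a feel for what a tag looks like, here is a minimal Python sketch that decodes the oldest and simplest variant, ID3v1/v1.1, which is a fixed 128-byte block at the very end of the file. The function names are hypothetical, the Latin-1 text encoding is an assumption (the old tags do not declare one), and the newer v2.3 and v2.4 tags, which sit at the start of the file and use a different layout, are not handled.

```python
import os


def parse_id3v1(tag: bytes):
    """Decode a 128-byte ID3v1/v1.1 block, or return None if it is not one."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width and padded, usually with zero bytes.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    info = {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }
    # v1.1 stores the track number in the last comment byte, after a zero byte.
    if tag[125] == 0 and tag[126] != 0:
        info["track"] = tag[126]
    return info


def read_id3v1(path: str):
    """Read the last 128 bytes of a file and try to decode them as a tag."""
    if os.path.getsize(path) < 128:
        return None
    with open(path, "rb") as handle:
        handle.seek(-128, os.SEEK_END)
        return parse_id3v1(handle.read(128))


# A hand-built tag for illustration; the names in it are invented.
example = (
    b"TAG"
    + b"Example Song".ljust(30, b"\x00")
    + b"Example Artist".ljust(30, b"\x00")
    + b"Example Album".ljust(30, b"\x00")
    + b"1999"
    + b"".ljust(28, b"\x00") + b"\x00" + bytes([7])  # empty comment, then track 7
    + bytes([255])  # genre byte, not decoded here
)
print(parse_id3v1(example))
```

Because the title and artist sit at fixed positions, even very simple players can read them, which is consistent with those two fields being the most widely displayed.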

There are plenty of tools available for the bulk maintenance of meta data. I recommend foobar2000 and MediaMonkey.

Freedb & AccurateRip
Freedb and AccurateRip are internet based databases used to assist the process of ripping CDs.

Freedb provides meta data to audio CDs during the ripping process. The CD audio format did not allow for naming of tracks and artists so the ripping process must rely on freedb to get this information. Nearly every ripping tool supports freedb – saving you lots of typing!

AccurateRip can be used to verify the accuracy of a specific rip. Using some funky maths, your ripping software can compare the results of your rip against others who ripped the same song on the same CD. The logic being that if your result matches others, then it’s most likely accurate.
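
The idea behind this kind of verification can be sketched in Python. Everything here is a stand-in: AccurateRip uses its own checksum and an online database, whereas this example assumes CRC32 from the standard library and a plain list of checksums supposedly submitted by other people. The function names and data are invented for illustration.

```python
import zlib
from collections import Counter


def rip_checksum(audio_bytes: bytes) -> int:
    """Stand-in checksum of the ripped audio data (CRC32, not the real algorithm)."""
    return zlib.crc32(audio_bytes)


def matching_rips(my_checksum: int, submitted_checksums: list) -> int:
    """How many other people's rips of this track produced the same checksum."""
    return Counter(submitted_checksums)[my_checksum]


# Pretend audio data for one track, plus a copy with a single flipped bit.
good_rip = bytes(range(256)) * 4
bad_rip = bytearray(good_rip)
bad_rip[10] ^= 1

# Pretend three other people ripped the same track cleanly.
submitted = [rip_checksum(good_rip)] * 3

print("good rip matches:", matching_rips(rip_checksum(good_rip), submitted))
print("bad rip matches:", matching_rips(rip_checksum(bytes(bad_rip)), submitted))
```

The more independent rips agree with yours, the more confident you can be that yours is accurate. A rip with no matches is either damaged or comes from a different pressing of the CD.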

In the next part of this series I will discuss the process of ripping your entire CD collection and building your music library.

2 comments to Ripping your CD collection part 1 – audio encoding 101

  • George

    Dear Ben!

    Thanx for the great background-story on encoding. I have a +4000 CD-collection that I would like to rip in a lossless format that would work on my Macs/Itunes/Ipod. Is ALAC the only option? And is there any hardware that can rip all these CD:s faster than one by one?



    • Ben


      Thanks for the comments.

      If you intend on staying in the Apple domain then I would recommend ALAC as your best option as it’s the only lossless format supported by these devices. You will have problems if you go outside the Apple domain so you may also want to consider encoding into 320 kbps MP3 at the same time and maintaining two libraries… Alternatively you can always transcode to MP3 as required using iTunes…

      As far as automating such a large rip process, I am sorry I cannot be much help. I only had 200 or so to do and that took about a week on and off so I cannot imagine what 4000+ would take. Looking at hydrogenaudio there seem to have been some very expensive hardware solutions around a few years back but I doubt they are still available. Given the price of USB CD writers it would be possible to rig up 4 or 5 drives to rip in parallel but you’d still have to manually load the disks…

      BTW you are going to need a bit of storage – 4000 CDs in ALAC will be about 800GB. That doesn’t sound so bad until you start considering how you are going to back it up. You are going to need a few 1TB USB drives…

      Good luck with it. I’d be interested to hear how you go.

