MP3 and recording

Praat for Beginners:
Tutorial: What does MP3 do to your recordings?

  1. What is MP3?
  2. What does MP3 do to a recording?
  3. Why compress anyway for speech research?


1. What is MP3?

  • MP3 is a patented digital audio format using lossy compression (i.e. the original, pre-compression sound quality cannot be restored). Compression is useful for making files smaller for easier storage or transfer across the internet. With MP3, that compression comes at the expense of sound quality.
    • The argument for MP3 is that most people have some hearing loss in the higher frequency ranges and cannot hear everything in a complete hi-fi recording (which has all information intact up to 20kHz).
    • The argument against MP3 for speech research is that you need the cleanest and most complete recording possible, to be sure that you are measuring original speech behaviour rather than some artefact of the recording procedure.
  • The measure of MP3 compression is the bit rate, the amount of information transferred per second, measured in kilobits per second (kb/s). The smaller the bit rate, the greater the compression.
    • The usual least compressed bit rate, 192kb/s, sounds like a good player or radio, but not hi-fi.
    • Bit rates below 100kb/s sound like small transistor radios.
    • A bit rate of 32kb/s sounds barely better than a telephone.
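
To put these bit rates in perspective, the following minimal sketch compares them with the data rate of uncompressed CD-quality audio. It assumes 16-bit stereo at a 44.1kHz sampling rate (1411.2kb/s); the function name compression_ratio is illustrative only, and Python is used here simply to do the arithmetic.

```python
# Assumption: "uncompressed" means CD-quality PCM,
# i.e. 44.1 kHz sampling, 16 bits per sample, 2 channels.
CD_BIT_RATE_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kb/s


def compression_ratio(mp3_kbps: float) -> float:
    """How many times smaller an MP3 stream is than CD-quality PCM."""
    return CD_BIT_RATE_KBPS / mp3_kbps


# The four bit rates discussed in this tutorial.
for rate in (192, 128, 96, 32):
    print(f"{rate:>3} kb/s: about {compression_ratio(rate):.1f} times smaller than CD quality")
```

Under these assumptions, 192kb/s is roughly 7 times smaller than the uncompressed signal, and 32kb/s roughly 44 times smaller.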


2. What does MP3 do to a recording?

  • This section demonstrates the effect of MP3 by comparing several versions of a hi-fi recording compressed by various amounts.
  • A Bach mass for choir and orchestra was recorded in stereo on location in a large church with an audience present, using a laptop computer, an external microphone pre-amp and digitizer (USB), and near hi-fi microphones (rated at 18kHz). The recording was subsequently compressed by MP3 to 192kb/s, 128kb/s, 96kb/s and 32kb/s (using the GoldWave audio editor). The same two-second extract was taken from the left channel of each of the five recordings (the original and the four compressed versions) for an LTAS (long-term average spectrum). These two seconds included trumpets and very high soprano notes (f0 above 800Hz).
  • Here are fifteen-second extracts around the LTAS sequences at three bit rates:

The results are shown below.

  The original hi-fi recording, with energy right up to 20kHz and a dynamic range of about 90dB. The sharp spikes are the higher partials (overtones) of the soprano voices (f0 above 800Hz).
  The usual best MP3 setting (i.e. least destructive) at 192kb/s, with energy reaching to some 15kHz, i.e. a loss of nearly 25% of the upper frequency range.
  MP3 compression at 128kb/s, with a frequency range up to some 14kHz.
  MP3 compression at 96kb/s, reaching to some 13kHz.
  MP3 compression at 32kb/s (i.e. the most destructive), reaching to 5kHz (i.e. just better than telephone reception, or the speaker of a pocket radio).
  Comparison of the original hi-fi recording and the four degrees of MP3 compression, based on the levels of the soprano overtones.
  • These diagrams demonstrate the loss of higher frequencies in compressed recordings.
  • In addition, there are distortions within the signal that are not demonstrated here.
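
A comparison of this kind can be approximated outside Praat. The sketch below is a minimal illustration, not the procedure used for the diagrams in this tutorial: it averages the power spectra of overlapping Hann-windowed frames and reports the highest frequency whose level lies within a chosen range of the spectral peak. It assumes NumPy is installed. The function names (ltas_db, upper_limit_hz), the frame length and the 60dB floor are arbitrary choices, and the test signal is synthetic (two sine waves), not the Bach recording.

```python
import numpy as np


def ltas_db(samples, sample_rate, frame_len=2048):
    """Long-term average spectrum: mean power over half-overlapping
    Hann-windowed frames, returned as (frequencies in Hz, level in dB)."""
    hop = frame_len // 2
    window = np.hanning(frame_len)
    starts = range(0, len(samples) - frame_len + 1, hop)
    spectra = [np.abs(np.fft.rfft(samples[s:s + frame_len] * window)) ** 2
               for s in starts]
    mean_power = np.mean(spectra, axis=0)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    # The small constant avoids log(0) in empty frequency bins.
    return freqs, 10.0 * np.log10(mean_power + 1e-12)


def upper_limit_hz(freqs, level_db, floor_db=60.0):
    """Highest frequency whose level is within floor_db of the peak level."""
    present = np.nonzero(level_db >= level_db.max() - floor_db)[0]
    return float(freqs[present[-1]])


# Synthetic two-second test signal (an assumption for illustration only):
# a strong 1 kHz tone plus a weaker 12 kHz "overtone".
SAMPLE_RATE = 44_100
t = np.arange(2 * SAMPLE_RATE) / SAMPLE_RATE
low = np.sin(2 * np.pi * 1_000 * t)
high = 0.1 * np.sin(2 * np.pi * 12_000 * t)

full_limit = upper_limit_hz(*ltas_db(low + high, SAMPLE_RATE))
cut_limit = upper_limit_hz(*ltas_db(low, SAMPLE_RATE))  # overtone removed
print(f"with overtone:    energy up to about {full_limit:.0f} Hz")
print(f"without overtone: energy up to about {cut_limit:.0f} Hz")
```

Removing the upper component lowers the reported limit, which is the kind of difference the diagrams show between the original and the compressed versions. To try this on real recordings, the samples would first have to be read from the sound files into arrays.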


3. Why compress anyway for speech research?

  • Perhaps you have a small digital recorder that only does MP3. In that case, use the largest bit rate it offers at a 44.1kHz sampling rate, preferably a minimum of 192kb/s. If it can’t manage 128kb/s or 44.1kHz sampling, give it to the kids to play with and consider replacing it.
  • Most small recorders available today will record in the uncompressed WAV format as well as MP3 and come with built-in microphones of a quality that matches recorder performance (as always you get what you pay for, cheap or expensive).
  • Most use small flash memory cards (like cameras), available in sizes of several gigabytes, and you can keep extra cards in reserve, so storage is not critical today.
  • To get some idea of what is currently available, look at the very comprehensive online catalogue at B&H in New York (see their address below and click through to the audio department). This is just an example; I’m sure you know of a supplier nearer to home.
  • Make CD-quality recordings for your archive and backup, make copies to work with, and reserve MP3 compression for extracts to send over the web as needed.
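
As a rough check on the point that storage is not critical, this sketch estimates how many hours of uncompressed audio fit on a memory card. It assumes CD-quality WAV (44.1kHz, 16-bit, stereo), counts a gigabyte as 10^9 bytes, and ignores file headers and file-system overhead; hours_per_card is a hypothetical helper name.

```python
def hours_per_card(card_gb, sample_rate=44_100, bits=16, channels=2):
    """Approximate hours of uncompressed PCM audio that fit on a card."""
    bytes_per_second = sample_rate * (bits // 8) * channels
    return card_gb * 1_000_000_000 / bytes_per_second / 3600


# Example card sizes, chosen only for illustration.
for size_gb in (1, 4, 16):
    print(f"{size_gb:>2} GB card: about {hours_per_card(size_gb):.1f} hours of CD-quality WAV")
```

Under these assumptions, even a 1 GB card holds roughly an hour and a half of uncompressed stereo.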


©Sidney Wood and SWPhonetics, 1994-2012