Praat for Beginners: Managing sound

  1. Analog and digital sound
    1A. Digital recording parameters
      1Ai. Sampling rate and Nyquist frequency
      1Aii. Amplitude resolution
  2. Mono and stereo sound
  3. Praat Sound objects
  4. Computer sound systems
  5. Storing sound in sound files


1. Analog and digital sound

  • Analog sound has been converted (transduced) from atmospheric sound to some other medium, where some other property varies proportionately to the original sound pressure fluctuations. The converted form is said to be analogous to the original sound pressure. In practice, there could be a chain of different analog conversions through microphone, recording medium, playback device etc.
  • The most common example of analog sound has to be the vibrations of the ear drum, a membrane whose motion is an analog of the original sound pressure fluctuations.
  • Electricity is the most common technical medium, allowing the use of electronics for microphones, transmitters, receivers, amplifiers, filters and loudspeakers.
  • More examples of analog sound:
    • The voltage output from a microphone varies proportionately to the original sound pressure fluctuation.
    • The wavy groove of a gramophone record, which is converted on playback to analogous vibrating motion in the pickup stylus, which in turn is converted to voltage variation by the pickup cartridge.
    • The fluctuating magnetization of magnetic recording tape, which is converted on playback to fluctuating voltage by the playback head.
    • The primary acoustic transducers for speech research are the microphone, and the loudspeakers or earphones that restore the sound back to vibrating air.
  • A digital representation of sound is a sequence of numbers, for example the numerical values of the fluctuating sound pressure amplitude measured at regular intervals (the number of measurements per second is the sampling rate). What actually gets measured is the analogous fluctuating voltage amplitude in the computer sound system, and that is no better than your microphone can deliver.
  • With a numerical representation of sound, calculation is possible. For example, vowel formant frequencies can be calculated instead of filtering for them on a workbench. So far, these calculations are no different from what acousticians have always done with pencil, paper and slide rule. But today, the computer is added as a tool. Monotonous calculations that used to take months to complete can now be done in seconds, and the mathematical models for signal processing are available to all. That is, assuming a computer program has been written, and that you can afford to buy it. Praat is free, thanks to the dedication of its authors and sponsors.
  • The process of transforming analog sound to digital sound is known as analog to digital conversion (or ADC, or A/D).
  • The process of transforming digital sound back to audible analog sound, so that it can be heard again, is known as digital to analog conversion (or DAC, or D/A).
  • Both functions are performed by the computer sound system. The Praat Sound recorders (mono or stereo) control the sound system during conversion, telling it to start and stop and what sound quality to go for.
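
The idea of a digital representation as a sequence of numbers can be illustrated with a short Python sketch. It is not how Praat or a sound system works internally; it simply calculates what a pure 100 Hz tone would look like if its amplitude were measured 1,000 times a second. The function name `sample_sine` and all the values are made up for this illustration.

```python
import math

def sample_sine(freq_hz, sampling_rate_hz, duration_s, amplitude=1.0):
    """Return the amplitude of a sine tone measured at regular intervals."""
    n_samples = round(duration_s * sampling_rate_hz)
    sampling_period_s = 1.0 / sampling_rate_hz  # time between two measurements
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n * sampling_period_s)
        for n in range(n_samples)
    ]

# One full cycle of a 100 Hz tone (10 ms) sampled 1,000 times a second
samples = sample_sine(freq_hz=100, sampling_rate_hz=1000, duration_s=0.01)
print(len(samples))                    # number of measurements
print([round(s, 3) for s in samples])  # the digital sound: just a list of numbers
```

Once the sound is a list of numbers like this, any further analysis is a matter of arithmetic on that list.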


1A. Digital recording parameters

1Ai. Sampling rate and Nyquist frequency

  • Sampling rate (or sampling frequency) determines how often the amplitude of the analog sound is sampled. This is set before recording.
    • The highest useful sound frequency that can be recorded is half the value of the sampling rate. This highest useful sound frequency is known as the Nyquist frequency. Spectral activity in the original sound above the Nyquist frequency cannot be represented correctly. If it reaches the converter, it shows up in the digital recording as spurious false tones at lower frequencies (aliasing). To prevent this happening, anti-aliasing filters (also part of the computer sound system) remove activity above the Nyquist frequency before the sound is sampled, so that the Nyquist frequency effectively becomes the upper limit of the recording. For example, if the sampling rate is set at 10 kHz (10,000 samples a second) then the Nyquist frequency is 5 kHz (5,000 cycles per second). That’s not very good for speech since it misses the typical spectral activity of dental consonants that is mostly higher than 5 kHz. To capture all the spectral activity of speech, a Nyquist frequency of around 10 kHz is necessary, obtained by setting a sampling rate of 20 kHz. Hi-fi recording captures all spectral activity up to 20 kHz, requiring a sampling rate of at least 40 kHz.
    • For comparison, the upper limit of telephone transmission is around 3.5 kHz, of the loudspeaker of a miniature radio about 3-10 kHz depending on size and quality, and of FM radio transmission anything up to about 15 kHz. The upper frequency limit of microphones depends on their design and intended purpose, from telephone microphones at 3.5 kHz to hi-fi microphones at 20 kHz or more.
    • In order to take advantage of the recording quality you set, the frequency range of your microphone will have to be better than the Nyquist frequency.
    • A second practical consequence is the relation between the sampling rate and the temporal resolution of the recording. There is always a gap between any two sampling moments, during which a very brief peak in the sound wave can pass unnoticed. So, the higher the sampling rate, the smaller the gaps between samples, and consequently more extremely brief events in the sound wave get recorded.
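
The arithmetic behind these points can be sketched in a few lines of Python. The function names are invented for this illustration. `alias_frequency` shows where a tone above the Nyquist frequency would turn up as a false tone if no anti-aliasing filter removed it first.

```python
def nyquist_frequency(sampling_rate_hz):
    """Highest useful frequency: half the sampling rate."""
    return sampling_rate_hz / 2

def sampling_interval_ms(sampling_rate_hz):
    """Gap between two sampling moments, in milliseconds."""
    return 1000 / sampling_rate_hz

def alias_frequency(tone_hz, sampling_rate_hz):
    """Frequency at which an unfiltered tone would appear after sampling."""
    folded = tone_hz % sampling_rate_hz
    return min(folded, sampling_rate_hz - folded)

for rate in (10000, 20000, 40000):
    print(rate, nyquist_frequency(rate), sampling_interval_ms(rate))

# A 7 kHz tone sampled at 10 kHz without filtering turns up as a false 3 kHz tone
print(alias_frequency(7000, 10000))  # 3000
# A 3 kHz tone is below the 5 kHz Nyquist frequency and is recorded correctly
print(alias_frequency(3000, 10000))  # 3000
```

The last two lines show why aliasing is harmful: after sampling, the false 3 kHz tone cannot be told apart from a genuine one.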


1Aii. Amplitude resolution

  • Amplitude resolution (quantization resolution, gain resolution) refers to the number of steps available for the amplitude measurement. This is usually given as the number of bits in the binary form of the number of steps: 8 bit (256 steps, or ±128), 12 bit (4,096 steps, or ±2,048), 16 bit (65,536 steps, or ±32,768), 24 bit (roughly 16 million steps, or ±8 million). This range of bit sizes also represents the progress in A/D converter design over the past 50 years. Your own sound system might offer several bit sizes, or just the best it can do, which is currently 16 bit for most practical situations, although 24 bit is common in professional recording and production. The larger the number of steps, the finer the representation of the original amplitude. This is not a parameter you can set in Praat, although other recording programs might let you select from whatever your sound system is capable of. Praat accepts what comes from your sound system, usually 16 bit.
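
The relation between bits and steps can be checked with a small Python sketch. The function names are invented here, and `quantize` assumes the amplitude has been scaled to lie between -1 and 1, which is a convention chosen for this example rather than something the sound system requires.

```python
def quantization_steps(bits):
    """Number of distinct amplitude values available with this many bits."""
    return 2 ** bits

def quantize(value, bits):
    """Round an amplitude between -1 and 1 to the nearest available step."""
    half = 2 ** (bits - 1)          # e.g. 128 for 8 bit, 32768 for 16 bit
    step = round(value * half)
    return max(-half, min(half - 1, step))  # keep within the signed range

for bits in (8, 12, 16, 24):
    print(bits, quantization_steps(bits))

# The same amplitude lands on a coarser grid at 8 bit than at 16 bit
print(quantize(0.5, 8), quantize(0.5, 16))
```

With 8 bits the amplitude 0.5 is stored as step 64 out of 128, with 16 bits as step 16384 out of 32768: the same value, but measured on a much finer scale.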


2. Mono and stereo sound

  • Stereo sound has directional information from different microphones that point to the left or right and consequently pick up the left aural field or the right aural field respectively. An illusion of space is created when the two fields are kept separate all the way to two loudspeakers, so that the left aural field is heard from the loudspeaker to the left and the right aural field from the loudspeaker to the right. To keep the two fields separate, at least two separate channels are needed, one for each.
  • Mono sound does not have directional information. The microphone picks up sound from anywhere regardless of direction, and the distinction between left and right fields is lost. One channel is all that’s necessary for mono sound, but you can have as many as you like.
  • Praat needs mono Sound objects for analysis, but many sound devices (like some mini recorders or computer sound systems) work only in stereo nowadays, so Praat digitizes both mono and stereo sound. If you attempt to analyse a stereo Sound object, Praat will first quietly convert the stereo object to mono by combining the two channels before performing the analysis. You will not see this happen, and you might even imagine that Praat is analysing the stereo object as it is.
  • The Stereo recorder in Praat acquires both input channels. The recording can be kept as a stereo Sound object or saved as a stereo sound file. Either can subsequently be split into individual mono Sound objects.
  • The Mono recorder in Praat records one input channel. Traditionally, recorders and sound systems were wired to send mono signals to the left channel. Professional microphone amplifiers usually let you plug your mono microphone into any channel. Some computer sound systems send mono signals equally to both channels. If you’re not sure where your mono signals are, make a stereo recording instead and use Praat to extract the channel that has the best signal.
  • Exactly how your own sound system works, and the sound equipment you have connected to it, will be covered by your own computer and sound system manuals. Standard computer sound systems usually have two-channel input and two-channel output, both serving the needs of both mono and stereo signals.
  • For more details see Mono and stereo
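
What it means to combine or extract channels can be sketched in Python on a tiny made-up stereo signal, where each sample is a (left, right) pair. This is an illustration only, not Praat's own code: the function names are invented, and averaging the two channels is an assumption about what "combining" means here.

```python
def extract_channel(stereo_frames, channel):
    """Keep one channel (0 = left, 1 = right) as a mono signal."""
    return [frame[channel] for frame in stereo_frames]

def convert_to_mono(stereo_frames):
    """Combine both channels into one by averaging (an assumption)."""
    return [(left + right) / 2 for left, right in stereo_frames]

def louder_channel(stereo_frames):
    """Index of the channel with the larger peak amplitude."""
    peaks = [
        max(abs(value) for value in extract_channel(stereo_frames, ch))
        for ch in (0, 1)
    ]
    return peaks.index(max(peaks))

# Made-up stereo signal: the left channel carries the stronger signal
stereo = [(0.5, 0.0), (-0.5, 0.25), (0.25, -0.25)]

print(extract_channel(stereo, 0))  # [0.5, -0.5, 0.25]
print(convert_to_mono(stereo))     # [0.25, -0.125, 0.0]
print(louder_channel(stereo))      # 0
```

If a mono microphone was wired to only one channel, averaging halves its amplitude, which is one reason to extract the channel with the best signal instead.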


3. Praat Sound objects

  • Praat refers to sound signals as Sound objects, and distinguishes two kinds: ordinary Sound objects, which are loaded completely into computer memory, and LongSound objects, for sounds that are too long to load completely.
  • The LongSound objects are read little by little from the hard disk, which imposes a limit on what you can do with them in Praat.
  • Praat records both mono and stereo Sound objects.
  • Stereo recordings from the Sound recorder can be kept as a stereo Sound object or saved as a stereo sound file. The two channels can also be separated into individual mono Sound objects later on if you wish.
  • Praat only analyses mono Sound objects. If you have a stereo Sound object for analysis, Praat will temporarily combine the two channels to a mono Sound object. You can also use either stereo channel individually as a mono Sound object for analysis.
  • For more details see Sound objects and LongSound objects
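
The difference between loading a whole sound into memory and reading it little by little can be sketched in Python with the standard `wave` module. This is not how Praat implements LongSound objects. It only shows the general idea on a small made-up 16-bit mono file, and `peak_amplitude_streaming` and the file name are invented for the example.

```python
import os
import struct
import tempfile
import wave

def peak_amplitude_streaming(path, frames_per_chunk=1024):
    """Find the largest absolute sample value, reading a 16-bit mono file in chunks."""
    peak = 0
    with wave.open(path, "rb") as wav_file:
        while True:
            chunk = wav_file.readframes(frames_per_chunk)
            if not chunk:  # end of file reached
                break
            values = struct.unpack("<%dh" % (len(chunk) // 2), chunk)
            peak = max(peak, max(abs(v) for v in values))
    return peak

# Write a small made-up test file: a sawtooth running from -100 to 99
values = [(n % 200) - 100 for n in range(5000)]
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "long_sound.wav")
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 2 bytes = 16 bit
        wav_file.setframerate(10000)
        wav_file.writeframes(struct.pack("<%dh" % len(values), *values))
    peak = peak_amplitude_streaming(path)

print(peak)  # 100
```

Only one chunk is in memory at any time, so the length of the file does not matter. The price is that anything needing the whole signal at once is harder to do.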


4. Computer sound systems

  • Exactly how sound is handled on your own computer depends largely on the particular sound system installed on it, and whatever equipment you happen to have connected to it (tape or digital recorders, amplifiers, microphones and so on). These details will be covered by your own computer and sound system manuals.
  • Computers running MS Windows usually have a sound system built into the motherboard, or on a separate expansion card fitted in the computer cabinet, or as an external box connected to your computer. Professional sound systems are usually external. External sound systems are least prone to pick up electrical interference from surrounding electronics.
  [Figure: A computer sound system on an expansion card that plugs in inside the computer, leaving the silver strip with connector sockets visible outside, round the back. These connector sockets are for 3.5 mm plugs.]
  [Figure: An external sound system, intended for home use. The sockets for stereo line in, mono microphone in and stereo earphones out are on the front panel. Stereo line out is at the rear.]
  • Professional sound systems are often built in a robust metal external cabinet. These usually offer individual connection, selection and control of each channel, for both input and output.
  [Figure: A professional sound system, USB connection, with phantom voltage for the microphones. The front panel has input selectors, level meters with individual controls, and monitor controls.]
  [Figure: The input connector sockets, individual for each channel, are at the side in this example: XLR for microphone input, 6.3 mm for line input, RCA for auxiliary input (i.e. an additional line input).]


5. Storing sound in sound files

  • Sound is ephemeral. If you want to hear it again you will need to keep a copy. Gramophone records, magnetic tapes, audio CDs, DAT tapes and memory cards are examples of sound storage media. On a computer, sound data is stored as sound files on a hard disk drive, organised like an archive in folders and subfolders. Just how you organise your sound archive is up to you. If your work is irreplaceable, you should make further copies on removable media that can be stored elsewhere. If you back up sound data on CDs, use a data CD-ROM format. It’s not a good idea to write an audio CD for backup, because the recovery process back to a sound file is not straightforward. And keep a register of your backed-up files so that you can find your data again.
  • Praat does not save its Sound objects to disk automatically. Unless you decide otherwise, Praat Sound objects will survive only until you remove them, or close the program, or lose them in a power failure. You will be warned if you have unsaved data when you quit. This will be your last opportunity.
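
To show what a sound file contains, here is a minimal Python sketch that stores a short tone as a 16-bit mono WAV file and reads the recording parameters back from it. It uses Python's standard `wave` module rather than Praat. The function names, the file name `tone.wav` and the tone itself are invented for the illustration.

```python
import math
import os
import struct
import tempfile
import wave

def write_mono_wav(path, samples, sampling_rate_hz):
    """Save amplitudes between -1 and 1 as a 16-bit mono WAV file."""
    frames = b"".join(
        struct.pack("<h", max(-32768, min(32767, round(s * 32767))))
        for s in samples
    )
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)              # mono
        wav_file.setsampwidth(2)              # 2 bytes = 16 bit
        wav_file.setframerate(sampling_rate_hz)
        wav_file.writeframes(frames)

def read_wav_info(path):
    """Return (channels, bytes per sample, sampling rate, number of samples)."""
    with wave.open(path, "rb") as wav_file:
        return (wav_file.getnchannels(), wav_file.getsampwidth(),
                wav_file.getframerate(), wav_file.getnframes())

# A tenth of a second of a 440 Hz tone at 22,050 samples a second
sampling_rate_hz = 22050
n_samples = sampling_rate_hz // 10
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sampling_rate_hz)
        for n in range(n_samples)]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "tone.wav")
    write_mono_wav(path, tone, sampling_rate_hz)
    info = read_wav_info(path)

print(info)  # (1, 2, 22050, 2205)
```

The sampling rate, amplitude resolution and number of channels are stored in the file along with the samples, so another program can interpret the numbers correctly when it opens the file.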
© Sidney Wood and SWPhonetics, 1994-2012