Praat for Beginners:
Making spectral slices (LPC)
- Slice location
- Making the LPC slice: LPC analysis of the signal
- Making the LPC slice: Slice creation
- Creating an LPC slice from female speech
- Printing and saving LPC slices
- You should already be familiar with Making spectrograms and What are formants?
- Click here to go directly to a Quick guide to spectral analysis instead.
- Spectral slices show the amplitude/frequency spectrum at a selected moment in the speech signal. They are useful as aids to comparing local spectral events or measuring spectral properties such as formant frequencies, levels or bandwidths.
- Spectral slices are also referred to as cross-sections by some authors.
- Here is an example of an LPC slice, in its own window (vowel [i])
- There are a number of decisions to be taken before making a slice, such as:
- What is your purpose, why are you making a slice, what are you studying?
- Depending on that, where will you locate the slice? Remember that the speech spectrum is changing continuously, so the location is critical.
- Do you need a single moment in time, or a long time average (LTA)?
1A. How Praat makes spectral slices
- Praat offers two analysis methods for spectral slices: Fourier transform (FFT) and linear prediction (LPC).
- FFT is easier to set up but is sensitive to fundamental frequency and so is less successful for higher pitched voices.
- LPC is not sensitive to fundamental frequency but the settings need to be carefully tuned for each speaker.
- This section describes LPC slices.
- There is only one way to make LPC slices in Praat: the more involved route from the Objects window.
1B. Speech examples used as illustrations
- A male and a female speaker are used to illustrate the procedure.
- The first is a Swedish adult male speaker saying finns det dokumentära inslag
(“there are documentary items”):
- The second is a Swedish adult female speaker saying ett forskningsprojekt
(“a research project”):
1C. Getting started
- Start in the Objects window. Load your signal into the Objects list.
- The LPC slice analysis is done on the full frequency range of the Sound object (i.e. up to its Nyquist frequency).
- The frequency range of the examples is 11025Hz.
- You will need to decide how many formants are expected in that frequency range. For men, expect one formant for each kHz of the frequency range; women's formants are around 10% higher, say one for every 1100Hz. That means 11 formants are expected for the male example, and 9 or 10 for the female example.
- If you do not know the frequency range, select the Sound object in the Objects list and click the Info button at the bottom of the Objects window. You will find the sampling frequency there. Remember the Nyquist frequency is half the sampling frequency.
- However, you will need to do another analysis first, to help decide precisely where the slice is to be located. This is done by examining a spectrogram of the signal (see next section), and then doing LPC formant tracking around the likely location. This task will also have to be tuned to the speaker, by setting a suitable number of formants in the desired frequency range of the formant tracking.
- Armed with this information, you will be ready to make your slice.
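- The arithmetic in this section can be sketched in a few lines of Python. This is only an illustration and not something Praat runs: the function names are hypothetical, the 22050Hz sampling frequency is inferred from the stated 11025Hz range, and the spacings of 1000Hz and 1100Hz per formant are the rules of thumb given above.

```python
def nyquist_frequency(sampling_frequency_hz):
    """The Nyquist frequency is half the sampling frequency."""
    return sampling_frequency_hz / 2


def expected_formants(frequency_range_hz, spacing_hz):
    """Rule-of-thumb formant count: one formant per `spacing_hz` of range."""
    return round(frequency_range_hz / spacing_hz)


# A sampling frequency of 22050 Hz gives the 11025 Hz range of the examples.
frequency_range = nyquist_frequency(22050)

male_count = expected_formants(frequency_range, 1000)    # one per kHz
female_count = expected_formants(frequency_range, 1100)  # about 10% higher

print(frequency_range, male_count, female_count)
```

- Treat the result as a starting estimate only. The count still has to be checked against each individual speaker, which is why the female example is quoted as 9 or 10.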
2. Slice location
- The first step is to decide precisely where in the signal you want to make the slice, and the best way to find out is to start by examining the waveform and spectrogram of the signal in the Sound editor.
- The precise location for an LPC slice is not quite so critical as it is for a wideband FFT slice, which is sensitive to its location within a glottal period. Nevertheless, you should still aim for the precision your task requires, and the task will dictate the criteria for locating the slice. For this example, the aim is to get the most typical slice of a vowel, i.e. the part that is least affected by articulations for neighbouring consonants, somewhere between the onglide and the offglide.
- A slice will be taken from the male speech, during the first vowel [i] in [..(f)ins..]. The first step is to zoom in to that sequence.
- Examine a spectrogram of the area of interest:
- To the left, the waveform and spectrogram zoomed in to [..(f)ins..]. To the right the formant tracks overlaid on the spectrogram.
- The spectrogram shows F1 and F2 rising from [f] to [i], then falling towards [ns]. The part where this [i] is least affected by the consonants is apparently where F1 and F2 are highest, just before they turn down again as the speaker moves on towards the next consonant. A mouse click here puts a red line up through the spectrogram and waveform, and writes the time of this location from the beginning (green arrow). Note that this is a rough and ready guess; the next glottal pulse looks like an equally good position.
- For this example, formant tracking was set up for 6 formants within 6.5kHz for this speaker, in order to get satisfactory formant tracks for F1 to F4 on the spectrogram. Selecting a small region around the location just found (left above), and clicking Formant list in the Formant menu (Sound editor), the following table comes up for inspection:
- The info table reports the calculated formant frequencies in successive analysis frames. I’ve coloured the row where F1 was highest (compare it with its neighbours), indicating the moment immediately before the mouth opening began to close down for the next consonant. F2 was highest in the previous frame (the row above), and F3 was highest in the next frame (the row below). Remember the time of this frame (arrowed): this is the location for the slice according to our criteria (where the vowel spectrum is least affected by the neighbouring consonants).
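- The selection rule used here (take the frame where F1 is highest) can be sketched as follows. The listing is hypothetical: the times and frequencies are invented for illustration and are not the values from the table in this tutorial. The function name is also an assumption.

```python
# Hypothetical formant listing: (time in s, F1, F2, F3 in Hz).
# Invented numbers, arranged so that F2 peaks one frame before F1
# and F3 peaks one frame after it, as described in the text.
frames = [
    (0.100, 290.0, 2150.0, 2900.0),
    (0.105, 305.0, 2210.0, 2950.0),
    (0.110, 318.0, 2195.0, 2990.0),
    (0.115, 301.0, 2160.0, 3010.0),
]


def frame_with_highest_f1(rows):
    """Return the row whose F1 (second column) is the largest."""
    return max(rows, key=lambda row: row[1])


slice_time = frame_with_highest_f1(frames)[0]
print(slice_time)
```

- A different research question would call for a different criterion, so the `key` function is the part to change.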
3. Making the LPC slice: LPC analysis of the signal
- The LPC slice is made in several steps, working from the Objects window.
- The first is to do a linear predictive analysis of the whole signal.
- Select the Sound object in the Objects list, and click Analyse spectrum:
- Then look down the list to To LPC …. Several standard methods are offered: autocorrelation, covariance, burg and marple.
- In Praat, the burg procedure allows you to set the frequency range in advance; it is also the method that Praat uses for its Formant tracking procedure. Consistency perhaps suggests using burg for all your LPC in Praat.
- The other methods are preset to analyse the full range up to the Nyquist frequency, but you get a chance later on to zoom your slice to a shorter range.
- For this example, autocorrelation was chosen (for no special reason). The Sound to LPC dialog for that method appears:
- Fill in the analysis parameters:
- Prediction order: the number of filter coefficients. This is a critical setting – the absolute minimum setting is twice the number of formants in the frequency range of the signal (but a better setting is twice the number of formants plus at least 2); in this example 11 formants are expected in the frequency range of 11025Hz, so a good setting here will be at least 22+2, and the number actually entered in this instance was 24. If you are not sure, you can make a series of analyses by adding 2 more coefficients each time, and compare the resulting slices; below the ideal setting you should see a new formant appearing for roughly each additional pair of coefficients; above the ideal setting you will see no further change after adding more coefficients. There is an optimum setting for each individual speaker.
- Analysis width: the amount of data from the signal that is needed for the computation. A good setting includes at least two glottal periods; the default setting 0.025s is fine for most adult male voices and there is no real need to make it smaller for female voices. Leave it as it is.
- Time step: the temporal resolution of the analysis, i.e. the time interval between successive computed analysis frames. The default setting of 0.005s gives 200 frames per second, which is adequate for most work. Leave this as it is unless you need a coarser or finer analysis.
- Pre-emphasis from: the effect is to high-pass filter the signal by +6dB/octave, enhancing energy at higher frequencies; the number entered here is the frequency from which the filter will be applied. This is not a parameter to be played around with, except that it can be turned off by entering a number larger than the Nyquist frequency, e.g. in this instance 12000 or more. Preferably, simply leave the default setting as it is.
- Click OK and the signal is analysed.
- When the computation is complete (it may take a while), an LPC object is placed in the Objects list, with the same name as its Sound object. The name can be edited.
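- The parameter arithmetic described above can be checked with a short Python sketch. The function names are hypothetical; the rules themselves (twice the number of formants plus 2, steps of 2 coefficients when searching, and a 0.005s time step) come from the text.

```python
def prediction_order(n_formants, extra=2):
    """Twice the number of formants, plus a margin of at least 2."""
    return 2 * n_formants + extra


def frames_per_second(time_step_s):
    """Number of analysis frames computed per second of signal."""
    return round(1 / time_step_s)


def trial_orders(start, count):
    """Orders to compare when searching for the ideal setting (+2 each time)."""
    return [start + 2 * i for i in range(count)]


male_order = prediction_order(11)   # 11 formants expected below 11025 Hz
print(male_order, frames_per_second(0.005), trial_orders(22, 4))
```

- Comparing the slices made at each of the trial orders is still a job for the eye: the sketch only generates the candidate settings.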
3. Making the LPC slice: Creating the slice
- In the Objects window, select the LPC object and click To spectrum (slice):
- The LPC to Spectrum dialog opens:
- Enter parameters as necessary:
- Time (seconds): This is the location of the slice in the signal, for this example the number previously obtained from the formant track table
- Minimum frequency resolution (Hz): This is the frequency resolution of the spectrum diagram. The default 20 is usually adequate
- Bandwidth reduction (Hz): This makes formant peaks appear narrower and sharper. It might help make poorly defined formants more obvious, but should normally be left at the default 0
- De-emphasis frequency (Hz): This removes any pre-emphasis that was previously introduced in the settings for the LPC analysis of the signal. This parameter should have the same value as you used for the LPC analysis, and is usually left at 50 in both places
- When you are ready, click OK and a Spectrum object is placed in the Objects list, with the same name as the LPC object. This can be renamed in the Objects window if you wish. To view the slice, select the Spectrum object in the Objects list and click View & Edit. The slice appears in a new window:
- This particular view demonstrates zooming to a frequency range of 0-5000Hz. The original slice, viewing the full range to the Nyquist frequency, is at the top of this page.
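- To see why a pair of filter coefficients can produce one formant peak in a slice, here is a textbook all-pole sketch in Python. It is not Praat's implementation: the function names, the single formant at 2200Hz with a 100Hz bandwidth, and the 20Hz evaluation grid (chosen to echo the default frequency resolution) are all assumptions made for illustration.

```python
import cmath
import math

SAMPLING_FREQUENCY = 22050.0  # Hz, assumed; gives a Nyquist frequency of 11025 Hz


def resonator_coefficients(formant_hz, bandwidth_hz, fs=SAMPLING_FREQUENCY):
    """Coefficients a1, a2 of A(z) = 1 + a1*z^-1 + a2*z^-2 for one formant."""
    r = math.exp(-math.pi * bandwidth_hz / fs)   # pole radius
    theta = 2 * math.pi * formant_hz / fs        # pole angle
    return [-2 * r * math.cos(theta), r * r]


def level_db(coefficients, frequency_hz, fs=SAMPLING_FREQUENCY):
    """Level of the all-pole filter 1/A(z) at one frequency, in dB."""
    z_inv = cmath.exp(-2j * math.pi * frequency_hz / fs)
    a = 1 + sum(c * z_inv ** (k + 1) for k, c in enumerate(coefficients))
    return -20 * math.log10(abs(a))


# One formant needs one pair of coefficients.
coefficients = resonator_coefficients(2200.0, 100.0)

# Evaluate the spectrum every 20 Hz from 0 Hz up to the Nyquist frequency.
frequencies = [20.0 * i for i in range(int(SAMPLING_FREQUENCY / 2 / 20) + 1)]
levels = [level_db(coefficients, f) for f in frequencies]

peak_hz = frequencies[levels.index(max(levels))]
print(peak_hz)
```

- The peak should fall on or next to the 2200Hz grid point. A real LPC analysis estimates all the coefficients from the signal at once; the sketch only shows how the slice is read off from them.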
4. Creating an LPC slice from female speech
- The only special consideration when making LPC slices for female speakers is to remember that their formant frequencies can be 10% or more higher than men’s, i.e. expect one formant every 1100Hz, roughly. The same procedure outlined above is followed, remembering to optimize the LPC analysis to the individual speaker.
- For this example of female speech, there were 9 formants within 11025Hz, so a prediction order of 20 coefficients was found to be adequate for the LPC analysis.
- This example is the final vowel from the sample of female speech, where the voice fundamental reached more than 300Hz. The spectrogram was viewed in the Sound editor to determine the location for the slice, using the formant tracking method described above and with tracks fully optimised for this speaker.
- This vowel was preceded by a voiceless retroflex sibilant, and F2 is seen rising into the vowel, continuing to rise towards the voiceless velar stop that follows. F3 falls continuously towards the velar stop. Neither F2 nor F3 appears to be helpful for locating the slice. However, F1 rises away from the retroflex sibilant and then falls towards the velar stop (red arrow in the next table).
- The green arrows show the frame where F1 is highest.
- The listing also confirms F2 rising continuously and F3 falling continuously.
- Using this time location, the LPC slice is produced:
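- The earlier advice that the analysis window should span at least two glottal periods, and need not be shortened for female voices, can be checked with simple arithmetic. In this Python sketch the 300Hz value comes from the female example, while the 100Hz male value is an assumed figure for comparison and the function name is hypothetical.

```python
def glottal_periods_in_window(f0_hz, window_s=0.025):
    """Number of glottal periods that fit inside the analysis window."""
    return f0_hz * window_s


female_periods = glottal_periods_in_window(300)        # f0 from the female example
assumed_male_periods = glottal_periods_in_window(100)  # assumed male f0

print(female_periods, assumed_male_periods)
```

- Both values are above 2, so the default 0.025s window meets the two-period guideline in either case.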
5. Printing and saving LPC slice images
- LPC slices are printed or saved from the Picture window, where they are transferred from the Objects window. How to do this is described in detail elsewhere, but a necessary first step is to get the slice into a Spectrum object in the Objects list. This was done above.
- For this example, the male [i] will be printed.
- Go to the Picture window and mark out the area that the LPC slice is to occupy. Position the mouse pointer where you want the top left corner, then hold the left mouse button down and drag the pointer to where you want the bottom right corner. Then release the mouse button.
- Then go back to the Objects window, select the Spectrum object, and click Draw:
- The Spectrum draw dialog appears, where you can adjust settings that affect the appearance of the image:
- Frequency range (Hz): Enter the range you want to draw, or leave the default values and see the full range. This example will show 0-5000Hz.
- Minimum power and Maximum power: The end values of the sound pressure scale in dB. Enter the scale values you want to see, or leave the default values (Praat will fix a scale for you). This example sets a tidy rounded-up scale from 30 to 85dB.
- Garnish: Adds a box, scales and labels. Leave unticked to add nothing (the Picture window has tools for adding text and for drawing).
- Then click OK. The slice appears in the selected area in the Picture window.