FFT slices

Praat for Beginners:
Making spectral slices (FFT)

  1. Preliminaries
    1. How Praat makes spectral slices
    2. Speech examples used as illustrations
    3. Getting started
  2. Parameter settings
    1. Wideband and narrowband slices
    2. Windowing function
    3. Dynamic range
  3. Where to make the slice
  4. Creating an FFT slice in the Sound editor
  5. Printing and saving FFT slice images


1. Preliminaries

  • You should already be familiar with Making spectrograms and What are formants?
  • If you prefer, you can go directly to the Quick guide to spectral analysis instead.
  • Spectral slices (or cross-sections) show the amplitude/frequency spectrum at a selected moment in the signal. They are useful as aids to comparing local spectral events, or measuring spectral properties such as formant frequencies, formant levels or formant bandwidths.
  • There are a number of decisions to be taken before making a slice:
    • Why are you making a slice? What are you studying?
    • Depending on that, where will you locate the slice? Remember that the speech spectrum is changing continuously, so the location is critical.
    • Do you need a single moment in time, or a long time average (LTA)?
    • Do you want to see formant peaks or voice harmonics?
  • The procedure begins with making a spectrogram to help you find your way about in the speech signal and find the location for the slice; the spectrogram can be optimized for clarity and to tune it to higher pitched voices. Then place the cursor (or make a selection for LTA) where the slice is to be taken, zooming as necessary for precision. Then make the slice, which will appear in a new window.
  • Here is an example of the FFT slice, in its own window (vowel [i])
 


1A. How Praat makes spectral slices

  • Praat offers two analysis methods: FFT and LPC.
  • FFT is easier to set up but is sensitive to fundamental frequency and so is less successful for higher pitched voices.
  • LPC is not sensitive to fundamental frequency but the settings need to be carefully tuned for each speaker.
  • This section describes FFT slices.
  • Praat offers two routes to FFT slices: a simpler procedure in the Sound editor and a more involved procedure in the Objects window. This Beginner’s manual describes the Sound editor procedure. The initial setup is the same for both procedures – having a signal to work on, understanding the parameter settings, deciding where to make the slice etc.


1B. The speech examples used as illustrations

  • A male and a female speaker are used to illustrate the procedure.
  • The first is a Swedish adult male speaker saying finns det dokumentära inslag
    (“there are documentary items”):
  • The second is a Swedish adult female speaker saying ett forskningsprojekt
    (“a research project”):
 


1C. Getting started

  • Start in the Objects window.
  • Load your signal into the Objects list, select it and click View&Edit. The Sound editor opens, showing the waveform of your signal along with any analyses that were on view the last time you had the Sound editor open (Praat remembers your analysis selections from one session to the next).
  • It is extremely helpful to have a spectrogram on view, to help you find your way about in the speech signal, follow the formant movements, and find the precise location for the spectral slice.
  • If the spectrogram is not already visible, open the Spectrum menu and tick Show spectrogram:
    • Better still, open the View menu and select Show analyses, where you can show the spectrogram and hide any other analyses that happen to be open, all in one place:
 


2. Parameter settings

  • The Window length setting for spectral slices is the same as for the spectrogram. If you change that setting now for the slice, the spectrogram will be similarly modified too.
  • To check the settings, open the Spectrum menu in the Sound editor and select Spectrogram settings. The following dialog appears, showing your current settings. Make any adjustments you need (the parameters are explained as they arise below).
  • To restore the default settings, click Standards.
  • Note that the View range is only for the spectrogram. Set this range to show the detail you want to see there.
  • The frequency range of the slice is always that of the signal being analysed, i.e. up to its Nyquist frequency (half the sampling rate). Once you see the slice in its own Spectrum window, you can modify the visible frequency range by using the zoom settings in the View menu there.
  • Dynamic range in the Spectrogram settings dialog also applies only to the spectrogram, where it hides or shows background noise. The Dynamic range for the slice is set independently, from the View menu in the slice’s own Spectrum window, where it expands or shrinks the dB range of the visible sound pressure scale.


2A. Wideband and narrowband slices

  • Wideband slices show the formant structure at the selected time location, narrowband slices show more detailed spectral structure (which will be the voice harmonics if you are examining a voiced segment).
  • This works the same way as for spectrograms. The filter bandwidth is set at Window length (s) in the Spectrogram settings dialog box. Following customary practice, this setting is given in seconds rather than Hz: it is the time constant of the bandpass filter, not its bandwidth. A shorter window gives a wider filter, and a longer window a narrower one.
  • For wideband slices, a window length of 4ms or 5ms will generally work well (but remember Praat wants it to be entered in the settings box in seconds, i.e. 0.004 or 0.005).
  • The default setting 0.005s is fine for most adult male voices, but might be too long at slice locations where the voice fundamental is relatively high (especially for men with high pitched voices, for most women, or for children). In this situation the voice harmonics might intrude. If this occurs, a slightly smaller time constant (0.004 or 0.003s) might be more suitable. Experiment to find the best setting for each voice and each slice location. Keep notes.
  • For narrowband slices, set Window length to 0.03s, which will be fine for both male and female voices.
  • Here are examples of wideband and narrowband slices from [i]-like vowels by the male and female speaker respectively:
FFT slices from the male speaker (above) and female speaker (below), wideband
(left) and narrowband (right). Both are taken from an [i]. For each spectrum, the
horizontal axis shows frequency 0-5000Hz, and the vertical axis shows sound
pressure (0-60dB). The wideband slice picks out the formants. The female
speaker has fewer formants in this 5000Hz range (due to a shorter vocal tract).
The narrowband slice picks out the harmonics. The harmonics in the female
example are further apart (reflecting the higher fundamental frequency of her
voice) and consequently she has fewer harmonics in each formant.
  • The female example contains an instance of the difficult situation presented by a high fundamental frequency. In the final vowel, the fundamental rose to 326Hz and a wideband slice taken there with the default 0.005s window showed intruding voice harmonics that confuse the identification of the formants:
  • What is happening here is that the harmonics are even further apart now, and the 0.005s filter is picking them out here and there, just like a narrowband analysis would. Clearly, the fundamental frequency is so high in this example that some harmonics look like formants. F1 is definitely ambiguous. F4 looks like three formants, not one. A slightly smaller setting of Window length will sometimes help exclude such intruding harmonics. Reducing the window from the default 0.005s to 0.003s gives the following:
  • This is an improvement, but there are still ambiguities. The first and second harmonics still intrude and make it difficult to identify F1 properly, and the eleventh and twelfth harmonics still intrude and make F4 difficult to identify properly. This situation is a typical illustration of the poorer performance of FFT with a higher pitched voice. Note that an LPC slice is usually less ambiguous when properly tuned to the speaker:
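
The link between Window length and filter bandwidth in this section can be put into numbers. The following is a minimal sketch in Python, not part of Praat. It assumes the Gaussian window and uses the relation given in the Praat manual for that window (-3 dB bandwidth of about 1.298 / Window length). The function names, and the rule of thumb that harmonics start to show once the bandwidth is narrower than the fundamental, are illustrative assumptions.

```python
import math

# -3 dB bandwidth of a Gaussian analysis window, using the relation quoted
# in the Praat manual: 2*sqrt(6*ln 2) / (pi * window_length).
GAUSSIAN_FACTOR = 2 * math.sqrt(6 * math.log(2)) / math.pi  # about 1.298


def gaussian_bandwidth_hz(window_length_s):
    """Approximate -3 dB bandwidth (Hz) for a Window length in seconds."""
    return GAUSSIAN_FACTOR / window_length_s


def harmonics_likely_resolved(window_length_s, f0_hz):
    """Illustrative rule of thumb (an assumption, not a Praat function):
    harmonics are f0_hz apart, so they start to show up individually once
    the filter bandwidth is narrower than that spacing."""
    return gaussian_bandwidth_hz(window_length_s) < f0_hz


for window in (0.005, 0.003, 0.03):
    bw = gaussian_bandwidth_hz(window)
    print(f"Window length {window} s -> bandwidth about {bw:.0f} Hz")

# The high-pitched vowel from the female example (fundamental 326 Hz):
for window in (0.005, 0.003):
    print(window, harmonics_likely_resolved(window, 326))
```

Treat the rule of thumb as a rough guide only: as the 0.003s example above shows, individual harmonics can still intrude even when the filter is somewhat wider than the fundamental.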
 


2B. Windowing function

  • The windowing function is selected from the dropdown menu at Window shape in the Advanced Spectrogram settings dialog. The default, and recommended, window function is Gaussian.


2C. Dynamic range

  • The dynamic range determines how much background noise is allowed to intrude and show up in the spectrogram. This is set at Dynamic range in the Spectrogram settings dialog, the default value being 50dB. Reduce this number in 3dB steps to exclude intruding visible noise energy from the spectrogram. As noted in section 2, the dynamic range of the slice itself is set separately, from the View menu in the slice’s own Spectrum window.
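
The effect of the Dynamic range setting can be illustrated with a few made-up levels. This is a minimal sketch in Python, assuming that levels are measured down from the strongest level present, which simplifies how Praat chooses its reference. The function names and the sample levels are hypothetical.

```python
def visible_floor_db(levels_db, dynamic_range_db=50.0):
    """Lowest level still shown: the maximum minus the dynamic range."""
    return max(levels_db) - dynamic_range_db


def visible_levels(levels_db, dynamic_range_db=50.0):
    """Keep levels at or above the floor; None marks those that drop out."""
    floor = visible_floor_db(levels_db, dynamic_range_db)
    return [lvl if lvl >= floor else None for lvl in levels_db]


# Hypothetical levels (dB): two strong formant regions, then weak noise.
levels = [62.0, 55.0, 30.0, 14.0, 9.0]
print(visible_levels(levels, 50.0))  # floor at 12 dB
print(visible_levels(levels, 44.0))  # floor at 18 dB
```

Lowering the dynamic range raises the floor, so progressively more of the weak background energy drops out of view.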


3. Where to make the FFT slice

  • You select a location for the slice by mouse-clicking in the Sound editor window. Praat then centres the analysis window on this location (the analysis window is what you set at Window length in the Spectrogram settings). This means that even a momentary slice has a duration, that of your set analysis window length. With low pitched male voices (glottal periods of 10ms or more), window lengths of 0.003 to 0.005s will be shorter than 1 glottal period, and a window length of 0.03s will include 2 or 3 glottal periods. Higher pitched voices have shorter periods, so the same window covers more of them.
  • It is good practice to zoom in around the slice location in order to position the cursor more precisely.
  • For now, while you are practising the procedures and parameter settings, it might not matter where you make a slice. But for serious work, the location of the slice will be dictated by your task.
  • Here is an example, to the left a brief portion of a signal showing a sequence of a vowel and two consonants [..ins..] with a small selection during [i], and to the right that same selection zoomed to one glottal pulse, allowing the cursor to be positioned more precisely:
  • Remember also that the waveform, and hence the spectrum, of a speech signal is continuously changing. Your slice will be misleading if you miss the intended location by a few milliseconds, or if your intended location is not properly defined.
  • The energy distribution also changes within a glottal pulse. With low pitched male voices (glottal pulses of 10ms or more), it will be possible to make a number of wideband slices within one glottal pulse and each will be different. The energy you see in voiced parts of a spectrogram (such as to the left in the previous example) comes from the stronger first part of each glottal pulse, hence the typical vertical lines on the spectrograms. The lighter gaps between these vertical lines portray the weaker endings of each glottal pulse.
  • This is illustrated in the next example. The next diagram shows the zoomed waveform of part of the [i]-like vowel, with one period shaded, representing one glottal pulse. The two vertical lines A and B mark the locations of two wideband slices 5ms apart so that they will not overlap. A is in the stronger early part of the pulse, B is in the weaker latter part of the pulse.
  • The next diagram compares the two wideband slices (5ms Gaussian) taken at the positions A and B respectively, illustrating this difference:
  • This demonstrates (i) that the spectrum varies within a glottal pulse, (ii) that wideband slices can pick up that variation, and (iii) that an imprecision of a few milliseconds in slice location is enough to pick up that variation. This emphasizes the need for careful definition of what you are looking for when making a slice, careful definition of a slice location in order to find what you are looking for, and careful selection of parameter settings in order to see what you are looking for.
  • What this comes down to for a momentary wideband slice is: find the glottal pulse that matches your criteria best, then centre the slice over the strongest part of that pulse.
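
The timing relations in this section can be checked with simple arithmetic. The sketch below, in Python, works in milliseconds and treats the Window length setting as the nominal span of the analysis window centred on the cursor. That is an assumption: the physical extent of a tapered window such as the Gaussian may be longer. The function names and the cursor positions are hypothetical.

```python
def glottal_period_ms(f0_hz):
    """Duration of one glottal period in milliseconds."""
    return 1000.0 / f0_hz


def periods_in_window(window_ms, f0_hz):
    """How many glottal periods fit in a window of window_ms milliseconds."""
    return window_ms / glottal_period_ms(f0_hz)


def window_span_ms(centre_ms, window_ms):
    """Nominal start and end of a window centred on the cursor position."""
    half = window_ms / 2
    return (centre_ms - half, centre_ms + half)


def spans_overlap(a, b):
    """True if two (start, end) spans share more than a single end point."""
    return a[0] < b[1] and b[0] < a[1]


for f0 in (100, 326):
    for window_ms in (5, 30):
        n = periods_in_window(window_ms, f0)
        print(f"F0 {f0} Hz, window {window_ms} ms: {n:.2f} glottal periods")

# Two 5 ms slices placed 5 ms apart, as with positions A and B above.
slice_a = window_span_ms(100, 5)
slice_b = window_span_ms(105, 5)
print(spans_overlap(slice_a, slice_b))
```

At a fundamental of 100Hz a 5ms window covers half a glottal period, which is why the choice of position within the pulse matters so much for wideband slices.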


4. Creating an FFT slice in the Sound editor

  • Load your signal into the Objects list in the Objects window
  • View&Edit it in the Sound editor
  • Show the spectrogram, optimize any analysis settings as necessary
  • Position the cursor precisely where the slice is to be taken, zooming as necessary
  • In the Spectrum menu, click View spectral slice:
  • The slice appears in its own window.
  • At the same time a Spectrum object is placed in the Objects list in the Objects window. Its name is the same as the Sound object with the addition of the time point in the signal:
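
To get a feel for what a spectral slice contains, the idea can be imitated outside Praat. The following Python sketch is not Praat's implementation. It synthesizes a crude voiced signal (equal-amplitude harmonics of a hypothetical 100Hz fundamental at a hypothetical 8000Hz sampling rate), applies a generic Gaussian taper whose width is tied to the window length, and evaluates the spectrum at single frequencies. Levels are relative dB with an arbitrary reference, and all names are illustrative.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz, hypothetical
F0 = 100            # Hz, hypothetical fundamental


def synth_voice(duration_s=0.2):
    """Crude stand-in for a voiced sound: equal-amplitude harmonics of F0."""
    n_samples = int(duration_s * SAMPLE_RATE)
    harmonics = range(1, 31)  # 100 Hz ... 3000 Hz
    return [
        sum(math.cos(2 * math.pi * k * F0 * n / SAMPLE_RATE) for k in harmonics)
        for n in range(n_samples)
    ]


def slice_level_db(signal, centre_s, window_length_s, freq_hz):
    """Relative level (dB) at freq_hz of a Gaussian-tapered slice at centre_s."""
    # The taper is evaluated out to one window length either side of the centre.
    first = round((centre_s - window_length_s) * SAMPLE_RATE)
    last = round((centre_s + window_length_s) * SAMPLE_RATE)
    total = 0j
    for n in range(max(first, 0), min(last, len(signal) - 1) + 1):
        t = n / SAMPLE_RATE - centre_s
        taper = math.exp(-12 * (t / window_length_s) ** 2)
        total += signal[n] * taper * cmath.exp(-2j * math.pi * freq_hz * t)
    return 20 * math.log10(abs(total) + 1e-12)


voice = synth_voice()
for window in (0.005, 0.03):
    on_harmonic = slice_level_db(voice, 0.1, window, 1000)  # 10th harmonic
    between = slice_level_db(voice, 0.1, window, 1050)      # between harmonics
    print(f"window {window} s: harmonic minus gap = {on_harmonic - between:.1f} dB")
```

With the longer window the level on a harmonic stands well above the level between harmonics (a narrowband slice), while with the shorter window the two are nearly equal and only the broad envelope remains (a wideband slice).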
 


5. Printing and saving FFT slice images

  • FFT slice images are printed and saved from the Picture window
  • There is no draw slice command in the Sound editor or the spectral slice’s own window. The image is drawn to the Picture window from the Objects window.
  • The procedure is described in detail elsewhere. This section shows a quick example.
  • Recall that the View spectral slice command also placed a correspondingly named Spectrum object in the Objects list.
  • Start by marking out the area that the FFT slice is to occupy in the Picture window. Position the mouse pointer where you want the top left corner, then hold the left mouse button down and drag the pointer to where you want the bottom right corner. Then release the mouse button. Or, just accept the area that Praat has already marked out.
  • Then go back to the Objects window, select the Spectrum object, and click the Draw button:
  • The Draw spectrum dialog appears, where you can adjust settings that affect the appearance of the spectrum image. Click OK and the slice appears in the selected area in the Picture window:
  • The commands for printing, or saving image files, are in the File menu.
©Sidney Wood and SWPhonetics, 1994-2012