Exercise 10: Phase Vocoding in SPEAR

A phase vocoder lets you scale duration without affecting pitch, and vice versa. We will use the program SPEAR to do this.

Goals

We’re learning how to...

  • open a sound file in SPEAR,
  • use its editing tools to modify the phase vocoder data, and
  • export the changed data as a new sound file.

How to Do This Exercise

  1. Make two relatively short (< 20 seconds) sound files. One should have a strong, clear pulse; the other should have no clear pulse but a clear pitch. We will transform these files in SPEAR.
  2. Download SPEAR, and install it on your computer.
  3. For each of your two sound files, open it in SPEAR by dragging the file onto the SPEAR icon, or by using SPEAR’s Sound > Analyze menu command. Just accept the default analysis parameters by pressing the Analyze button. This performs a tracking phase vocoder analysis on the file and displays the results as a collection of sine waves.
  4. Play the analysis data by pressing the space bar. Does it sound different from the original sound file? If so, how?
  5. Use the tool bar at the left to modify the analysis data, focusing on the time-scaling capabilities.

    The top six tools are for scrolling the analysis data and selecting portions of it. The rest of the tools are mostly for modifying the selected data.

    See the Suggestions section below for modification ideas.

  6. If you save your analysis using File > Save, you will save an SDIF file, not a sound file. SDIF stands for Sound Description Interchange Format, a file format used in computer music research to store spectral analyses, pitch tracking analyses, etc.

    It’s good to save as an SDIF file, so that you can resume editing later, and so that you have a record of what you were doing to the sound file. But you do not need to submit that for this exercise.

  7. We also want to create a sound file that contains what you hear when you use the play button in SPEAR. Export this file using Sound > Synthesize to File.

  8. Write down or remember what you did to transform the sounds! Be ready to share this in tutorial.
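
The analysis in step 3 is built on windowed Fourier transforms, so the length of each analysis frame matters. As a rough sketch of the arithmetic only, the snippet below relates FFT size to the spacing between frequency bins and to the time span of one frame. The 44100 Hz sample rate and the function name stft_resolution are assumptions for illustration; SPEAR's own mapping from its analysis settings to a window and FFT size is not modeled here.

```python
def stft_resolution(fft_size, sample_rate=44100):
    """Return (bin_spacing_hz, frame_duration_s) for one FFT frame.

    sample_rate defaults to 44100 Hz, which is an assumption; use the
    rate of your own sound file.
    """
    bin_spacing_hz = sample_rate / fft_size    # distance between FFT bins
    frame_duration_s = fft_size / sample_rate  # time covered by one frame
    return bin_spacing_hz, frame_duration_s


# Larger frames give finer bin spacing but cover more time per frame.
for n in (1024, 4096, 16384):
    spacing, duration = stft_resolution(n)
    print(f"N={n:5d}  bin spacing={spacing:6.2f} Hz  "
          f"frame={duration * 1000:6.1f} ms")
```

The two numbers always multiply to 1: halving the bin spacing doubles the frame length. A 16384-sample frame at 44.1 kHz spans roughly 0.37 seconds, which is long compared with a drum attack.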

Suggestions

  • If you want to start playing back from somewhere other than the beginning, click where you want to start using the I-beam (text-editing) cursor. Then play using the space bar.
  • Try some different values for Frequency Resolution when analyzing a sound file. The default is 40 Hz, which uses an FFT size of 16384, large enough to smear attack transients noticeably. Try a larger value in Hz, which makes the frequency resolution coarser. Remember that coarser frequency resolution necessarily brings finer time resolution, because of the time-frequency tradeoff inherent in the Fourier Transform. This should make the transients in your more rhythmic sound file sound clearer.
  • Select portions of the data, and play just those by holding down the shift key before and during pressing the space bar.
  • The tool with the large horizontal arrows lets you scrub the data while playing. Start playback, then click with the scrub tool to take control of the playback cursor. (You have to start playing before this tool will do anything.)
  • Make a selection, and use the Time Scale tool (just above the pencil) to stretch partials. Notice that this is not the kind of time-scaling we’re used to: the start point of each partial track stays put, while the rest of the track stretches or shrinks.

    To try the more familiar kind of time-scaling, hold down the option (alt) key before clicking and dragging with the Time Scale tool.

  • The vertical-arrow tools let you change pitch without affecting duration. The Frequency Shift tool adds a constant offset (in Hz) to each selected partial breakpoint frequency. The Frequency Transpose tool scales the frequencies, preserving the harmonic structure of a sound. There is no formant correction, so you will hear chipmunks and monsters.
  • Another fun thing to do is to use the Edit > Select Partials Below Amplitude command, and play back just the quiet partials. Then use Edit > Invert Selection, followed by delete, leaving only the quiet partials. You can scale their amplitudes up using Transform > Change Amplitude. But be careful: there is no clipping indicator, so you have to use your ears to detect clipping!
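
To see why the Frequency Shift and Frequency Transpose tools described above sound so different, it can help to apply both operations to a small set of harmonic partials. This is only a sketch of the underlying arithmetic, not SPEAR's code. The function names are hypothetical, and expressing the transposition in semitones is an assumption made for illustration; the point is that shifting adds a constant while transposing multiplies by a constant.

```python
def frequency_shift(freqs_hz, offset_hz):
    """Add the same offset in Hz to every partial frequency."""
    return [f + offset_hz for f in freqs_hz]


def frequency_transpose(freqs_hz, semitones):
    """Multiply every partial frequency by the same ratio."""
    ratio = 2 ** (semitones / 12)  # equal-tempered interval as a ratio
    return [f * ratio for f in freqs_hz]


def ratios_to_fundamental(freqs_hz):
    """Express each frequency as a multiple of the lowest partial."""
    return [f / freqs_hz[0] for f in freqs_hz]


# Four harmonics of a 220 Hz tone: 220, 440, 660, 880 Hz.
harmonic = [220.0 * k for k in range(1, 5)]

shifted = frequency_shift(harmonic, 100.0)
transposed = frequency_transpose(harmonic, 7)

print("original  ", ratios_to_fundamental(harmonic))
print("shifted   ", ratios_to_fundamental(shifted))
print("transposed", ratios_to_fundamental(transposed))
```

After transposing, the partials are still whole-number multiples of the lowest one, so the sound stays harmonic. After shifting, they no longer are, which is why shifted sounds tend toward a bell-like or inharmonic quality.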

You will find that the resynthesized sounds generally have a more “metallic” quality than the originals. This is partly because the Fourier Transform must represent the noisy parts of the sound (attacks, breath noise, etc.) using only sine waves, as if they were periodic components of the sound. There are several ways to improve the representation of noise in the phase vocoder, but these are beyond the scope of this exercise.
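
As a minimal sketch of what "using only sine waves" means at resynthesis time, the snippet below builds a sound by adding one sine wave per partial. It is a deliberate simplification: each partial here has a fixed frequency and amplitude, whereas the partial tracks you see in SPEAR change over time. The function name synthesize and the 44100 Hz sample rate are assumptions for illustration.

```python
import math


def synthesize(partials, duration_s, sample_rate=44100):
    """Sum one sine wave per (frequency_hz, amplitude) pair.

    Returns a list of float samples. Frequencies and amplitudes are
    held constant, which is a simplification of real partial tracks.
    """
    n = round(duration_s * sample_rate)
    out = [0.0] * n
    for freq_hz, amp in partials:
        step = 2 * math.pi * freq_hz / sample_rate  # phase advance per sample
        for i in range(n):
            out[i] += amp * math.sin(step * i)
    return out


# Three steady partials; nothing in this model can produce true noise.
tone = synthesize([(220.0, 0.5), (440.0, 0.25), (660.0, 0.125)],
                  duration_s=0.05)
print(len(tone), "samples, peak", max(abs(s) for s in tone))
```

A breathy attack has no steady frequencies to assign to such partials, so an analysis has to approximate it with many short, closely spaced sine waves, and those are what you hear as the metallic coloring.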

Also keep in mind that the files SPEAR produces are mono. To get stereo, you must process the two channels of a stereo file separately and put them together again in a sound editor or DAW. Or just process the mono output file with delay, reverb, or another method that decorrelates the two channels.
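
The delay approach can be sketched in a few lines. The example below works on a plain list of samples and delays one channel relative to the other. The function name and the choice of delay length are assumptions, and a bare delay is only a crude way to make the two channels differ; in practice you would do this in your sound editor or DAW.

```python
def mono_to_stereo_delay(mono, delay_samples):
    """Return (left, right) frames with the right channel delayed.

    Both channels are padded with silence so they stay the same length.
    """
    silence = [0.0] * delay_samples
    left = list(mono) + silence    # original, padded at the end
    right = silence + list(mono)   # same signal, starting later
    return list(zip(left, right))


# A tiny made-up signal, delayed by two samples in the right channel.
frames = mono_to_stereo_delay([1.0, 0.5, -0.5], delay_samples=2)
for left_sample, right_sample in frames:
    print(left_sample, right_sample)
```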

Submission

  • Be sure you have satisfied the criteria listed above.
  • Make a folder and place in it your two original sound files and your two modified sound files, all clearly named. Zip this folder, and submit it in Canvas.

Grading Criteria

This exercise is graded pass/fail. You must submit the exercise by Thursday midnight to be eligible for a pass.