Chapter Five: Digital Audio
4. Nyquist Theorem and Aliasing
In 1928 a Swedish-born researcher for AT&T named Harry Nyquist published a paper entitled "Certain Topics in Telegraph Transmission Theory." In it, he presented a method for converting analog waveforms into digital signals for more accurate transmission over phone lines. If an analog signal were band-limited (i.e., had no frequencies higher than a specific band), it could be captured and transmitted in digital values and then recreated in an analog form on the receiving end. He presented the concept of sampling amplitudes at a specific rate, as described on the previous page. Most importantly, he determined that the sampling rate would need to be at least twice the highest frequency to be reproduced. In 1948, Claude Shannon provided a mathematical proof of Nyquist's theory, entitling us to now call it the Nyquist Theorem. For those interested in the mathematics, a copy of Shannon's proof can be found here.
According to this Theorem, the highest reproducible frequency of a digital system will be less than one-half the sampling rate. From the opposite point of view, the sampling rate must be greater than twice the highest frequency we wish to reproduce*. One-half the sampling rate is often called the Nyquist frequency. A hypothetical system sampling a waveform at 20,000 cps cannot reproduce frequencies above 10,000 cps. It is important to note that this means ALL frequencies, including higher partials of lower tones. Additionally, nasty things happen when a sampled frequency is exactly at the Nyquist frequency: depending on where the samples fall within the waveform's cycle, the captured amplitude can be anywhere from full strength down to zero. A frequency at exactly this limit is called the critical frequency.
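The behavior at the critical frequency can be checked numerically. The following is a minimal sketch, assuming a unit-amplitude sine wave and the hypothetical 20,000 cps system from the text; the function name `sample_sine` and its parameters are illustrative only and do not come from the original.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, num_samples, phase_rad=0.0):
    """Return num_samples of a unit-amplitude sine sampled at sample_rate_hz."""
    return [
        math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz + phase_rad)
        for n in range(num_samples)
    ]

sample_rate = 20_000           # cps, the hypothetical system from the text
nyquist = sample_rate / 2      # 10,000 cps

# A sine exactly at the Nyquist frequency gets only two samples per cycle.
# Starting at zero phase, every sample lands on a zero crossing.
zero_phase = sample_sine(nyquist, sample_rate, 8)

# Shifted by a quarter cycle, every sample lands on a peak instead.
quarter_phase = sample_sine(nyquist, sample_rate, 8, phase_rad=math.pi / 2)

peak_zero = max(abs(s) for s in zero_phase)        # effectively 0
peak_quarter = max(abs(s) for s in quarter_phase)  # effectively 1

print(f"Nyquist frequency: {nyquist:.0f} cps")
print(f"Peak with zero phase:    {peak_zero:.6f}")
print(f"Peak with quarter phase: {peak_quarter:.6f}")
```

The same input frequency produces silence in one case and full amplitude in the other, which is why a tone at exactly one-half the sampling rate cannot be captured reliably.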
What happens when a frequency above the Nyquist frequency is sampled and played back? Do these frequencies simply disappear? Unfortunately not. Frequencies above the Nyquist frequency cause aliasing (also called foldover). The spurious frequencies they produce are predictable: for an input between the Nyquist frequency and the sampling rate, the alias is mirrored the same distance below the Nyquist frequency as the original was above it, at the original amplitude.
With a sampling rate of 20,000 cps (therefore a Nyquist frequency of 10,000 cps), a frequency of 12,000 cps will alias at 8,000 cps, 2,000 cps below the 10,000 cps Nyquist frequency. In the early days of lower sampling rates (not all that long ago), one could hope these aliased frequencies were weak and would mirror back onto other partials, making them less noticeable. In this example, that might be the case if the fundamental were 500, 1,000, 2,000 or 4,000 cps, but not if it were 750, 1,500, 3,000 or 6,000 cps (review partials in Chapter One if necessary), whose spectra would not include the aliased tone.
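The mirroring arithmetic in this example can be written out as a short sketch. The function names `alias_frequency` and `has_partial_at` are hypothetical, and the partial check assumes a harmonic tone whose partials are whole-number multiples of the fundamental.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency (between 0 and the Nyquist frequency) that results from sampling freq_hz."""
    folded = freq_hz % sample_rate_hz             # the pattern repeats every sample_rate_hz
    return min(folded, sample_rate_hz - folded)   # mirror around the Nyquist frequency

def has_partial_at(fundamental_hz, target_hz):
    """True if target_hz is a whole-number multiple of fundamental_hz (harmonic assumption)."""
    return target_hz % fundamental_hz == 0

sample_rate = 20_000
alias = alias_frequency(12_000, sample_rate)

# Fundamentals whose harmonic series already contains the aliased frequency
masked = [f for f in (500, 1000, 2000, 4000) if has_partial_at(f, alias)]

# Fundamentals whose harmonic series does not contain it
exposed = [f for f in (750, 1500, 3000, 6000) if not has_partial_at(f, alias)]

print(f"12,000 cps aliases to {alias} cps")
print(f"Alias coincides with a partial of: {masked}")
print(f"Alias falls between partials of:   {exposed}")
```

Every fundamental in the first group has 8,000 cps in its harmonic series, so the alias lands on an existing partial. None of the fundamentals in the second group do, so the alias would stand out as a foreign tone.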
Click here to hear the aliasing of a simple glissando from 440 Hz to 44,100 Hz at a sampling rate of 44.1 kHz
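For readers who want to generate a similar sound themselves, here is a rough sketch using only the Python standard library. It is not the method used to produce the recording above: the linear sweep shape, the two-second duration, the 0.8 output level, and the function names are all assumptions made for illustration.

```python
import io
import math
import struct
import wave

def render_glissando(start_hz, end_hz, seconds, sample_rate_hz):
    """Sine sweep rendered with no band-limiting, so frequencies above Nyquist alias."""
    total = int(seconds * sample_rate_hz)
    phase = 0.0
    samples = []
    for n in range(total):
        # Assumed linear sweep from start_hz toward end_hz
        freq = start_hz + (end_hz - start_hz) * n / total
        phase += 2.0 * math.pi * freq / sample_rate_hz
        samples.append(math.sin(phase))
    return samples

def to_wav_bytes(samples, sample_rate_hz, level=0.8):
    """Pack samples in the range -1..1 as 16-bit mono WAV data held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)              # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate_hz)
        frames = b"".join(
            struct.pack("<h", int(s * level * 32767)) for s in samples
        )
        wav_file.writeframes(frames)
    return buffer.getvalue()

sample_rate = 44_100
samples = render_glissando(440, 44_100, seconds=2.0, sample_rate_hz=sample_rate)
wav_bytes = to_wav_bytes(samples, sample_rate)

# To listen, write the bytes to a file (the filename is arbitrary):
# with open("glissando.wav", "wb") as f:
#     f.write(wav_bytes)
```

On playback, the pitch should rise until the sweep reaches 22,050 Hz and then fall back toward 0 Hz as the intended frequency continues up to 44,100 Hz.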
The implication of the information above is that the sampling rate determines the upper limit of the system's frequency response.
Experiment: You can recreate a visual equivalent of aliasing in the following manner. Find a bar with a ceiling fan. After several beers or an equally attractive non-alcoholic beverage, rapidly blink your eyes while staring up at the fan. By altering the rate at which you blink, you should be able to create a false image of the fan blades moving more slowly or even moving backwards. A simpler method would be to watch an old Western, where the "sampling rate" of the film's frame rate creates an illusion of wagon wheels turning backwards.
* The theory, even as expressed here, is not exactly what Nyquist said. The accurate description is that the sampling frequency must be at least twice the bandwidth of the input signal. In audio, we normally include 0 Hz in the frequency band, making it a baseband signal. For our purposes, the optimal audio bandwidth we wish to recreate is 0-20,000 Hz, so we may say that a sampling rate above 40,000 Hz will not cause aliasing for signals limited to that band.
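The arithmetic in this note can be written compactly. The symbols below are assumed for illustration and do not appear in the original: f_s is the sampling rate, B is the bandwidth, and f_min and f_max are the lowest and highest frequencies of the band.

```latex
\[
f_s \ge 2B, \qquad B = f_{\max} - f_{\min}
\]
\[
f_{\min} = 0\ \text{Hz},\quad f_{\max} = 20{,}000\ \text{Hz}
\;\Rightarrow\; B = 20{,}000\ \text{Hz}
\;\Rightarrow\; f_s \ge 40{,}000\ \text{Hz}
\]
```

Choosing a rate strictly above 40,000 Hz, rather than exactly equal to it, keeps a 20,000 Hz component away from the critical frequency.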