Time-frequency analysis with the continuous wavelet transform
W. Christopher Lang and Kyle Forinash Natural Sciences Division, Indiana University Southeast, New Albany, Indiana 47150
(Received 20 June 1997; accepted 6 February 1998)
The continuous wavelet transform can be used to produce spectrograms which show the frequency
content of sounds (or other signals) as a function of time in a manner analogous to sheet music.
While this technique is commonly used in the engineering community for signal analysis, the
physics community has, in our opinion, remained relatively unaware of this development. Indeed,
some find the very notion of frequency as a function of time troublesome. Here spectrograms will
be displayed for familiar sounds whose pitches change with time, demonstrating the usefulness of
the continuous wavelet transform. ? 1998 American Association of Physics Teachers.
? Imagine the following engineering problem: Develop soft- - i ω ( Tf ( ω , t )= g ( ( - t ) f ( ( )e d ( .Jware which from recorded music will produce correct sheet - ?music notation for that music. Thus if the note ‘‘C’’ is heard
(a pitch of 262 Hz) for one-fourth of a second, followed by \ Tf ( ω , t ) \ for each ( t , ω ) (as a density plotting) Plotting
the note ‘‘A’’ (440 Hz) for another quarter-second, the soft- would produce a spectrogram which would show (at least
ware would produce a plotting indicating 262 Hz for 0 三 t roughly) the frequency content of the signal as a function of
time. It could actually produce recognizable ‘‘music nota- 三 0.25 and 440 Hz for 0.25三 t 三 0.5. We may assume the
tion’’ for recorded music. This transform is known as the input music would be presented in numerical form. (Thus, Gabor transform, after Dennis Gabor, who introduced it in for example, a monaural sound sampled at 9000 Hz might be the 1940s. given as 9000 eight-bit numbers per second.) How then The Gabor transform has a subtle limitation which the would our software accomplish this task?
continuous wavelet transform will be introduced to address. The most basic technique for determining the frequency
The limitation is this: The ‘‘width’’ of the window function distribution of a signal f ( t ) is the Fourier transform. This
ˆg is constant. A narrow window (h small) will localize is given by the familiar integral transformf ( ω )? it ω-higher pitches both in frequency and time nicely, but lower = J f ( t ) edt . If we wish to determine what pitches or - ?(h pitches will be ‘‘blurry’’ in frequency. A wider window frequencies were audible during the time interval 0.25三 t larger) will determine lower pitches (bass notes, say) better, 三 0.5, we could perhaps compute the Fourier transform of f but the higher pitches will be ‘‘blurry’’ in time. See Fig. 1(a) restricted to that time interval. That is, we could simply com- and (b) for plottings of an artificial signal which demonstrate pute that integral from t 0.25 to t 0.5 instead of t ?to ===-this effect. ?. This will work after a fashion (although we cannot expect In the early 1980s, Morlet and Grossman modified the the Fourier transform to display too narrow a peak at ωequal Gabor transform to produce the continuous wavelet trans- to 440 Hz, since the integral is not computed over very many form. The idea is this: Change the width of the window cycles). More generally, we could select a value h (repre- function according to the pitch of the note being considered. senting perhaps one-half of the duration of the shortest notes The new transform could be given by of the music) and compute for each t and ωthe integral ? - i ω ( Wf ( t , ω )= f ( ( )g (( ( - t )ω )e d ( .J t 十 h - ?- i ( ω f ( ( )e d ( . T f t , ω )= ( Jh t - h Notice the insertion of the factor ωinto the window function.
Actually, the continuous wavelet transform is more properly So T f ( t , ω ) would represent in some sense the energy of the given as h signal at frequency ω in the neighborhood of time t . This ? transform is known as the ‘‘short-time Fourier transform,’’ f ( ( )w (( ( - t )ω )ω d ( , (1)Wf ( t , ω )=J - ?and it has been the traditional technique in signal analysis for tracking pitches or frequencies as they change over time. where now the ‘‘wavelet’’ function w combines the complex There is, however, an important limitation with the exponential and the Gaussian; w is typically given as short-time Fourier transform: In restricting the integral to the interval t - h to t 十 h , we are in effect chopping the 1 2 2it - t /2h signal sharply at those times. This has the effect of intro- w ( t )= e e . (2)Jh 2 τ ducing lots of high-frequency ‘‘information’’ into the trans- form; we get a Fourier transform that is more spread-out than This is called a wavelet function because it is oscillatory but necessary. We can mitigate this problem to some extent by has the limited support provided by the window function g . introducing a smooth ‘‘window function’’ g . We could ar- [The extra factor of ω in integral (1) keeps the ‘‘mass’’ range g to be nonzero on the interval [ - h , h J but be (close J \ w ( t ω ) ω\ dt of the wavelet constant as it is dilated, as ω to) zero outside that interval. The Gaussian function g ( t ) increases; otherwise the spectrogram will ‘‘fade out’’ at 2 2 - t /2h J= (1/h 2 τ ) e would serve the purpose. We would higher values of ω.J The parameter h can be adjusted to tune then for each t and ωcompute the transform to be more sensitive to frequency or more sen-
Am. J. Phys. 66 (9), September 1998 ? 1998 American Association of Physics Teachers 794 794
Fig. 2. The spectrogram of the sound of two tuning forks, one producing a 256-Hz signal, the other producing a 384-Hz signal; the second tuning fork was pulled away from the microphone after about 0.35 s.
cies of 10 and 13 Hz, respectively. Figure 1(a) shows a short-
time Fourier (Gabor) transform with a relatively narrow win-
dow width; time is well-localized but the two lower
frequency tones are not resolved. Figure 1(b) shows the same
type of transform but with a wider window; the two low
frequencies are now resolved but now the interruption in the
higher-frequency term is not resolved. [The values of h used
in Fig. 1(a) and (b) are, respectively, 0.05 and 0.3 s.J Figure
1(c) shows a continuous wavelet transform where both time
and frequency are well-localized. Note the vertical bars on
the ends of the notes as they appear in the spectrograms
reflect the sharp cut-off and cut-on of the tones—sharp edges
in a signal imply higher-frequency content.
Figures 2 – 4 show several sound samples, each taken at
9000 samples per second for 0.8 s using Vernier’s ULI A/D 1 board and microphone.Figure 2 shows the spectrogram of
the sound of two tuning forks, one producing the note ‘‘C’’
or do (256 Hz), the other producing the note ‘‘G’’ or so (384 Hz); the second tuning fork was pulled away from the mi- Fig. 1. Spectrograms of an artificial sinusoidal signal consisting of an inter- crophone after about 0.35 s. Figure 3 shows the spectrogram rupted 80-Hz pure tone superimposed over pure tones of 10 and 13 Hz. (a) of a whistle that is rising in pitch. The breaks in the plotting shows a short-time Fourier (Gabor) transform spectrogram with a narrow are caused by changes in volume of the whistle, perhaps due window ( h = 0.05 s) ; (b) shows the same type of spectrogram with a wide to the process of reshaping the mouth during the whistle. window ( h = 0.3 s) ; (c) shows a continuous wavelet transform spectrogram Figure 4 shows the spectrogram of one of the authors singing which resolves both time and frequency well. ? the notes do remi. There is a fundamental limitation on how well frequency and time can both be determined using this method. This sitive to time; h plays the same role as it does in the short- amounts to a type of uncertainty principle inherent in both time Fourier transform described above. the Fourier transform and the wavelet transform. This theo- See Fig. 1(c) for a spectrogram made using this transform rem says that a signal and its Fourier transform cannot both which shows good frequency and time localization for the have small support. More precisely, for a function t\ ( t ) with artificial signal shown in Fig. 1(a) and (b). Often, the scalo- gram is plotted with the frequency axis logarithmic, so that equal octaves in frequency occupy the same vertical width in the scalogram (an octave is an interval of frequency from a
given frequency to twice that frequency, just as in music).
Here, we provide the spectrograms of several example sig-
nals using the above transform to give an idea of the useful-
ness of the method. These were computed using C十十by the
authors; each pixel of the image represents a value of
Wf ( t , ω ) for a certain ( t , ω ) ; the value of that integral was
computed as a simple inner product. (The authors will pro-
vide pseudocode to any interested reader.)
Figure 1 shows spectrograms of an artificial signal. The
signal consists of several tones; one tone is an 80-Hz signal
which lasts from 1 to 4 s with a brief 0.2 s interruption at time 2 s. The other tones last from 1 to 7 s and have frequen- Fig. 3. The spectrogram of the sound of a whistle of rising pitch.
795 Am. J. Phys., Vol. 66, No. 9, September 1998 W. C. Lang and K. Forinash 795
? 2[this serves as a Plan- (3) 1/2τ J Vf ( t , ω ) d ω =\ f ( t ) \ - ?cheral formula; the idea is that at each t , Vf ( t , ω ) is an
instantaneous Fourier transformJ. ? 2 ˆ(4) J Vf ( t , ω ) dt =\ f ( ω ) \ .- ? 2 (5) 1/2τ J Vf ( t , ω ) W ( t , ω ) dtd ω =\ J f ( t ) g *( t ) dt \ (this is g Moyal’s formula, a sort of Parseval formula).
It turns out that the Wigner distribution is given by ? i ω ( -* Vf ( t , ω )= f ( t 十 ( /2) f ( t - ( /2)e d ( ;J - ? ? Fig. 4. The spectrogram of one of the authors singing the notes do remi. the properties above are easy to verify. [For example, the Harmonics corresponding to the different pitches can be clearly seen. ˆ integral in (3) follows from the fact that 1/2τ h( ω ) d ω J = h (0), where h is taken to be the function h ( ( ) = f ( t
十 ( /2) f *( t - ( /2) . J 2 norm 1 ( \ t\ ( t ) \ dt 1 ) , define the center of the function to J=But the Wigner distribution has limitations for use in ana- 2 lyzing signals. These include expense [knowledge of the en- be the number t = J t \ t\ ( t ) \ dt , and define the width of the 'tire signal is required to compute Vf ( t , ω ) for each t , and function to be there is no ‘‘fast’’ algorithm availableJ, and the fact that a 1/2spectrogram based on the Wigner distribution [a plotting of 2 2 Vf ( t , ω ) over the ( t , ω ) planeJ will show ‘‘interference’’ ar- A = ( t - t )\ t\( t )\ dt . '( J ) t\ tifacts [the sum of two Gaussian ‘‘notes’’ as in (2) above will result in a Wigner distribution spectrogram which shows 2 (The center of t\ is the expected value of \ t\\ in the sense of ‘‘noise’’ in regions where there should be noneJ. probability theory, and the width of t\ is the variation of It should be noted that other transforms or wavelets be- 2 ˆ \ t\\ . ) The uncertainty principle states A A 注 1/2. The prod- sides (1) or (2) may be chosen. Indeed, the discrete wavelet t\ t\uct in this inequality reaches its minimum value (of 1/2) transform is widely used because of its algorithmic proper-
exactly when the signal is a Gaussian. This explains why the ties.
In discrete wavelet transforms, only specific frequencies Morlet Gaussian wavelet works well at time and frequency
are considered; typically, a wavelet t\ is chosen and instead localization. (See Ref. 2 for an elementary proof and discus-
of considering t\ (( ( t ) ω ) for all ( t , ω ) in a certain rect- -sion of the uncertainty principle in the context of wavelets; - j angle, the discrete set of wavelets t\ ( t ) = t\ (2 t - k ) is Ref. 3 is a general mathematical reference on continuous jk used (here j and k are integers). (So only the angular fre- wavelet transforms.) j -In quantum mechanics, the probability density for the po- quencies 2 are dealt with.) It is possible to choose t\ such 2 2 sition of a particle is given by \ t\ ( r, t ) \ , where the position that the resulting set forms an orthogonal basis of L ( R), wave function t\ ( r, t ) is a function of position r and time t . and also such that the wavelet is nicely limited in time and in The momentum probability density is then given by frequency. Then a function f can be expressed as a ‘‘wavelet 2 \ ( ρ , t ) \ where the momentum waveform ( ρ , t ) is the *fIfIseries’’ f = 'i c t\ , where c = J f t\ dx , much like a jk jk jk jk jk Fourier transform (over the space variable r) of the position Fourier series. For certain wavelets t\, there are very simple, waveform. The uncertainty principle inherent in the Fourier fast algorithms for computing the wavelet coefficients c . jk transform then becomes the familiar statement that the mo- These algorithms rely on the fact that each coefficient is mentum and position of a particle cannot both be determined always a linear combination of other coefficients, so integrals to arbitrary precision. (See Ref. 4.) do not have to be computed for every coefficient; a wavelet The question of quantum uncertainty was treated by t\ with smaller support will result in simpler relations among Wigner and led eventually in the 1940s to the Ville – Wigner the coefficients, which makes algorithms simpler and faster. distribution of time and frequency (Ref. 5 or Ref. 6), an early The speed of these algorithms is a primary reason that attempt to express frequency as a function of time. The dis- wavelets have proven so valuable for applications. One key tribution Vf ( t , ω ) for a signal f ( t ) is intended to have the application is image compression. The coefficients of a Fou- following properties. rier series of a digital signal can be used to reconstruct the original signal. Since most of the coefficients are typically (1) If Vf ( t , ω ) is the Wigner distribution for a signal f ( t ), close to zero, they can be coded in a manner which takes less then Vf ( t - t , ω ) is the Wigner distribution for the sig- 0 computer memory; this has proven to be a useful way to nal f ( t - t ) ; and Vf ( t , ω - ω ) is the Wigner distribu- 0 0 store images in a compact way (the popular JPEG format - i ω t tion for the signal f ( t ) e (the Wigner distribution is uses this technique). Because of their sensitivity to the de- ‘‘time-frequency shift covariant’’). tails of an image, some discrete wavelet transforms have 1/2-(2) The Wigner distribution for the signal f ( t ) = h shown their value in image and video compression when 2 - 1/4- t /2employed in a similar manner; this is an area of active re- X exp(iωt)g((t-t)/h) [where g ( t ) = τ e J is 00222search and development. Vf ( t , ω ) 2 exp((tt)/h) exp(h(ωω)). Here f ( t ) =---00An overview and discussion of wavelets and many of their represents a signal concentrated both in time near t and 0 applications and the question of which wavelet choice is ap- in frequency near ω ; the Wigner distribution is a 0 propriate for a particular application is given by Ref. 5. Gaussian with respect to time and to frequency, concen- Other expository works include Refs. 7 – 9. Reference 10 trated at ( t , ω ). 0 0
796 Am. J. Phys., Vol. 66, No. 9, September 1998 W. C. Lang and K. Forinash 796
3M. Holschneider, Wavelets, An Analysis Tool (Clarendon, Oxford, 1995). gives a lucid account of the mathematical theory and devel- 4Albert Messiah, Quantum Mechanics, Vol. 1, translated by G. M. Temmer opment of discrete wavelet bases. Recent articles dealing (North-Holland, Amsterdam, 1970), Vol. I, p. 50.with discrete wavelet bases include Refs. 11 and 12; the 5Yves Meyer, Wavelets: Algorithms and Applications (translated and re- latter includes suggestions for other applications in physics. vised by Robert D. Ryan) (SIAM, Philadelphia, 1993). There is an extremely large literature on wavelets, involving 6Leon Cohen, Time-Frequency Analysis (Prentice-Hall, Englewood Cliffs, many variations in their construction and application. The NJ, 1995).7Wavelet Digest published on the World Wide Web (see Ref. Barbara Burke Hubbard, The World According to Wavelets: The Story of a 13) contains links to bibliographies and other information on Mathematical Technique in the Making (Peters, Wellesley, MA, 1998,)wavelets; commercial software packages are available for 2nd ed. 8?programs such as MATHEMATICA and MATLAB for persons Gerald Kaiser, A Friendly Guide to Wavelets (Birkhauser, Boston, 1994.)9Eugenio Hernandez and Guido L. Weiss, A First Course in Wavelets wishing to experiment with wavelets. (Studies in Advanced Mathematics) (CRC, Boca Raton, FL, 1996). 10Ingrid Daubechies, Ten Lectures on Wavelets, CBMS-NSF Regional Con- ACKNOWLEDGMENTS ference Series in Applied Mathematics, Vol. 61 (SIAM, Philadelphia, 1992). The authors would like to express their gratitude for the 11O. V. Vasilyev, D. A. Yuen, and S. Paolucci, ‘‘Solving PDEs Using very helpful criticisms of the referees. Wavelets,’’ Comput. Phys. 11 (5), 429 – 435 (1997). 121George D. J. Phillies, ‘‘Wavelets: A new alternative to Fourier trans- Vernier Software, 8565 S. W. Beaverton-Hillsdale Hwy., Portland, forms,’’ Comput. Phys. 10 (3), 247 – 252 (1996). OR 97225-2429. 132The URL of the Wavelet Digest is http://www.wavelet.org/wavelet/ Charles K. Chui, An Introduction to Wavelets (Academic, Boston, 1992), index.html. pp. 54 – 59.
797 Am. J. Phys., Vol. 66, No. 9, September 1998 W. C. Lang and K. Forinash 797