Digital Signal Processing

By Steven W. Smith, Ph.D.

- 1: The Breadth and Depth of DSP
- 2: Statistics, Probability and Noise
- 3: ADC and DAC
- 4: DSP Software
- 5: Linear Systems
- 6: Convolution
- 7: Properties of Convolution
- 8: The Discrete Fourier Transform
- 9: Applications of the DFT
- 10: Fourier Transform Properties
- 11: Fourier Transform Pairs
- 12: The Fast Fourier Transform
- 13: Continuous Signal Processing
- 14: Introduction to Digital Filters
- 15: Moving Average Filters
- 16: Windowed-Sinc Filters
- 17: Custom Filters
- 18: FFT Convolution
- 19: Recursive Filters
- 20: Chebyshev Filters
- 21: Filter Comparison
- 22: Audio Processing
- 23: Image Formation & Display
- 24: Linear Image Processing
- 25: Special Imaging Techniques
- 26: Neural Networks (and more!)
- 27: Data Compression
- 28: Digital Signal Processors
- 29: Getting Started with DSPs
- 30: Complex Numbers
- 31: The Complex Fourier Transform
- 32: The Laplace Transform
- 33: The z-Transform
- 34: Explaining Benford's Law

Your laser printer will thank you!

Speed

Writing a program to convolve one signal by another is a simple task, only
requiring a few lines of code. Executing the program may be more painful. The
problem is the large number of additions and multiplications required by the
algorithm, resulting in long execution times. As shown by the programs in the
last chapter, the time-consuming operation is composed of multiplying two
numbers and adding the result to an accumulator. Other parts of the algorithm,
such as indexing the arrays, are very quick. The multiply-accumulate is a basic
building block in DSP, and we will see it repeated in several other important
algorithms. In fact, the speed of DSP computers is often *specified* by how long
it takes to preform this multiply-accumulate operation.

If a signal composed of *N* samples is convolved with a signal composed of *M*
samples, *N*×*M* multiply-accumulations must be preformed. This can be seen
from the programs of the last chapter. Personal computers of the mid 1990's
requires about one microsecond per multiply-accumulation (100 MHz Pentium
using single precision floating point, see Table 4-6). Therefore, convolving a
10,000 sample signal with a 100 sample signal requires about one second. To
process a one million point signal with a 3000 point impulse response requires
nearly an hour. A decade earlier (80286 at 12 MHz), this calculation would
have required three days!

The problem of excessive execution time is commonly handled in one of three ways. First, simply keep the signals as short as possible and use integers instead of floating point. If you only need to run the convolution a few times, this will probably be the best trade-off between execution time and programming effort. Second, use a computer designed for DSP. DSP microprocessors are available with multiply-accumulate times of only a few tens of nanoseconds. This is the route to go if you plan to perform the convolution many times, such as in the design of commercial products.

The third solution is to use a better algorithm for implementing the convolution.
Chapter 17 describes a very sophisticated algorithm called *FFT convolution*.
FFT convolution produces exactly the same result as the convolution algorithms
presented in the last chapter; however, the execution time is dramatically
reduced. For signals with thousands of samples, FFT convolution can be
hundreds of times faster. The disadvantage is program complexity. Even if you
are familiar with the technique, expect to spend several hours getting the
program to run.