4.7.1 The FFT Algorithm

Consider the discrete Fourier transform (DFT) of a function whose values $f(x_k)$ are known at the grid points $x_k = 2\pi k/N$, $k = 0 : N-1$. According to Theorem 4.6.4 the coefficients are given by

$$ c_j = \frac{1}{N}\sum_{k=0}^{N-1} f(x_k)\, e^{-ijx_k}, \qquad j = 0 : N-1. \qquad (4.7.1) $$

Expressions of the form (4.7.1) also occur in discrete approximations to the Fourier transform.

Setting $\omega_N = e^{-2\pi i/N}$, this becomes

$$ c_j = \frac{1}{N}\sum_{k=0}^{N-1} \omega_N^{jk} f(x_k), \qquad j = 0 : N-1, \qquad (4.7.2) $$

where $\omega_N$ is an $N$th root of unity, $(\omega_N)^N = 1$. It seems from (4.7.2) that computing the discrete Fourier coefficients would require $N^2$ complex multiplications and additions. As we shall see, only about $N\log_2 N$ complex multiplications and additions are required using an algorithm called the fast Fourier transform (FFT). The modern usage of the FFT started in 1965 with the publication of [78] by James W. Cooley of IBM Research and John W. Tukey, Princeton University.^162 In many areas of application (digital signal processing, image processing, time-series analysis, to name a few), the FFT has caused a complete change of attitude toward what can be done using discrete Fourier methods. Without the FFT many modern devices such as cell phones, digital cameras, CT scanners, and DVDs would not be possible. Some applications considered in astronomy require FFTs of several gigapoints.
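A direct evaluation of (4.7.2) makes the $O(N^2)$ cost concrete: each of the $N$ coefficients is an $N$-term sum. The following is a minimal Python sketch (the function name is ours, not the book's), using only the standard library:

```python
import cmath

def dft_coefficients(f):
    """Evaluate (4.7.2) directly: c_j = (1/N) * sum_k f_k * omega_N^(j*k),
    with omega_N = exp(-2*pi*i/N).  Costs O(N^2) complex operations."""
    N = len(f)
    omega = cmath.exp(-2j * cmath.pi / N)
    return [sum(f[k] * omega**(j * k) for k in range(N)) / N
            for j in range(N)]

# A constant signal has all its "energy" in the coefficient c_0.
c = dft_coefficients([1.0, 1.0, 1.0, 1.0])
assert abs(c[0] - 1.0) < 1e-12
assert all(abs(cj) < 1e-12 for cj in c[1:])
```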

^162 Tukey came up with the basic algorithm at a meeting of President Kennedy's Science Advisory Committee. One problem discussed at this meeting was that the ratification of a US–Soviet nuclear test ban depended on a fast method to detect nuclear tests by analyzing seismological time-series data.

504 Chapter 4. Interpolation and Approximation

In the following we will use the common convention not to scale the sum in (4.7.2) by $1/N$.

Definition 4.7.1.
The DFT of the vector $f \in \mathbb{C}^N$ is

$$ y = F_N f, \qquad (4.7.3) $$

where $F_N \in \mathbb{C}^{N \times N}$ is the DFT matrix with elements

$$ (F_N)_{jk} = \omega_N^{jk}, \qquad j, k = 0 : N-1, \qquad (4.7.4) $$

where $\omega_N = e^{-2\pi i/N}$.^163

From the definition it follows that the DFT matrix $F_N$ is a complex Vandermonde matrix. Since $\omega_N^{jk} = \omega_N^{kj}$, $F_N$ is symmetric. By Theorem 4.6.4,

$$ \frac{1}{N} F_N^H F_N = I, $$

where $F_N^H$ is the complex conjugate transpose of $F_N$. Hence the matrix $\frac{1}{\sqrt{N}} F_N$ is a unitary matrix and the inverse transform can be written as

$$ f = \frac{1}{N} F_N^H y. $$
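The symmetry of $F_N$ and the inverse formula can be checked numerically. Here is a small Python sketch (function names are ours) that builds $F_N$, applies it to a vector, and recovers the vector via $(1/N)F_N^H y$; since $F_N$ is symmetric, $F_N^H$ is just its elementwise complex conjugate:

```python
import cmath

def dft_matrix(N):
    """DFT matrix F_N with (F_N)_{jk} = omega_N^(j*k), omega_N = exp(-2*pi*i/N)."""
    w = cmath.exp(-2j * cmath.pi / N)
    return [[w**(j * k) for k in range(N)] for j in range(N)]

def matvec(A, x):
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

N = 8
F = dft_matrix(N)
f = [complex(k) for k in range(N)]

y = matvec(F, f)                      # y = F_N f
# Inverse transform: f = (1/N) F_N^H y.  Because F_N is symmetric,
# F_N^H is simply the elementwise complex conjugate of F_N.
FH = [[F[j][k].conjugate() for k in range(N)] for j in range(N)]
f_back = [v / N for v in matvec(FH, y)]

assert max(abs(a - b) for a, b in zip(f, f_back)) < 1e-9
```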

We now describe the central idea of the FFT algorithm, which is based on the divide and conquer strategy (see Sec. 1.2.3). Assume that $N = 2^p$ and set

$$ k = \begin{cases} 2k_1 & \text{if } k \text{ is even},\\ 2k_1 + 1 & \text{if } k \text{ is odd}, \end{cases} \qquad 0 \le k_1 \le m - 1, $$

where $m = N/2 = 2^{p-1}$. Split the DFT sum into an even and an odd part:

$$ y_j = \sum_{k_1=0}^{m-1} f_{2k_1}\,\omega_N^{2k_1 j} + \sum_{k_1=0}^{m-1} f_{2k_1+1}\,\omega_N^{(2k_1+1)j}. $$

Let $\beta$ be the quotient and $j_1$ the remainder when $j$ is divided by $m$, i.e., $j = \beta m + j_1$. Then, since $\omega_N^N = 1$,

$$ \omega_N^{2k_1 j} = (\omega_N^2)^{k_1 j} = \omega_m^{k_1(\beta m + j_1)} = \omega_m^{k_1 j_1}. $$

Thus if, for $j_1 = 0 : m - 1$, we set

$$ \varphi_{j_1} = \sum_{k_1=0}^{m-1} f_{2k_1}\,\omega_m^{k_1 j_1}, \qquad \psi_{j_1} = \sum_{k_1=0}^{m-1} f_{2k_1+1}\,\omega_m^{k_1 j_1}, \qquad (4.7.5) $$

^163 Some authors set $\omega_N = e^{2\pi i/N}$. Which convention is used does not much affect the development.

4.7. The Fast Fourier Transform 505

then

$$ y_j = \varphi_{j_1} + \omega_N^j \psi_{j_1}. $$

The two sums on the right are elements of the DFTs of length $N/2$ applied to the parts of $f$ with even and odd subscripts. The entire DFT of length $N$ is obtained by combining these two DFTs! Since $\omega_N^m = -1$, we have

$$ y_{j_1} = \varphi_{j_1} + \omega_N^{j_1} \psi_{j_1}, \qquad (4.7.6) $$
$$ y_{j_1+N/2} = \varphi_{j_1} - \omega_N^{j_1} \psi_{j_1}, \qquad j_1 = 0 : N/2 - 1. \qquad (4.7.7) $$

These expressions, noted already by Danielson and Lanczos [90], are often called butterfly relations because of the data flow pattern. Note that these can be performed in place, i.e., no extra vector storage is needed.

The computation of $\varphi_{j_1}$ and $\psi_{j_1}$ means that one does two Fourier transforms with $m = N/2$ terms instead of one with $N$ terms. If $N/2$ is even, the same idea can be applied to these two Fourier transforms. One then gets four Fourier transforms, each of which has $N/4$ terms. If $N = 2^p$, this reduction can be continued recursively until we get $N$ DFTs with one term. But $F_1 = I$, the identity. A recursive MATLAB implementation of the FFT algorithm is given in Problem 4.7.2.
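The book's recursive implementation is in MATLAB; as an illustration, the same recursion can be sketched in Python using the butterfly relations (4.7.6)–(4.7.7), checked here against the $O(N^2)$ definition (both function names are ours):

```python
import cmath

def fft(f):
    """Radix-2 FFT of a list whose length is a power of two, via the
    butterfly relations (4.7.6)-(4.7.7); unscaled, as in Definition 4.7.1."""
    N = len(f)
    if N == 1:                       # F_1 = I
        return list(f)
    phi = fft(f[0::2])               # DFT of the even-indexed part
    psi = fft(f[1::2])               # DFT of the odd-indexed part
    w = cmath.exp(-2j * cmath.pi / N)
    y = [0j] * N
    for j in range(N // 2):
        t = w**j * psi[j]
        y[j] = phi[j] + t            # (4.7.6)
        y[j + N // 2] = phi[j] - t   # (4.7.7)
    return y

def dft_direct(f):
    """O(N^2) evaluation of y = F_N f, used only to check the FFT."""
    N = len(f)
    w = cmath.exp(-2j * cmath.pi / N)
    return [sum(w**(j * k) * f[k] for k in range(N)) for j in range(N)]

f = [complex(k % 3, -k) for k in range(16)]
err = max(abs(a - b) for a, b in zip(fft(f), dft_direct(f)))
assert err < 1e-9
```

Note how the splitting into even- and odd-indexed parts appears directly as the slices `f[0::2]` and `f[1::2]`.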

Example 4.7.1.
For $N = 2^2 = 4$ we have $\omega_4 = e^{-\pi i/2} = -i$, and the DFT matrix is

$$ F_4 = \begin{pmatrix} 1 & 1 & 1 & 1\\ 1 & -i & (-i)^2 & (-i)^3\\ 1 & (-i)^2 & (-i)^4 & (-i)^6\\ 1 & (-i)^3 & (-i)^6 & (-i)^9 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1\\ 1 & -i & -1 & i\\ 1 & -1 & 1 & -1\\ 1 & i & -1 & -i \end{pmatrix}. $$

It is symmetric, and its inverse is

$$ F_4^{-1} = \frac{1}{4} F_4^H = \frac{1}{4}\begin{pmatrix} 1 & 1 & 1 & 1\\ 1 & i & -1 & -i\\ 1 & -1 & 1 & -1\\ 1 & -i & -1 & i \end{pmatrix}. $$

The number of complex operations (one multiplication and one addition) required to compute $\{y_j\}$ from the butterfly relations, when $\{\varphi_{j_1}\}$ and $\{\psi_{j_1}\}$ have been computed, is $2^p$, assuming that the powers of $\omega$ are precomputed and stored. Thus, if we denote by $q_p$ the total number of operations needed to compute the DFT when $N = 2^p$, we have

$$ q_p \le 2 q_{p-1} + 2^p, \qquad p \ge 1. $$

Since $q_0 = 0$, it follows by induction that $q_p \le p \cdot 2^p = N \log_2 N$. Hence, when $N$ is a power of 2, the FFT solves the problem with at most $N \log_2 N$ operations. For example, when $N = 2^{20} = 1{,}048{,}576$ the FFT algorithm is theoretically a factor of 84,000 faster than the "conventional" $O(N^2)$ algorithm. On a 3 GHz laptop, a real FFT of this size takes about 0.1 second using MATLAB 6, whereas more than two hours would be required by the conventional algorithm! The FFT not only uses fewer operations to evaluate the DFT, it also is more accurate. Whereas when using the conventional method the roundoff error is proportional to $N$, for the FFT algorithm it is proportional to $\log_2 N$.
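The induction step above is easy to check by iterating the recurrence; in the equality case $q_p = 2q_{p-1} + 2^p$ it gives exactly $p \cdot 2^p$:

```python
import math

# Iterate q_p = 2*q_{p-1} + 2^p with q_0 = 0 and check that it gives
# exactly q_p = p * 2^p = N * log2(N) operations for N = 2^p.
q = 0
for p in range(1, 21):
    q = 2 * q + 2**p
    N = 2**p
    assert q == p * 2**p == N * int(math.log2(N))
```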


Example 4.7.2.
Let $N = 2^4 = 16$. Then the 16-point DFT (0:1:15) can be split into two 8-point DFTs (0:2:14) and (1:2:15), each of which can be split into two 4-point DFTs. Repeating these splittings, we finally get 16 one-point DFTs, which are the identity $F_1 = 1$. The structure of this FFT is illustrated below.

[0] [8] [4] [12] [2] [10] [6] [14] [1] [9] [5] [13] [3] [11] [7] [15]

In most implementations the explicit recursion is avoided. Instead the FFT algorithm is implemented in two stages:
• a reordering stage in which the data vector $f$ is permuted;
• a second stage in which first $N/2$ FFT transforms of length 2 are computed on adjacent elements, followed by $N/4$ transforms of length 4, etc., until the final result is obtained by merging two FFTs of length $N/2$.

We now consider each stage in turn. Each step of the recursion involves an even–odd permutation. In the first step the points with last binary digit equal to 0 are ordered first and those with last digit equal to 1 are ordered last. In the next step the two resulting subsequences of length $N/2$ are reordered according to the second binary digit, etc. It is not difficult to see that the combined effect of the reordering in stage 1 is a bit-reversal permutation of the data points. For $i = 0 : N-1$, let the index $i$ have the binary expansion

$$ i = b_0 + b_1 \cdot 2 + \cdots + b_{t-1} \cdot 2^{t-1} $$

and set

$$ r(i) = b_{t-1} + b_{t-2} \cdot 2 + \cdots + b_1 \cdot 2^{t-2} + b_0 \cdot 2^{t-1}. $$

That is, $r(i)$ is the index obtained by reversing the order of the binary digits. If $i < r(i)$, then exchange $f_i$ and $f_{r(i)}$. This reordering is illustrated for $N = 16$ below.

Decimal  Binary      Decimal  Binary
   0      0000          0      0000
   1      0001          8      1000
   2      0010          4      0100
   3      0011         12      1100
   4      0100          2      0010
   5      0101         10      1010
   6      0110          6      0110
   7      0111         14      1110
   8      1000          1      0001
   9      1001          9      1001
  10      1010          5      0101
  11      1011         13      1101
  12      1100          3      0011
  13      1101         11      1011
  14      1110          7      0111
  15      1111         15      1111

We denote the permutation matrix performing the bit-reversal ordering by $P_N$. Note that if an index is reversed twice, we end up with the original index. This means that

$$ P_N^{-1} = P_N^T = P_N, $$

i.e., $P_N$ is symmetric. The permutation can be carried out "in place" by a sequence of pairwise interchanges or transpositions of the data points. For example, for $N = 16$ the pairs (1,8), (2,4), (3,12), (5,10), (7,14), and (11,13) are interchanged. The bit-reversal permutation can take a substantial fraction of the total time to do the FFT. Which implementation is best depends strongly on the computer architecture.
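The in-place reordering is easy to sketch in Python (function names are ours). Applied to the identity sequence for $N = 16$, it reproduces exactly the leaf ordering shown in Example 4.7.2:

```python
def bit_reverse(i, t):
    """Reverse the t binary digits of the index i."""
    r = 0
    for _ in range(t):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def bit_reversal_permute(f):
    """In-place bit-reversal reordering of a list of length N = 2^t:
    f_i and f_{r(i)} are swapped whenever i < r(i)."""
    t = len(f).bit_length() - 1
    for i in range(len(f)):
        r = bit_reverse(i, t)
        if i < r:
            f[i], f[r] = f[r], f[i]
    return f

# For N = 16, exactly the pairs (1,8), (2,4), (3,12), (5,10), (7,14),
# and (11,13) are interchanged.
print(bit_reversal_permute(list(range(16))))
# → [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
```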

We now consider the second stage of the FFT. The key observation needed to develop a matrix-oriented description of this stage is that the Fourier matrix $F_N$, after an odd–even permutation of the columns, can be expressed as a $2 \times 2$ block matrix, where each block is either $F_{N/2}$ or a diagonal scaling of $F_{N/2}$.

Theorem 4.7.2 (Van Loan [366, Theorem 1.2.1]).
Let $Z_N^T$ be the permutation matrix which applied to a vector groups the even-indexed components first and the odd-indexed last.^164 If $N = 2m$, then

$$ F_N Z_N = \begin{pmatrix} F_m & \Omega_m F_m\\ F_m & -\Omega_m F_m \end{pmatrix} = \begin{pmatrix} I_m & \Omega_m\\ I_m & -\Omega_m \end{pmatrix} \begin{pmatrix} F_m & 0\\ 0 & F_m \end{pmatrix}, \qquad (4.7.9) $$

where

$$ \Omega_m = \mathrm{diag}\,(1, \omega_N, \ldots, \omega_N^{m-1}), \qquad \omega_N = e^{-2\pi i/N}. $$

Proof. The proof essentially follows from the derivation of the butterfly relations (4.7.6)–(4.7.7).

^164 Note that $Z_N = (Z_N^T)^{-1}$ is the so-called perfect shuffle permutation: the permuted vector $Z_N f$ is obtained by splitting $f$ in half and then "shuffling" the top and bottom halves.


Example 4.7.3.
We illustrate Theorem 4.7.2 for $N = 2^2 = 4$. The DFT matrix $F_4$ is given in Example 4.7.1. After a permutation of the columns, $F_4$ can be written as the $2 \times 2$ block matrix

$$ F_4 Z_4 = \begin{pmatrix} F_2 & \Omega_2 F_2\\ F_2 & -\Omega_2 F_2 \end{pmatrix}, \qquad F_2 = \begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}, \quad \Omega_2 = \mathrm{diag}\,(1, -i). $$
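The block splitting of Theorem 4.7.2 can be verified entry by entry for $N = 4$. In this sketch the even-first column permutation is represented simply by the index list `perm`:

```python
import cmath

def dft_matrix(N):
    w = cmath.exp(-2j * cmath.pi / N)
    return [[w**(j * k) for k in range(N)] for j in range(N)]

N, m = 4, 2
F4, F2 = dft_matrix(4), dft_matrix(2)
wN = cmath.exp(-2j * cmath.pi / N)
Omega = [1, wN]                      # Omega_2 = diag(1, -i)

# Columns of F_4 permuted even-first (i.e., F_4 Z_4), compared with the
# 2 x 2 block matrix [F_2, Omega_2 F_2; F_2, -Omega_2 F_2].
perm = [0, 2, 1, 3]                  # even column indices first, then odd
for j in range(N):
    for k in range(N):
        lhs = F4[j][perm[k]]
        i, kk = j % m, k % m
        sign = 1 if j < m else -1
        rhs = F2[i][kk] if k < m else sign * Omega[i] * F2[i][kk]
        assert abs(lhs - rhs) < 1e-12
```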

When $N = 2^p$ the FFT algorithm can be interpreted as a sparse factorization of the DFT matrix,

$$ F_N = A_k \cdots A_2 A_1 P_N, \qquad (4.7.10) $$

where $P_N$ is the bit-reversal permutation matrix and $A_1, \ldots, A_k$ are block diagonal matrices,

$$ A_q = \mathrm{diag}\,(\underbrace{B_L, \ldots, B_L}_{r}), \qquad L = 2^q, \quad r = N/L. \qquad (4.7.11) $$

Here the matrix $B_L \in \mathbb{C}^{L \times L}$ is the radix-2 butterfly matrix defined by

$$ B_L = \begin{pmatrix} I_{L/2} & \Omega_{L/2}\\ I_{L/2} & -\Omega_{L/2} \end{pmatrix}, \qquad (4.7.12) $$

$$ \Omega_{L/2} = \mathrm{diag}\,(1, \omega_L, \ldots, \omega_L^{L/2-1}), \qquad \omega_L = e^{-2\pi i/L}. \qquad (4.7.13) $$

The FFT algorithm described above is usually referred to as the Cooley–Tukey FFT

algorithm. Using the fact that both the bit-reversal matrix $P_N$ and the DFT matrix $F_N$ are symmetric, we obtain by transposing (4.7.10) the factorization

$$ F_N = F_N^T = P_N A_1^T A_2^T \cdots A_k^T. \qquad (4.7.14) $$

This gives rise to a "dual" FFT algorithm, referred to as the Gentleman–Sande algorithm [155]. In this the bit-reversal permutation comes after the other computations. In many important applications, such as convolution and the solution of discretized Poisson equations (see Sec. 1.1.4), this permits the design of in-place FFT solutions that avoid bit-reversal altogether; see Van Loan [366, Secs. 4.1, 4.5].
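The sparse factorization (4.7.10) can be verified numerically for a small case. This Python sketch (all function names are ours) builds $P_N$, the butterfly matrices $B_L$, and the block diagonal factors $A_q$, multiplies them out for $N = 8$, and compares with the DFT matrix:

```python
import cmath

def dft_matrix(N):
    w = cmath.exp(-2j * cmath.pi / N)
    return [[w**(j * k) for k in range(N)] for j in range(N)]

def butterfly(L):
    """Radix-2 butterfly B_L = [I, Omega; I, -Omega], (4.7.12)-(4.7.13)."""
    m = L // 2
    w = cmath.exp(-2j * cmath.pi / L)
    B = [[0j] * L for _ in range(L)]
    for i in range(m):
        B[i][i] = B[i + m][i] = 1
        B[i][i + m] = w**i
        B[i + m][i + m] = -(w**i)
    return B

def block_diag(B, r):
    """A_q = diag(B, ..., B) with r diagonal blocks, as in (4.7.11)."""
    L = len(B)
    A = [[0j] * (L * r) for _ in range(L * r)]
    for b in range(r):
        for i in range(L):
            for j in range(L):
                A[b * L + i][b * L + j] = B[i][j]
    return A

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def bit_reversal_matrix(N):
    """P_N: row i of P_N picks out component r(i)."""
    t = N.bit_length() - 1
    def rev(i):
        r = 0
        for _ in range(t):
            r = (r << 1) | (i & 1)
            i >>= 1
        return r
    P = [[0j] * N for _ in range(N)]
    for i in range(N):
        P[i][rev(i)] = 1
    return P

N, p = 8, 3
M = bit_reversal_matrix(N)          # start with P_N
for q in range(1, p + 1):           # apply A_1, then A_2, ..., A_p
    L = 2**q
    M = matmul(block_diag(butterfly(L), N // L), M)

F = dft_matrix(N)
err = max(abs(M[i][j] - F[i][j]) for i in range(N) for j in range(N))
assert err < 1e-9
```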

In the operation count for the FFT above we assumed that the weights $\omega_L^j$, $j = 1 : L-1$, $\omega_L = e^{-2\pi i/L}$, are precomputed. To do this one could use

$$ \omega_L^j = \cos(j\theta) - i\sin(j\theta), \qquad \theta = 2\pi/L, $$

for $L = 2^q$, $q = 2 : k$. This is accurate, but expensive, since it involves $L-1$ trigonometric function calls. An alternative is to compute $\omega = \cos\theta - i\sin\theta$ and use repeated multiplication,

$$ \omega_L^j = \omega\,\omega_L^{j-1}, \qquad j = 2 : L-1. $$

This replaces one sine/cosine call with a single complex multiplication, but has the drawback that accumulation of roundoff errors will give an error in $\omega_L^j$ of order $ju$.
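The slow error growth of the repeated-multiplication scheme can be observed directly. In this small experiment the tolerance $L \cdot u$ (with $u \approx 10^{-16}$) is our choice, reflecting the order-$ju$ error estimate:

```python
import cmath, math

L = 2**14
theta = 2 * math.pi / L
w = complex(math.cos(theta), -math.sin(theta))   # omega_L

# Generate the weights by repeated multiplication, w_j = w * w_{j-1},
# and compare each against the direct trigonometric evaluation.
wj, max_err = 1 + 0j, 0.0
for j in range(1, L):
    wj *= w
    exact = complex(math.cos(j * theta), -math.sin(j * theta))
    max_err = max(max_err, abs(wj - exact))

# The accumulated error grows roughly linearly in j but stays well
# below L * u here.
assert max_err < L * 1e-15
```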
