Fast Fourier Transform (FFT)

3.6 Fast Fourier Transform (FFT)

In this section we discuss the FFT algorithm that exploits the periodicity and symmetry of the discrete-time complex exponentials $e^{j2\pi nk/N}$ to reduce significantly the number of multiplications for the DFT computation. The FFT algorithm discussed here achieves its efficiency when $N$ is a power of 2, i.e., $N = 2^{\mathrm{NLOG2}}$ for some integer NLOG2. This poses no practical problem since the length of $x[n]$ can be increased to the next power of 2 by zero-padding.
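As a small illustration of the zero-padding step mentioned above, the following Python sketch extends a sequence to the next power-of-2 length. Python is used here only for illustration, and the helper names `next_pow2` and `zero_pad_to_pow2` are hypothetical, not taken from the text.

```python
def next_pow2(length):
    """Smallest power of 2 that is >= length (returns 1 for length <= 1)."""
    n = 1
    while n < length:
        n *= 2
    return n


def zero_pad_to_pow2(x):
    """Return a copy of x with zeros appended up to the next power-of-2 length."""
    n = next_pow2(len(x))
    return list(x) + [0.0] * (n - len(x))


padded = zero_pad_to_pow2([1.0, 2.0, 3.0, 4.0, 5.0])
print(len(padded), padded)
```

Note that the padded sequence has a longer DFT than the original one, so its DFT samples lie on a denser frequency grid.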

To get some understanding of the steps in the FFT algorithm, let us consider a sequence $x[n]$ for $0 \le n \le N-1$ with $N = 2^{\mathrm{NLOG2}}$. There are two approaches, each of which is based on the decimation process in the time and frequency domain, respectively.

3.6.1 Decimation-in-Time (DIT) FFT

In this approach, we break the $N$-point DFT into two $N/2$-point DFTs, one for the even-indexed subsequence $x[2r]$ and the other for the odd-indexed subsequence $x[2r+1]$, then break each $N/2$-point DFT into two $N/4$-point DFTs, and continue this process until 2-point DFTs appear. Specifically, we can write the $N$-point DFT of $x[n]$ as

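$$
\begin{aligned}
X(k) &\overset{(3.4.2)}{=} \sum_{n=0}^{N-1} x[n]\,W_N^{kn}
 = \sum_{n=2r\,(\text{even})} x[n]\,W_N^{kn} + \sum_{n=2r+1\,(\text{odd})} x[n]\,W_N^{kn} \\
&= \sum_{r=0}^{N/2-1} x[2r]\,W_N^{2rk} + \sum_{r=0}^{N/2-1} x[2r+1]\,W_N^{(2r+1)k} \\
&= \sum_{r=0}^{N/2-1} x_e[r]\,W_{N/2}^{rk} + W_N^{k} \sum_{r=0}^{N/2-1} x_o[r]\,W_{N/2}^{rk}
 \quad \left(\because\; W_N^{2rk} = e^{-j2\pi(2r)k/N} = e^{-j2\pi rk/(N/2)} = W_{N/2}^{rk}\right)
\end{aligned}
\tag{3.6.1}
$$

with $x_e[r] = x[2r]$ and $x_o[r] = x[2r+1]$,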

so that

$$
X(k) \overset{(3.6.1)}{=} X_e(k) + W_N^{k} X_o(k) \quad \text{for } 0 \le k \le N/2-1
\tag{3.6.2a}
$$

$$
\begin{aligned}
X(k) &\overset{(3.6.1)}{=} X_e(k) + W_N^{k} X_o(k) \quad \text{for } N/2 \le k \le N-1; \\
X(k+N/2) &\overset{(3.6.1)}{=} X_e(k+N/2) + W_N^{k+N/2} X_o(k+N/2) \\
&= X_e(k) - W_N^{k} X_o(k) \quad \text{for } 0 \le k \le N/2-1
\end{aligned}
\tag{3.6.2b}
$$

where $X_e(k)$ and $X_o(k)$ are $N/2$-point DFTs that are periodic with period $N/2$ in $k$. If $N/2$ is even, then we can again break $X_e(k)$ and $X_o(k)$ into two $N/4$-point DFTs in the same way:

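$$
X_e(k) \overset{(3.6.1)\ \text{with}\ N \to N/2}{=} X_{ee}(k) + W_N^{2k} X_{eo}(k) \quad \text{for } 0 \le k \le N/2-1
\tag{3.6.3a}
$$

$$
X_o(k) \overset{(3.6.1)\ \text{with}\ N \to N/2}{=} X_{oe}(k) + W_N^{2k} X_{oo}(k) \quad \text{for } 0 \le k \le N/2-1
\tag{3.6.3b}
$$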
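The repeated splitting into even- and odd-indexed subsequences can be sketched as a recursive function. The following is a minimal Python sketch, not the book's program: it assumes the input length is a power of 2, and the names `dft_direct` and `fft_dit_recursive` are hypothetical. Each level applies Eqs. (3.6.2a) and (3.6.2b) to the two half-length DFTs.

```python
import cmath


def dft_direct(x):
    """Direct N-point DFT, used only as a reference for comparison."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def fft_dit_recursive(x):
    """Radix-2 decimation-in-time FFT of a list whose length is a power of 2."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    if N == 1:
        return [complex(x[0])]
    Xe = fft_dit_recursive(x[0::2])   # N/2-point DFT of x[2r]
    Xo = fft_dit_recursive(x[1::2])   # N/2-point DFT of x[2r+1]
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * Xo[k]   # W_N^k X_o(k)
        X[k] = Xe[k] + t                                # Eq. (3.6.2a)
        X[k + N // 2] = Xe[k] - t                       # Eq. (3.6.2b)
    return X


x = [1, 2, 3, 4, 0, -1, 2, 5]
err = max(abs(a - b) for a, b in zip(fft_dit_recursive(x), dft_direct(x)))
print("largest deviation from the direct DFT:", err)
```

The recursion stops at 1-point DFTs here for brevity; stopping at 2-point DFTs, as in the text, gives the same result.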

If $N = 2^{\mathrm{NLOG2}}$, we repeat this procedure $\mathrm{NLOG2} - 1$ times to obtain $N/2$ 2-point DFTs, say, for $N = 2^3$, as

$$
X_{ee}(k) = \sum_{n=0}^{N/4-1} x_{ee}[n]\,W_{N/4}^{kn} = x[0] + (-1)^k x[4]
\quad \text{with } x_{ee}[n] = x_e[2n] = x[2^2 n]
\tag{3.6.4a}
$$

$$
X_{eo}(k) = \sum_{n=0}^{N/4-1} x_{eo}[n]\,W_{N/4}^{kn} = x[2] + (-1)^k x[6]
\quad \text{with } x_{eo}[n] = x_e[2n+1] = x[2^2 n + 2]
\tag{3.6.4b}
$$

$$
X_{oe}(k) = \sum_{n=0}^{N/4-1} x_{oe}[n]\,W_{N/4}^{kn} = x[1] + (-1)^k x[5]
\quad \text{with } x_{oe}[n] = x_o[2n] = x[2^2 n + 1]
\tag{3.6.4c}
$$

$$
X_{oo}(k) = \sum_{n=0}^{N/4-1} x_{oo}[n]\,W_{N/4}^{kn} = x[3] + (-1)^k x[7]
\quad \text{with } x_{oo}[n] = x_o[2n+1] = x[2^2 n + 2 + 1]
\tag{3.6.4d}
$$

Along with this procedure, we can draw the signal flow graph for an 8-point DFT computation as shown in Fig. 3.15(a). By counting the number of branches with a gain $W_N^r$ (representing multiplications) and empty circles (denoting additions), we see that each stage needs $N$ complex multiplications and $N$ complex additions. Since there are $\log_2 N$ stages, we have a total of $N \log_2 N$ complex multiplications and additions. This is a substantial reduction of computation compared with the direct DFT computation, which requires $N^2$ complex multiplications and $N(N-1)$ complex additions, since we must get $N$ values of $X(k)$ for $k = 0 : N-1$, each $X(k)$ requiring $N$ complex multiplications and $N-1$ complex additions.
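To get a feel for the size of this reduction, the counts quoted above can be tabulated for a few values of $N$. The following Python sketch simply evaluates the formulas $N^2$, $N(N-1)$ and $N \log_2 N$ from the text; the function name `dft_op_counts` is hypothetical.

```python
def dft_op_counts(N):
    """Complex operation counts from the text for an N-point transform (N a power of 2)."""
    if N < 1 or N & (N - 1):
        raise ValueError("N must be a power of 2")
    stages = N.bit_length() - 1            # log2(N) for a power of 2
    return {
        "direct_mults": N * N,             # N multiplications for each of N outputs
        "direct_adds": N * (N - 1),        # N - 1 additions for each of N outputs
        "fft_mults": N * stages,           # N per stage, log2(N) stages
        "fft_adds": N * stages,
    }


for N in (8, 64, 1024):
    c = dft_op_counts(N)
    print(N, c["direct_mults"], c["fft_mults"])
```

For $N = 1024$ the direct computation needs 1,048,576 complex multiplications against 10,240 for the FFT, a ratio of about 100.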

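Remark 3.8 Data Rearrangement in “Bit Reversed” Order

The signal flow graph in Fig. 3.15 shows that the input data x[n] appear in the bit reversed order. For N = 8:

Position    Binary equivalent    Bit reversed    Sequence index
   0              000                000               0
   1              001                100               4
   2              010                010               2
   3              011                110               6
   4              100                001               1
   5              101                101               5
   6              110                011               3
   7              111                111               7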
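The bit reversed ordering of Remark 3.8 can be generated programmatically. Below is a minimal Python sketch that assumes $N$ is a power of 2; the names `bit_reverse` and `bit_reversed_order` are hypothetical. Each position is written with $\log_2 N$ bits and the bits are read back in reverse.

```python
def bit_reverse(index, num_bits):
    """Reverse the lowest num_bits bits of index."""
    rev = 0
    for _ in range(num_bits):
        rev = (rev << 1) | (index & 1)   # shift in the current lowest bit
        index >>= 1
    return rev


def bit_reversed_order(N):
    """Sequence index of the input sample stored at each position 0..N-1."""
    num_bits = N.bit_length() - 1        # log2(N) for a power of 2
    return [bit_reverse(p, num_bits) for p in range(N)]


print(bit_reversed_order(8))   # [0, 4, 2, 6, 1, 5, 3, 7]
```

Bit reversal is its own inverse, so applying the same permutation twice restores the natural order.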

Remark 3.9 Simplified Butterfly Computation

(1) The basic computational block in the flow graph (Fig. 3.15(b)), called a butterfly, for stage m + 1 represents the following operations such as Eq. (3.6.2):

$$
X_{m+1}(p) = X_m(p) + W_N^{r} X_m(q), \quad \text{with } q = p + 2^m
\tag{3.6.5a}
$$

$$
X_{m+1}(q) = X_m(p) + W_N^{r+N/2} X_m(q) = X_m(p) - W_N^{r} X_m(q)
\quad \left(\because\; W_N^{r+N/2} = -W_N^{r}\right)
\tag{3.6.5b}
$$

Fig. 3.15 Signal flow graph of DIT (decimation-in-time) FFT algorithm
(a) Signal flow graph of DIT (decimation-in-time) FFT algorithm for computing an N = 8-point DFT
(b) Simplifying the butterfly computation
(c) Signal flow graph of DIT (decimation-in-time) FFT algorithm (with simplified butterfly pattern) for computing an N = 8-point DFT

where $p$ and $q$ are the position indices in stage $m+1$. Notice that $X_{m+1}(p)$ and $X_{m+1}(q)$, the outputs of the butterfly at stage $m+1$, are calculated in terms of $X_m(p)$ and $X_m(q)$, the corresponding values from the previous stage, and no other inputs. Therefore, if a scratch memory for storing some intermediate results is available, $X_{m+1}(p)$ and $X_{m+1}(q)$ can be calculated and placed back into the storage registers for $X_m(p)$ and $X_m(q)$. Thus only $N$ registers for storing one complex array are needed to implement the complete computation. This kind of memory-efficient procedure is referred to as an in-place computation.

(2) Noting that the horizontal multiplier is just the negative of the diagonal multiplier, the basic butterfly configuration can be further simplified. Specifically, with $T = W_N^{r} X_m(q)$, we have the simplified butterfly computation

$$
\begin{aligned}
X_{m+1}(p) &= X_m(p) + T \\
X_{m+1}(q) &= X_m(p) - T
\end{aligned}
$$

which is depicted in Fig. 3.15(b). This reduces the number of complex multiplications for the DFT computation to half of that with the original butterfly computation (3.6.5a) (Fig. 3.15(c)):

$$
N \log_2 N \;\to\; \frac{N}{2} \log_2 N
$$
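The in-place computation and the simplified butterfly of Remark 3.9 can be combined into an iterative routine. The following Python sketch is one possible arrangement under stated assumptions, not the book's code: the input is a list whose length is a power of 2, the data are first rearranged into bit reversed order, and the names `bit_reverse` and `fft_dit_inplace` are hypothetical. Each butterfly computes the product $T$ once and overwrites the two entries it read.

```python
import cmath


def bit_reverse(index, num_bits):
    """Reverse the lowest num_bits bits of index."""
    rev = 0
    for _ in range(num_bits):
        rev = (rev << 1) | (index & 1)
        index >>= 1
    return rev


def fft_dit_inplace(x):
    """Radix-2 DIT FFT that overwrites the list x with its DFT and returns it."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    num_bits = N.bit_length() - 1
    # Rearrange the input into bit reversed order (Remark 3.8).
    for p in range(N):
        q = bit_reverse(p, num_bits)
        if p < q:
            x[p], x[q] = x[q], x[p]
    half = 1                                   # 2**m at stage m + 1
    while half < N:
        step = N // (2 * half)                 # twiddle exponent r = i * step
        for start in range(0, N, 2 * half):
            for i in range(half):
                p = start + i
                q = p + half                   # q = p + 2**m
                T = cmath.exp(-2j * cmath.pi * i * step / N) * x[q]   # W_N^r X_m(q)
                x[p], x[q] = x[p] + T, x[p] - T                       # simplified butterfly
        half *= 2
    return x


data = [complex(v) for v in (1, 2, 3, 4, 0, -1, 2, 5)]
print(fft_dit_inplace(data))
```

There are $N/2$ butterflies per stage and one complex multiplication per butterfly, which matches the $(N/2)\log_2 N$ count above.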

3.6.2 Decimation-in-Frequency (DIF) FFT

An alternative algorithm for computing the DFT results from breaking up or decimating the transform sequence X (k). It begins by breaking up X (k) as follows:

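$$
\begin{aligned}
X(k) &= \sum_{n=0}^{N-1} x[n]\,W_N^{kn}
 = \sum_{n=0}^{N/2-1} x[n]\,W_N^{kn} + \sum_{n=N/2}^{N-1} x[n]\,W_N^{kn} \\
&= \sum_{n=0}^{N/2-1} x[n]\,W_N^{kn} + W_N^{kN/2} \sum_{n=0}^{N/2-1} x[n+N/2]\,W_N^{kn} \\
&= \sum_{n=0}^{N/2-1} \left(x[n] + (-1)^k x[n+N/2]\right) W_N^{kn}
\end{aligned}
$$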

We separate this into two groups, i.e., one group of the even-indexed elements (with k = 2r) and the other group of the odd-indexed elements (k = 2r + 1) of X(k);

$$
X(2r) = \sum_{n=0}^{N/2-1} \left(x[n] + x[n+N/2]\right) W_N^{2rn}
 = \sum_{n=0}^{N/2-1} \left(x[n] + x[n+N/2]\right) W_{N/2}^{rn}
 \quad \text{for } 0 \le r \le N/2-1
\tag{3.6.7a}
$$

$$
X(2r+1) = \sum_{n=0}^{N/2-1} \left(x[n] - x[n+N/2]\right) W_N^{(2r+1)n}
 = \sum_{n=0}^{N/2-1} \left(x[n] - x[n+N/2]\right) W_N^{n}\, W_{N/2}^{rn}
 \quad \text{for } 0 \le r \le N/2-1
\tag{3.6.7b}
$$

These are $N/2$-point DFTs of the sequences $(x[n] + x[n+N/2])$ and $(x[n] - x[n+N/2])\,W_N^{n}$, respectively. If $N = 2^{\mathrm{NLOG2}}$, we can proceed in the same way until it ends up with $N/2$ 2-point DFTs. The DIF FFT algorithm with simplified butterfly computation and with the output data in the bit reversed order is illustrated for an 8-point DFT in Fig. 3.16.


Fig. 3.16 Signal flow graph of DIF(decimation-in-frequency) FFT algorithm (with simplified butterfly pattern) for computing an N = 8-point DFT
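The DIF decomposition can likewise be sketched as an in-place loop. The Python code below is an illustrative sketch rather than the book's program: it assumes a power-of-2 length, and the names `bit_reverse`, `fft_dif_inplace` and `to_natural_order` are hypothetical. Each stage forms the sum and the twiddled difference of the two halves of a block, and the result is left in bit reversed order, which the last helper undoes.

```python
import cmath


def bit_reverse(index, num_bits):
    """Reverse the lowest num_bits bits of index."""
    rev = 0
    for _ in range(num_bits):
        rev = (rev << 1) | (index & 1)
        index >>= 1
    return rev


def fft_dif_inplace(x):
    """Radix-2 DIF FFT in place; position p ends up holding X(bit_reverse(p))."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    half = N // 2
    while half >= 1:
        step = N // (2 * half)                 # twiddle exponent is i * step
        for start in range(0, N, 2 * half):
            for i in range(half):
                p, q = start + i, start + i + half
                a, b = x[p], x[q]
                x[p] = a + b                                                 # sum branch
                x[q] = (a - b) * cmath.exp(-2j * cmath.pi * i * step / N)    # twiddled difference
        half //= 2
    return x


def to_natural_order(y):
    """Undo the bit reversed output ordering of fft_dif_inplace."""
    num_bits = len(y).bit_length() - 1
    return [y[bit_reverse(k, num_bits)] for k in range(len(y))]


data = [complex(v) for v in (1, 2, 3, 4, 0, -1, 2, 5)]
print(to_natural_order(fft_dif_inplace(data)))
```

In the first stage `step` equals 1, so the two branches are exactly the sequences whose $N/2$-point DFTs give the even-indexed and odd-indexed values of $X(k)$.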

3.6.3 Computation of IDFT Using FFT Algorithm

The inverse DFT is given by Eq. (3.4.3) as

$$
x[n] = \mathrm{IDFT}_N\{X(k)\} = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\,e^{j2\pi kn/N}
 = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\,W_N^{-kn}
 \quad \text{for } n = 0 : N-1
\tag{3.6.8}
$$

Comparing this with the DFT formula (3.4.2), we see that the computational procedure remains the same except that the twiddle factors are negative powers of $W_N$ and the output is scaled by $1/N$. Therefore, an inverse fast Fourier transform (IFFT) program can be obtained from an FFT algorithm by replacing the input data $x[n]$'s with $X(k)$'s, changing the exponents of $W_N$ to negative values, and scaling the last output data by $1/N$.
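The recipe just described, the same procedure with negated twiddle exponents and a final $1/N$ scaling, can be sketched by giving an FFT routine a sign parameter. This is a minimal Python sketch under the assumption of a power-of-2 length; the names `_fft_core`, `fft` and `ifft` are chosen for illustration and are not the MATLAB built-ins.

```python
import cmath


def _fft_core(x, sign):
    """Recursive radix-2 DIT transform with twiddle factors exp(sign * j*2*pi*k/N)."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    E = _fft_core(x[0::2], sign)
    O = _fft_core(x[1::2], sign)
    out = [0j] * N
    for k in range(N // 2):
        T = cmath.exp(sign * 2j * cmath.pi * k / N) * O[k]
        out[k] = E[k] + T
        out[k + N // 2] = E[k] - T
    return out


def _check_pow2(N):
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of 2")


def fft(x):
    """Forward transform: twiddle factors are positive powers of W_N."""
    _check_pow2(len(x))
    return _fft_core(list(x), -1)


def ifft(X):
    """Inverse transform: negative powers of W_N, output scaled by 1/N."""
    N = len(X)
    _check_pow2(N)
    return [v / N for v in _fft_core(list(X), +1)]


x = [1, 2, 3, 4, 0, -1, 2, 5]
print(max(abs(a - b) for a, b in zip(ifft(fft(x)), x)))
```

Applying `ifft` to the output of `fft` should return the original sequence up to rounding error.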

An alternative way to get the IDFT by using the FFT algorithm is derived by taking the complex conjugate of the IDFT formula (3.4.3) or (3.6.8) as follows:

$$
x^{*}[n] = \frac{1}{N}\sum_{k=0}^{N-1} X^{*}(k)\,W_N^{kn} = \frac{1}{N}\,\mathrm{FFT}\{X^{*}(k)\}; \qquad
x[n] = \frac{1}{N}\left(\mathrm{FFT}\{X^{*}(k)\}\right)^{*}
\tag{3.6.9}
$$

It is implied that we can get $x[n] = \mathrm{IDFT}_N\{X(k)\}$ by taking the complex conjugate of $X(k)$, applying the FFT algorithm for $X^{*}(k)$, taking the complex conjugate of the output, and scaling by $1/N$.
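The conjugation route of Eq. (3.6.9) needs only a forward FFT routine. The Python sketch below assumes a power-of-2 length and uses a simple recursive forward FFT as a stand-in for whatever FFT routine is available; the names `fft` and `ifft_via_fft` are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 DIT forward FFT (length must be a power of 2)."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    if N == 1:
        return [complex(x[0])]
    E = fft(x[0::2])
    O = fft(x[1::2])
    out = [0j] * N
    for k in range(N // 2):
        T = cmath.exp(-2j * cmath.pi * k / N) * O[k]
        out[k] = E[k] + T
        out[k + N // 2] = E[k] - T
    return out


def ifft_via_fft(X):
    """IDFT computed as conj(FFT(conj(X))) / N, following Eq. (3.6.9)."""
    N = len(X)
    Y = fft([complex(v).conjugate() for v in X])   # FFT of X*(k)
    return [y.conjugate() / N for y in Y]          # conjugate again and scale by 1/N


x = [1, 2, 3, 4, 0, -1, 2, 5]
print(max(abs(a - b) for a, b in zip(ifft_via_fft(fft(x)), x)))
```

The appeal of this route is that the forward routine is reused unchanged, so no second set of twiddle factors is needed.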

The MATLAB built-in functions “fft(x,N)”/“ifft(X,N)” implement the N-point FFT/IFFT algorithm for the data x[n]/X(k) given as their first input argument if the second input argument N is given as a power of 2 or if N is not given and the length of the given data is a power of 2. If the length of the data sequence differs from the second input argument N, it will be zero-padded or truncated so that the resulting length will be N.