3.6 Fast Fourier Transform (FFT)
In this section we discuss the FFT algorithm that exploits the periodicity and symmetry of the discrete-time complex exponentials $e^{j2\pi nk/N}$ to reduce significantly the number of multiplications for the DFT computation. The FFT algorithm discussed here achieves its efficiency when N is a power of 2, i.e., $N = 2^{\text{NLOG2}}$ for some integer NLOG2. This poses no practical problem since the length of x[n] can be increased to the next power of 2 by zero-padding.
To get some understanding of the steps in the FFT algorithm, let us consider a sequence x[n] for 0 ≤ n ≤ N − 1 with $N = 2^{\text{NLOG2}}$. There are two approaches, each of which is based on the decimation process in the time and frequency domain, respectively.
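The zero-padding step mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not code from the text; the helper names `next_power_of_two` and `zero_pad` are hypothetical.

```python
def next_power_of_two(length):
    """Smallest power of 2 that is >= length (assumes length >= 1)."""
    n = 1
    while n < length:
        n *= 2
    return n

def zero_pad(x):
    """Return a copy of x extended with zeros to the next power-of-2 length."""
    n = next_power_of_two(len(x))
    return list(x) + [0] * (n - len(x))

x = [1, 2, 3, 4, 5]     # length 5, not a power of 2
xp = zero_pad(x)        # padded to length 8 = 2**3, so NLOG2 = 3
print(len(xp), xp)
```

A sequence whose length is already a power of 2 is returned unchanged in length.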
3.6.1 Decimation-in-Time (DIT) FFT
In this approach, we break the N-point DFT into two N/2-point DFTs, one for the even-indexed subsequence x[2r] and the other for the odd-indexed subsequence x[2r + 1], then break each N/2-point DFT into two N/4-point DFTs, and continue this process until 2-point DFTs appear. Specifically, we can write the N-point DFT of x[n] as
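$$X(k) \overset{(3.4.2)}{=} \sum_{n=0}^{N-1} x[n]\,W_N^{kn} = \sum_{n=2r\,(\text{even})} x[n]\,W_N^{kn} + \sum_{n=2r+1\,(\text{odd})} x[n]\,W_N^{kn}$$

$$= \sum_{r=0}^{N/2-1} x[2r]\,W_N^{2rk} + \sum_{r=0}^{N/2-1} x[2r+1]\,W_N^{(2r+1)k}$$

$$= \sum_{r=0}^{N/2-1} x_e[r]\,W_{N/2}^{rk} + W_N^{k} \sum_{r=0}^{N/2-1} x_o[r]\,W_{N/2}^{rk} = X_e(k) + W_N^{k} X_o(k) \tag{3.6.1}$$

where $x_e[r] = x[2r]$, $x_o[r] = x[2r+1]$, and we have used $W_N^{2rk} = e^{-j2\pi(2r)k/N} = e^{-j2\pi rk/(N/2)} = W_{N/2}^{rk}$,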
so that

$$X(k) \overset{(3.6.1)}{=} X_e(k) + W_N^{k} X_o(k) \quad \text{for } 0 \le k \le N/2-1 \tag{3.6.2a}$$

$$X(k) \overset{(3.6.1)}{=} X_e(k) + W_N^{k} X_o(k) \quad \text{for } N/2 \le k \le N-1;$$

$$X(k+N/2) \overset{(3.6.1)}{=} X_e(k+N/2) + W_N^{k+N/2} X_o(k+N/2) = X_e(k) - W_N^{k} X_o(k) \quad \text{for } 0 \le k \le N/2-1 \tag{3.6.2b}$$

where $X_e(k)$ and $X_o(k)$ are N/2-point DFTs that are periodic with period N/2 in k. If N/2 is even, then we can again break $X_e(k)$ and $X_o(k)$ into two N/4-point DFTs in the same way:

$$X_e(k) \overset{(3.6.1)\ \text{with}\ N \to N/2}{=} X_{ee}(k) + W_N^{2k} X_{eo}(k) \quad \text{for } 0 \le k \le N/2-1 \tag{3.6.3a}$$

$$X_o(k) \overset{(3.6.1)\ \text{with}\ N \to N/2}{=} X_{oe}(k) + W_N^{2k} X_{oo}(k) \quad \text{for } 0 \le k \le N/2-1 \tag{3.6.3b}$$
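The even/odd splitting of Eqs. (3.6.2a) and (3.6.2b) maps naturally onto a recursive function. The following is a minimal sketch in plain Python, assuming the input length is a power of 2 and $W_N = e^{-j2\pi/N}$; the function names `dft` and `fft_dit` are illustrative and do not come from the text.

```python
import cmath

def dft(x):
    """Direct N-point DFT, Eq. (3.4.2), used here only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    Xe = fft_dit(x[0::2])    # N/2-point DFT of the even-indexed samples
    Xo = fft_dit(x[1::2])    # N/2-point DFT of the odd-indexed samples
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * Xo[k]   # W_N^k X_o(k)
        X[k] = Xe[k] + t             # Eq. (3.6.2a)
        X[k + N // 2] = Xe[k] - t    # Eq. (3.6.2b)
    return X

x = [1, 2, 3, 4, 0, -1, -2, -3]
err = max(abs(a - b) for a, b in zip(fft_dit(x), dft(x)))
print(err < 1e-9)
```

The recursion bottoms out at 1-point DFTs rather than 2-point DFTs, which is equivalent but keeps the sketch shorter.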
If $N = 2^{\text{NLOG2}}$, we repeat this procedure NLOG2 − 1 times to obtain N/2 2-point DFTs, say, for $N = 2^3$, as

$$X_{ee}(k) = \sum_{n=0}^{N/4-1} x_{ee}[n]\,W_{N/4}^{kn} = x[0] + (-1)^k x[4] \quad \text{with } x_{ee}[n] = x_e[2n] = x[2^2 n] \tag{3.6.4a}$$

$$X_{eo}(k) = \sum_{n=0}^{N/4-1} x_{eo}[n]\,W_{N/4}^{kn} = x[2] + (-1)^k x[6] \quad \text{with } x_{eo}[n] = x_e[2n+1] = x[2^2 n + 2] \tag{3.6.4b}$$

$$X_{oe}(k) = \sum_{n=0}^{N/4-1} x_{oe}[n]\,W_{N/4}^{kn} = x[1] + (-1)^k x[5] \quad \text{with } x_{oe}[n] = x_o[2n] = x[2^2 n + 1] \tag{3.6.4c}$$

$$X_{oo}(k) = \sum_{n=0}^{N/4-1} x_{oo}[n]\,W_{N/4}^{kn} = x[3] + (-1)^k x[7] \quad \text{with } x_{oo}[n] = x_o[2n+1] = x[2^2 n + 2 + 1] \tag{3.6.4d}$$

Along with this procedure, we can draw the signal flow graph for an 8-point DFT computation as shown in Fig. 3.15(a). By counting the number of branches with a gain $W_N^r$ (representing multiplications) and empty circles (denoting additions), we see that each stage needs N complex multiplications and N complex additions. Since there are $\log_2 N$ stages, we have a total of $N \log_2 N$ complex multiplications and additions. This is a substantial reduction of computation compared with the direct DFT computation, which requires $N^2$ complex multiplications and $N(N-1)$ complex additions, since we must get N values of X(k) for k = 0 : N − 1, each X(k) requiring N complex multiplications and N − 1 complex additions.
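To get a feel for the size of this reduction, the operation counts quoted above can be tabulated for a few lengths. This is only an arithmetic sketch of the formulas in the text; the helper names `direct_dft_ops` and `fft_ops` are hypothetical.

```python
def direct_dft_ops(N):
    """(complex multiplications, complex additions) for the direct DFT."""
    return N * N, N * (N - 1)

def fft_ops(N):
    """N*log2(N) operations for the FFT flow graph; N must be a power of 2."""
    stages = N.bit_length() - 1    # log2(N) when N is a power of 2
    return N * stages

for N in (8, 64, 1024):
    mults, adds = direct_dft_ops(N)
    print(N, mults, adds, fft_ops(N))
```

For N = 8 the direct DFT needs 64 multiplications and 56 additions, against 24 of each for the flow graph; the gap widens rapidly as N grows.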
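Remark 3.8 Data Rearrangement in “Bit Reversed” Order
The signal flow graph in Fig. 3.15 shows that the input data x[n] appear in the bit reversed order. For N = 8:

Position   Binary equivalent   Bit reversed   Sequence index
   0             000               000              0
   1             001               100              4
   2             010               010              2
   3             011               110              6
   4             100               001              1
   5             101               101              5
   6             110               011              3
   7             111               111              7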
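The bit-reversed ordering can be generated programmatically. The sketch below is a plain-Python illustration under the assumption that N is a power of 2; the names `bit_reverse` and `bit_reversed_order` are hypothetical.

```python
def bit_reverse(p, nbits):
    """Reverse the nbits-bit binary representation of position p."""
    r = 0
    for _ in range(nbits):
        r = (r << 1) | (p & 1)   # shift the lowest bit of p into r
        p >>= 1
    return r

def bit_reversed_order(x):
    """Rearrange x so that position p holds x[bit_reverse(p)]."""
    N = len(x)
    nbits = N.bit_length() - 1   # log2(N) when N is a power of 2
    return [x[bit_reverse(p, nbits)] for p in range(N)]

print(bit_reversed_order(list(range(8))))   # [0, 4, 2, 6, 1, 5, 3, 7]
```

The printed list matches the input ordering x[0], x[4], x[2], x[6], x[1], x[5], x[3], x[7] of the 8-point flow graph.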
Remark 3.9 Simplified Butterfly Computation
(1) The basic computational block in the flow graph (Fig. 3.15(b)), called a butterfly, for stage m + 1 represents the following operations, as in Eq. (3.6.2):

$$X_{m+1}(p) = X_m(p) + W_N^{r} X_m(q), \quad \text{with } q = p + 2^m \tag{3.6.5a}$$

$$X_{m+1}(q) = X_m(p) + W_N^{r+N/2} X_m(q) = X_m(p) - W_N^{r} X_m(q) \quad (\because\ W_N^{r+N/2} = -W_N^{r}) \tag{3.6.5b}$$
Fig. 3.15 Signal flow graph of DIT (decimation-in-time) FFT algorithm
(a) Signal flow graph of DIT (decimation-in-time) FFT algorithm for computing an N = 8-point DFT
(b) Simplifying the butterfly computation
(c) Signal flow graph of DIT (decimation-in-time) FFT algorithm (with simplified butterfly pattern) for computing an N = 8-point DFT
where p and q are the position indices in stage m + 1. Notice that $X_{m+1}(p)$ and $X_{m+1}(q)$, the outputs of the butterfly at stage m + 1, are calculated in terms of $X_m(p)$ and $X_m(q)$, the corresponding values from the previous stage, and no other inputs. Therefore, if a scratch memory for storing some intermediate results is available, $X_{m+1}(p)$ and $X_{m+1}(q)$ can be calculated and placed back into the storage registers for $X_m(p)$ and $X_m(q)$. Thus only N registers for storing one complex array are needed to implement the complete computation. This kind of memory-efficient procedure is referred to as an in-place computation.
(2) Noting that the horizontal multiplier is just the negative of the diagonal multiplier, the basic butterfly configuration can be further simplified. Specifically, with $T = W_N^{r} X_m(q)$, we have the simplified butterfly computation

$$X_{m+1}(p) = X_m(p) + T$$

$$X_{m+1}(q) = X_m(p) - T$$

which is depicted in Fig. 3.15(b). This reduces the number of complex multiplications for the DFT computation to half of that with the original butterfly computation (3.6.5a) (Fig. 3.15(c)):

$$N \log_2 N \ \to\ \frac{N}{2} \log_2 N$$
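The in-place computation and the simplified butterfly of Remark 3.9 can be combined into an iterative routine. What follows is a minimal plain-Python sketch, assuming a power-of-2 length and $W_N = e^{-j2\pi/N}$; the function name `fft_dit_inplace` and the loop variable names are illustrative choices rather than anything prescribed by the text.

```python
import cmath

def fft_dit_inplace(x):
    """Iterative radix-2 DIT FFT using one working array of N complex values."""
    N = len(x)
    nlog2 = N.bit_length() - 1          # number of stages, log2(N)
    X = [complex(v) for v in x]         # working copy; the caller's list is untouched

    # Put the input into bit-reversed order (Remark 3.8).
    for p in range(N):
        q, t = 0, p
        for _ in range(nlog2):
            q = (q << 1) | (t & 1)
            t >>= 1
        if p < q:
            X[p], X[q] = X[q], X[p]

    # Stage m + 1 pairs positions p and q = p + 2^m.
    for m in range(nlog2):
        half = 1 << m                   # 2^m
        step = N >> (m + 1)             # spacing of the exponents r of W_N^r
        for start in range(0, N, 2 * half):
            for j in range(half):
                p = start + j
                q = p + half
                r = j * step
                T = cmath.exp(-2j * cmath.pi * r / N) * X[q]   # one multiplication
                X[p], X[q] = X[p] + T, X[p] - T                # simplified butterfly
    return X

print([round(abs(v), 6) for v in fft_dit_inplace([1, 1, 1, 1, 1, 1, 1, 1])])
```

Each stage performs N/2 butterflies with one complex multiplication apiece, which is where the $(N/2)\log_2 N$ count comes from.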
3.6.2 Decimation-in-Frequency (DIF) FFT
An alternative algorithm for computing the DFT results from breaking up or decimating the transform sequence X (k). It begins by breaking up X (k) as follows:
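$$X(k) = \sum_{n=0}^{N-1} x[n]\,W_N^{kn} = \sum_{n=0}^{N/2-1} x[n]\,W_N^{kn} + \sum_{n=N/2}^{N-1} x[n]\,W_N^{kn}$$

$$= \sum_{n=0}^{N/2-1} x[n]\,W_N^{kn} + W_N^{kN/2} \sum_{n=0}^{N/2-1} x[n+N/2]\,W_N^{kn}$$

$$= \sum_{n=0}^{N/2-1} \left( x[n] + (-1)^k x[n+N/2] \right) W_N^{kn}$$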
We separate this into two groups, i.e., one group of the even-indexed elements (with k = 2r) and the other group of the odd-indexed elements (k = 2r + 1) of X(k);
$$X(2r) = \sum_{n=0}^{N/2-1} (x[n] + x[n+N/2])\,W_N^{2rn} = \sum_{n=0}^{N/2-1} (x[n] + x[n+N/2])\,W_{N/2}^{rn} \quad \text{for } 0 \le r \le N/2-1 \tag{3.6.7a}$$

$$X(2r+1) = \sum_{n=0}^{N/2-1} (x[n] - x[n+N/2])\,W_N^{(2r+1)n} = \sum_{n=0}^{N/2-1} (x[n] - x[n+N/2])\,W_N^{n}\,W_{N/2}^{rn} \quad \text{for } 0 \le r \le N/2-1 \tag{3.6.7b}$$

These are N/2-point DFTs of the sequences $(x[n] + x[n+N/2])$ and $(x[n] - x[n+N/2])W_N^{n}$, respectively. If $N = 2^{\text{NLOG2}}$, we can proceed in the same way until it ends up with N/2 2-point DFTs. The DIF FFT algorithm with simplified butterfly computation and with the output data in the bit reversed order is illustrated for an 8-point DFT in Fig. 3.16.
Fig. 3.16 Signal flow graph of DIF(decimation-in-frequency) FFT algorithm (with simplified butterfly pattern) for computing an N = 8-point DFT
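The decimation-in-frequency split of Eqs. (3.6.7a) and (3.6.7b) can likewise be sketched recursively. This plain-Python illustration assumes a power-of-2 length; the name `fft_dif` is hypothetical. Unlike the flow graph, which leaves the outputs in bit-reversed order, this sketch interleaves the two half-length results so that the output comes back in natural order.

```python
import cmath

def fft_dif(x):
    """Recursive radix-2 decimation-in-frequency FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    half = N // 2
    # Sequence whose N/2-point DFT gives X(2r), Eq. (3.6.7a)
    a = [x[n] + x[n + half] for n in range(half)]
    # Sequence whose N/2-point DFT gives X(2r+1), Eq. (3.6.7b)
    b = [(x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / N)
         for n in range(half)]
    X = [0j] * N
    X[0::2] = fft_dif(a)    # even-indexed outputs
    X[1::2] = fft_dif(b)    # odd-indexed outputs
    return X

print([round(abs(v), 6) for v in fft_dif([1, 0, 0, 0, 0, 0, 0, 0])])
```

Here the twiddle factor $W_N^{n}$ is applied before the half-length transforms, whereas in the DIT form it is applied after them.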
3.6.3 Computation of IDFT Using FFT Algorithm
The inverse DFT is given by Eq. (3.4.3) as
$$x[n] = \text{IDFT}_N\{X(k)\} = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\,e^{j2\pi kn/N} = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\,W_N^{-kn} \quad \text{for } n = 0 : N-1 \tag{3.6.8}$$

Comparing this with the DFT formula (3.4.2), we see that the computational procedure remains the same except that the twiddle factors are negative powers of $W_N$ and the output is scaled by 1/N. Therefore, an inverse fast Fourier transform (IFFT) program can be obtained from an FFT algorithm by replacing the input data x[n]’s with X(k)’s, changing the exponents of $W_N$ to negative values, and scaling the last output data by 1/N.
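The recipe just described (same flow, negated twiddle exponents, final scaling by 1/N) can be sketched by giving a recursive FFT a sign parameter. This is a minimal plain-Python illustration assuming a power-of-2 length; the names `_fft_core`, `fft`, and `ifft` are hypothetical and are not the MATLAB functions of the same name.

```python
import cmath

def _fft_core(x, sign):
    """Radix-2 DIT recursion with twiddle factors exp(sign * j*2*pi*k/N)."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    E = _fft_core(x[0::2], sign)
    O = _fft_core(x[1::2], sign)
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / N) * O[k]
        out[k], out[k + N // 2] = E[k] + t, E[k] - t
    return out

def fft(x):
    """Forward transform: twiddles W_N^k = exp(-j*2*pi*k/N)."""
    return _fft_core(x, -1)

def ifft(X):
    """Inverse transform: twiddles W_N^(-k), then scale the output by 1/N."""
    N = len(X)
    return [v / N for v in _fft_core(X, +1)]

x = [1, 2, 3, 4]
x_back = ifft(fft(x))
print(max(abs(a - b) for a, b in zip(x, x_back)) < 1e-9)
```

Scaling is applied once at the end rather than inside the recursion, matching the statement that only the last output data are scaled.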
An alternative way to get the IDFT by using the FFT algorithm is derived by taking the complex conjugate of the IDFT formula (3.4.3) or (3.6.8) as follows:
$$x^*[n] = \frac{1}{N} \sum_{k=0}^{N-1} X^*(k)\,W_N^{kn} = \frac{1}{N}\,\text{FFT}\{X^*(k)\}; \qquad x[n] = \frac{1}{N} \left( \text{FFT}\{X^*(k)\} \right)^* \tag{3.6.9}$$

It is implied that we can get $x[n] = \text{IDFT}_N\{X(k)\}$ by taking the complex conjugate of X(k), applying the FFT algorithm for $X^*(k)$, taking the complex conjugate of the output, and scaling by 1/N.
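The conjugation route of Eq. (3.6.9) needs only a forward FFT routine. The sketch below, in plain Python with a power-of-2 length assumed, uses a compact recursive `fft` as a stand-in for any forward FFT; `ifft_via_fft` is a hypothetical name.

```python
import cmath

def fft(x):
    """Recursive radix-2 DIT FFT (forward transform only)."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    E, O = fft(x[0::2]), fft(x[1::2])
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * O[k]
        out[k], out[k + N // 2] = E[k] + t, E[k] - t
    return out

def ifft_via_fft(X):
    """IDFT by Eq. (3.6.9): conjugate, forward FFT, conjugate, scale by 1/N."""
    N = len(X)
    Y = fft([complex(v).conjugate() for v in X])
    return [y.conjugate() / N for y in Y]

x = [1, 2j, -3, 4, 0, 1, 0, -1]
x_back = ifft_via_fft(fft(x))
print(max(abs(a - b) for a, b in zip(x, x_back)) < 1e-9)
```

The appeal of this approach is that the forward routine is reused unchanged, so no second set of twiddle factors is needed.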
The MATLAB built-in functions “fft(x,N)”/“ifft(X,N)” implement the N-point FFT/IFFT algorithm for the data x[n]/X(k) given as their first input argument if the second input argument N is given as a power of 2 or if N is not given and the length of the given data is a power of 2. If the length of the data sequence differs from the second input argument N, it will be zero-padded or truncated so that the resulting length will be N.