The idea of the FFT is actually attributable to Gauss, as if we should be surprised by this, but its computational potential was not realized until the mid-1960s, when the FFT was rediscovered.
The basic notion is to perform the Fourier transform recursively to cut down the operation count of the process. The algorithm is far from obvious if we restrict ourselves to expressing array indices in base ten, but it jumps out at us if we label the data array using base-two number representations.
Consider a set of data $f_0, f_1, \ldots, f_{N-1}$ with $N = 2^M$ a power of two.
The discrete Fourier transform will be a sum of terms, each a product
$$F_n = \sum_{k=0}^{N-1} f_k \, e^{-2\pi i\, nk/N}, \qquad n = 0, 1, \ldots, N-1$$
The FFT develops a recursion for doing the sum in $M$ steps for a dataset with $N = 2^M$. Label the indices in base two
$$k = (k_{M-1} k_{M-2} \cdots k_1 k_0)_2 = k_{M-1} 2^{M-1} + k_{M-2} 2^{M-2} + \cdots + k_1 \cdot 2 + k_0$$
consistent with how we read our numbers, largest exponent (most significant) digits leftmost in the string. In binary each $k_j$ is zero or one. Then if we call $W = e^{-2\pi i/N}$,
the discrete Fourier transform is
$$F_n = \sum_{k_0=0}^{1} \sum_{k_1=0}^{1} \cdots \sum_{k_{M-1}=0}^{1} f_{(k_{M-1} \cdots k_1 k_0)_2}\, W^{n\left(k_{M-1} 2^{M-1} + \cdots + k_1 \cdot 2 + k_0\right)}$$
This formula can be expanded from the inside out; for example consider the case of $N = 8$ data points $f_0, \ldots, f_7$, so $M = 3$, then
$$F_n = \sum_{k_0=0}^{1} \sum_{k_1=0}^{1} \sum_{k_2=0}^{1} f_{(k_2 k_1 k_0)_2}\, W^{n(4k_2 + 2k_1 + k_0)}$$
expand the $k_2$ sum:
$$F_n = \sum_{k_0=0}^{1} \sum_{k_1=0}^{1} \Big( f_{(0 k_1 k_0)_2} + W^{4n} f_{(1 k_1 k_0)_2} \Big) W^{n(2k_1 + k_0)}$$
expand the $k_1$ sum:
$$F_n = \sum_{k_0=0}^{1} \Big( f_{(00 k_0)_2} + W^{4n} f_{(10 k_0)_2} + W^{2n}\big[ f_{(01 k_0)_2} + W^{4n} f_{(11 k_0)_2} \big] \Big) W^{n k_0}$$
expand the $k_0$ sum:
$$F_n = f_{(000)_2} + W^{4n} f_{(100)_2} + W^{2n}\big[ f_{(010)_2} + W^{4n} f_{(110)_2} \big] + W^{n}\Big( f_{(001)_2} + W^{4n} f_{(101)_2} + W^{2n}\big[ f_{(011)_2} + W^{4n} f_{(111)_2} \big] \Big)$$
Examine the order in which the $f_k$ are added into the sum; they are added in pairs: $f_{(000)_2}$ with $f_{(100)_2}$, $f_{(010)_2}$ with $f_{(110)_2}$, $f_{(001)_2}$ with $f_{(101)_2}$, and $f_{(011)_2}$ with $f_{(111)_2}$. We rearrange them in reverse bit-order, take the first two and combine them, then the next two and combine them, then the next two, then the next two to create four objects
$$g^{(1)}_{(k_1 k_0)_2} = f_{(0 k_1 k_0)_2} + W^{4n} f_{(1 k_1 k_0)_2}$$
each term has the most significant slot on the two summands be $0$ and $1$ respectively, and the slots $k_1 k_0$ run over the four possibilities $00, 01, 10, 11$ that give us the four terms. Now we have four terms that can be binary-labeled, and we repeat the process with these four objects by making
$$g^{(2)}_{(k_0)_2} = g^{(1)}_{(0 k_0)_2} + W^{2n} g^{(1)}_{(1 k_0)_2}$$
finally
$$F_n = g^{(2)}_{(0)_2} + W^{n} g^{(2)}_{(1)_2}$$
You can see that the order in which the $f_k$ are added in is the reverse of their binary ordering. This is called reverse bit ordering: read the binary number from right to left rather than from left to right, so for $N = 8$ the data enter in the order $f_0, f_4, f_2, f_6, f_1, f_5, f_3, f_7$.
If the original vector of data is first reverse bit ordered, we can perform an iterative summation, looping only $M = \log_2 N$ times rather than $N$ times, each loop containing one addition and one multiplication per data point. This means on the order of $N \log_2 N$ computations, far fewer than the $N^2$ required by straightforward application of the defining formula for Fourier transformation, a sum up to $N-1$ for each of the $N$ outputs.
This is why the algorithm is faster.
The looping and assignment can be illustrated in the following way
$$g^{(m)}_{j} = g^{(m-1)}_{j} + e^{-i\pi k/2^{m-1}}\, g^{(m-1)}_{j+2^{m-1}}, \qquad g^{(m)}_{j+2^{m-1}} = g^{(m-1)}_{j} - e^{-i\pi k/2^{m-1}}\, g^{(m-1)}_{j+2^{m-1}}$$
in which $m = 1, 2, \ldots, M$, $k = j \bmod 2^{m-1}$, $g^{(0)}$ is the reverse-bit-ordered data, and the superscripts on the $g$ label the iteration number, not an exponent. Each loop runs half as many cycles as its predecessor.
Let’s first code this up in C, the lowest-level language that we are using in these posts, and therefore the fastest but most difficult to use.
#include<stdio.h>
#include<math.h>
#define PI 3.14159265358979
#define N 64
#define M 6 /* this is log_2(N) */
int i1, i2, i3;
int i, j, k, l, m;
double f_imag[N],f_real[N], theta, ctheta, stheta;
int l1, l2;
double tmp1, tmp2;
int main(void)
{
for(i=0;i<N;i++){
f_real[i]=cos(PI*(double)i/(double)N);
f_imag[i]=0.0;}
/* bit reversal */
l = 1;
for (k = 0; k <N-1; k++) {
if (k < l) {
tmp1 = f_real[l-1];
tmp2 = f_imag[l-1];
f_real[l-1] = f_real[k];
f_imag[l-1] = f_imag[k];
f_real[k] = tmp1;
f_imag[k] = tmp2;
}
j = N/2;
while(j < l) {
l = l - j;
j = j/2;
}
l = l + j;
}
/* actual FFT routine */
l2 = 1; /* butterfly span, doubles each pass */
i1 = M;
for (l = 0; l < i1; l++) {
theta = 0.0;
l1 = l2;
l2 = 2*l1; /* bit shift to the left = multiply by two */
for (k = 0; k < l1; k++) {
ctheta = cos(theta);
stheta = -sin(theta);
theta = theta + PI / (double)l1;
i3 = l2;
for (j = k; j < N; j += i3) {
i = j + l1;
tmp1 = f_real[i] * ctheta - f_imag[i] * stheta;
tmp2 = f_real[i] * stheta + f_imag[i] * ctheta;
f_real[i] = f_real[j] - tmp1;
f_real[j] = f_real[j] + tmp1;
f_imag[i] = f_imag[j] - tmp2;
f_imag[j] = f_imag[j] + tmp2;
}
}
}
/* the end */
for(i=0;i<N;i++)
printf("%d\t%f\n", i, f_real[i]);
return 0;
}
Next time: implementations in Julia and R, and canned libraries.