Implementation and Mathematical logic behind the Fast Fourier Transform
Today, I will discuss about an important algorithm called Fast Fourier Transform that efficiently reduces the time complexity of convolution or multiplication of two polynomials.
Basically, a n-degree polynomial can be represented in many forms —
f(x) = a0 + a1*x + a2*x² + a3*x³ + … + an*x^n
- f = <a0,a1,a2,a3,a4,…,an>, this form is known as coefficient vector representation of a polynomial, where a0, a1, a2, a3,... are the coefficients corresponding to the x⁰,x¹, x², x³... respectively.
- Second form is the roots— f(x) = c1(x-x0)(x-x1)…(x-xn), where x0, x1, x2,…, xn are roots of the polynomial.
- Third is Samples or points representation — We need n+1 distinct points to uniquely represents a n-degree polynomial.
(Xk, Yk) for k = 0,1,2,3,…,n and we have Yk = f(Xk).
I will keep aside the root representation, and we will focus on others two.
So, now what we want to do is, multiply two polynomial A(x) and B(x) that were given in their coefficient vector representation and get the C(x) in the coefficient representation i.e.
Naive multiplication would be like, multiply all the coefficient of polynomial A(x) with coefficients of B(x) and get the corresponding coefficients for C(x). i.e.
Coefficient computation takes 1+2+3+… +n+n-1+n-2+n-3+…+1 = O(n²)time.
there are 2n-1 coefficients so time complexity would be O(n²). Can we do better?
Let’s consider some different points p1,p2,p3 …. pk and we will see that can we get our polynomial C(x) with these points. It might not be the coefficient representation but it can be in another form as well. We just need to get something out of it.
C(x) = A(x)*B(x) so does this implies given below?
C(p1) = A(p1)*B(p1), C(p2) = A(p2)*B(p2),…., C(pk) =A(pk)*B(pk). Yes it does.
We know that the degree of polynomial C(x) is (2*n-2) so how many points do we need to uniquely identify this?? Answer is 2*n-1 points.
Degree A(x) and B(x) is n-1 but we need to compute these polynomial at 2*n-1 points instead of n points.
Can we compute A(x) and B(x) at some 2*n-1 points and then multiply A(x) and B(x) to get C(x) at those points?? So that we get point representation of C(x)?? Yes we can do that, let’s see
How much time it takes to multiply those points?
There are 2*n-1 points, each multiplication takes O(1) time, so computation time for C(p1),C(p2)…from A(p1),A(p2)..and B(p1),B(p2)… would be O(2*n-1) = O(n). So O(n) is the time to compute point representation of C(x) from the point representation of A(x) and B(x).
We have seen that if we remain in our coefficient form then the time complexity for getting coefficient representation of C(x) from the Coefficient representation of A(x) and B(x) will be O(n²). So we will try another idea, We are given with their coefficient representation, we will somehow try to change the representation into it’s point form in some time complexity T1(n) and then we can multiply point representation of A(x) and B(x) to get point representation of C(x) in O(n) time complexity, after that we will get our coefficient form for C(x) from the point from C(x) in some time complexity T2(n).
Let maximum exponent power of x in A = a, maximum exponent power of x of B = b, then maximum exponent power of x in C will be c = a + b
then size of coefficient representation of C should be p = c+1, we will find a number n such that it is power of 2 and n≥p.
n = ceil(log2(p)), then add zero to A, and B to make their size equal to n to perform our algorithm on A and B.
Now we start our algorithm, see the below diagram.
Now, our main focus shifts to how to compute A(x) and B(x) at some different n points efficiently and in less time than O(n²).
Suppose {X} represent the set of points where we need to compute A and B and we have n-coefficients corresponding to A and B (after adding the zeros that I mentioned earlier).
Let T(n, X) represent the time complexity of conversion of A(x) with n coefficients from Coefficient form to point form at where every points belongs to set {X}.
let’s say Aeven = <a0,a2,a4,a6,..>, Aodd =<a1,a3,a5,a7…>
So A(x) = Aeven(x²) + x * Aodd(x²) . Right?
Our problem reduces to calculate Aeven and Aodd at {X²} points and then we can add them to get A(x) according to above expression.
T(n,X) = 2*T(n/2,X) + O(n + |X|)
n is for dividing A(x) into two half and|X| is for calculation of A(x) from already calculated Aeven and Aodd at {X²} . So total O(n+|X|) for merging, let’s draw recurrence tree.. suppose number of points is initially are n, then after dividing the array into two half, we will be calculating Aeven and Aodd at {X²}, the number of points will still be n, right?
for example for 4 points: X ={1,2,3,4} -> X ={1,4,9,16} ->X ={1,16,81,256}
So overall this would be O(nlogn + n²) = O(n²).. still the same, because the size of X wasn’t reducing at all. So is there any way for reducing the number of points on squaring? Can we choose n points such that if we square them, then the number of points reduces to half?
X = {p}² →{p²}
X = {-sqrt(p),sqrt(p)}² →{p}
It can be done if every time we choose a square root pair, then on squaring we will get a single number corresponding to two square roots.
{1}² = {1}, {-1,1}² = {1}, {i,-i,-1,1}² = {-1,1}
1st root ←2nd ←4th←8th…←nth root….of unity. So it only works for power of 2, we will add 0 to make size as power of 2, no problem with that right?
So can you guess such points? They are the complex roots of unity.
So now our complexity reduces to T1(n,n) = 2*T1(n/2,n/2) + O(n)
that is same as T1(n) = 2*T1(n/2) + O(n) == O(n*logn)
Coefficient → → → → → → Point (nlogn)
Well the expression written their was called fourier transform.
here y(k) = A(wn^k)
So the whole FFT process can be show like the matrix →
so Y = W*A
Winv = inverse of matrix W.
A = Winv * Y
So inverse can be given as -:
It is similar to Fast fourier transform where we just need to replace wn^k by wn^(-k).
- So we pad zeros to A and B to make their size enough.
- Then we convert them to their point representation (2*nlogn)
- We perform multiplication O(n)
- We perform Inverse fast fourier transform (nlogn)
So overall time complexity would be O(nlogn)
You can refer https://cp-algorithms.com/algebra/fft.html for more.