Every Intuition Behind Fourier Transform
Here, we will derive the Fourier Transform formula ourselves, from scratch, which is:
$A(\omega) = \int_{-\infty}^{+\infty} f(x)e^{-i\omega x}dx$Donβt be scared; it will be much easier than you imagine. After that, we can dive into the fast version of it. They are pretty similar.
We know that we have to evaluate these waves. If the evaluation results are correlated with each other, we can say these two waves are correlated (one of them is a component of another).
Previously, we were picking peak points for this evaluation. But we might as well have picked any periodical point. Because, recall, our aim was to see how much candidate
and complex
wave are related.
In fact, it would be even better if we are to evaluate the complex
wave at every point of the candidate
wave. This would have been too much of a hassle; thatβs why we went with periodical points. Now, we can be more hardcore.
There are infinite points in a wave, right? How to add infinite points?
A good mathematical approach for that would be taking the integral of the candidate wave and comparing the result with the complex wave. Taking the integral of a function means finding their area with respect to the x-axis. See the image below:
We have two waves here:
Orange
one is the complex wave (the square one). Yes, waves can be square too.Cyan
one is a basic sin wave.Letβs inspect the left part of the image:
Orange
wave is positive on average on the left part.
Yes, it is not only on average, but always positive. However, we only care for the average (or sum) here.
And if we inspect Cyan
wave, we also see it is positive on average.
Now, the right part:
Both waves are negative on average. Wonderful!
That means, the sin wave (Cyan
) is a component of the complex wave (Orange
). See how easy it was to interpret using integrals!
Instead of separating the waves into regions and inspecting the negativity/positivity region by region, we can have another math trick. What is this trick? It is multiplication!
Remember, what we want is: the signs of the waves to be the same (if one wave is negative in one region, the other one should be negative as well, same for the positive). Recall the property of multiplication below:
$(+)(+) = (+) \newline (-)(-) = (+)$So, if we multiply the two regionsβ integrals, and if the result is positive, that means the regionsβ positivity/negativity is matching! Exactly what we wanted :)
You might have noticed, division
would also work fine. We only care for detecting the same signs. I will go with multiplication for now because the illustrations will be simpler. But keep in mind that, division is also a valid candidate.
To demonstrate what multiplying two waves look like, here is $\sin(x)$:
And here is an example of multiplying $\sin(x)$ with itself. Images are generated from Wolfram Alpha.
Here is an important outcome, since $\sin(x)$ is correlated to itself (duh), we donβt have any negative values for $\sin^2(x)$. We multiplied those negative values with themselves and got positive values!
A quick summary of what we will do:
How easy was that! Wow! Oh MY GOD!
You can watch a great video about this (the above images above are also copied from that video)
Recap: if we want to figure out if $\sin(3x)$ is a component of a wave of a complex function $f(x)$, then we can write:
$\int f(x)\sin(3x)dx$And, if we want to generalize our function, we can introduce a Greek letter to scare people off. Letβs use omega: $\omega$
$A(\omega) = \int f(x)\sin(\omega x)dx$Which translates to: if $A(\omega)$ is positive, it means $\sin(\omega x)$ is a part of our complex wave $f(x)$.
We do not know whatβs inside $f(x)$. It is a black-box to us. We only know that the data points generated by $f(x)$. If we had known the internals of $f(x)$, we could have figured out the components of $f(x)$ without having to deal with all these. The goal here is to decrypt the complex wave denoted $f(x)$, which is a black-box.
In other words, $f(x)$ can be: $3\sin(5x) + \sin(2x)$, or $57\sin(7x) + 41\sin(3x)$, or any other complex wave. We donβt know. Thatβs what are going to discover using Fourier Transform.
sin(x)
and sin(2x)
(we donβt know it yet)we should get a positive value for analyzing frequency 1
:
we should get a positive value for analyzing frequency 2
:
we will inspect other frequencies (3
, 4
, and so onβ¦), but the result will be always 0, since our complex wave does not include other frequencies (for example, sin(3x)
is not part of our complex wave).
Note that, the value for A(1)
should be 3 times the value we get for A(2)
, because the amplitude of sin(x)
is 3
in our complex wave, whereas the amplitude of sin(2x)
is 1
.
Letβs plug them in (but we canβt compute for infinity with calculators, so I will be merciful to calculators and compute only for 2Ο
.
Oh myβ¦ 9.4248
is exactly 3 times of 3.1416
!
And, lastly, an example for a wave that is not a component of the complex wave:
$A(3) = \int_0^{2\pi} (3\sin(x) + \sin(2x))\sin(3x) dx = 0$$A(\omega)$ is just giving us the coefficients (amplitudes) of the related sine
wave. Very usefulβ¦
Note before the next section: to increase readability, I might omit the dx
from the integrals
OUTCOME:
complex
wave and the candidate
wave, we can multiply the complex
wave with the candidate
wave.And the magnitude of the result means, how much the frequencies match:
$A(\omega) = \int f(x)sin(\omega x)dx$sinx
is not enoughThe problem with the below equation is:
$A(\omega) = \int f(x)\sin(\omega x)$It cannot give us the phase of the sine wave. We can detect which sine wave is in the complex function f(x)
, and we can even detect the amplitude. But we cannot detect the sine wave if the phase is shifted. Here are some shifted sine waves:
So, in short, if the complex wave f(x)
includes something like sin(3x + 50)
, our formula wonβt be able to extract that from the complex wave. Our function is only able to detect sin(Οx)
as you can recall, not sin(Οx + a)
.
To be able to inspect all kinds of waves, pure sine is not enough on its own.
But, good news: sine and cosine are orthogonal to each other. Which means in basic English: with a combination of cosine and sine, we are able to form any periodic wave. It is possible to denote every periodic wave/function with:
$f(x) = \int_{\omega=-\infty}^{\infty} A(\omega) \cos(\omega x) + B(\omega) \sin (\omega x)$Where A(Ο)
and B(Ο)
are the coefficients of the linear combination for the Ο
th basis vectors (the cos and sin).
The above equation translates into: if you add various cosine and sine waves together, with different coefficients, you can get any periodic function you like.
Iβll not dig any more deeply into the mathematics of how sine
and cosine
are orthogonal to each other, and why using both of them grants us the ability to form any periodic function. Answering these questions will be diverging from Fourier Transform at this point.
You probably see where Iβm arriving at: using both sine
and cosine
in our equation will allow us to extract the phase information. How? We will see below.
Since sine
and cosine
are both basis vectors, we could as well have used cosine
till now instead of sine
, and every intuition we had so far would be the same. But as sine
is not enough on its own, cosine
is not enough either.
Recall the equation:
$f(x) = \int_{\omega=-\infty}^{\infty} A(\omega) \cos(\omega x) + B(\omega) \sin (\omega x)$which can also be written as:
$f(x) = \int_{\omega=-\infty}^{\infty} A(\omega) \cos(\omega x) + \int_{\omega=-\infty}^{\infty} B(\omega) \sin (\omega x)$Now, it should be clear: if we compute two different versions of our formula with sine
and cosine
, we should be able to extract every piece of information from the complex function f(x)
.
Because, f(x)
is nothing but a combination of various cosx
and sinx
in the end.
OUTCOME:
sine
and cosine
waves in the end.We need to inspect our complex
wave for both sine
and cosine
waves to be able to fully inspect it:
e^i
In the previous step, we ended up having 2 formulas (one for sine
, one for cosine
). Thanks to math, there is a more compact way.
There is this thing called Euler's formula
:
Again, the intuition behind why Euler's Formula
holds is a different topic than Fourierβs Transform. And it is not a blocker for understanding Fourier Transform. So I wonβt be explaining that. But in case you are like me, here are two great resources that fulfilled my curiosity:
[What is Eulerβs formula actually saying? | Ep. 4 Lockdown live math](https://www.youtube.com/watch?v=ZxYOEwM6Wbk) |
i
here in the formula?!Remember, i
is for imaginary, and when you multiply a number with i
, you basically transfer that number to the imaginary realm (a new dimension).
In short, i
is granting us a new dimension, turning our function from being 1 dimensional to 2 dimensional. This way, our cosx
and sinx
do not collide with each other, and we can inspect both of them in a single formula! Actually, the intuition behind i
was that simple. No need to be scared at all.
There are other great perspectives about i
; for example, watch from the timestamp (warning: you might get confused): But what is the Fourier Transform? A visual introduction.
The most upvoted answer to a question I asked on reddit, provided another great take on cosx + isinx
. So I will directly share it here (shoutout to zarek911
for his amazing answer):
Reddit answer begins
By using cosΟx
and sinΟx
alone, we have 2 different βtypesβ of basis vectors in the same basis and for that reason we need to have 2 different component functions A(Ο)
and B(Ο)
. To simplify this we can change the basis into the set of cosΟx + isinΟx
and cosΟx - isinΟx
, still for all Ο
from 0 to infinity. Conveniently these can be expressed using Eulerβs formula, the first being:
and the second being:
$e^{-iwx}$And this is no information lost from our cos and sin basis since you can always recover them with:
$\cos(wx) = \frac{(e^{iwx} + e^{-iwx})}{2}$and
$\sin(wx) = \frac{(e^{iwx} - e^{-iwx})}{2i}$So at first, it seems like you still have 2 types of basis vectors and you still need 2 different component functions, but you can see that the 2nd type of basis vector:
$e^{-iwx}$is just the first:
$e^{iwx}$with a negative value of Ο
. So you can actually combine these 2 different types into one type by letting Ο
run from $-\infty$ to $\infty$. So now the linear combination would have a single component function A(Ο)
and would look like:
Reddit answer ends
OUTCOME:
We need to inspect both sine
and cosine
, and hence we needed two formulas:
By introducing another dimension with i
, we can now use sine
and cosine
in the same formula:
Which is equal to:
$f(x) = \int_{-\infty}^{\infty} A(\omega) e^{i\omega x}$-
in the original formula instead of +
?In the original formula:
$A(\omega) = \int_{-\infty}^{+\infty} f(x)e^{-i\omega x}dx$there is a -
sign in the exponent as you can see. And if we remember Euler's Formula
:
then, we know that our original formula for Fourier Transform actually corresponds to:
$A(\omega) = \int_{-\infty}^{+\infty} f(x)(\cos(\omega x) -i\sin(\omega x))dx$But waitβ¦ Previously, we used cosx + isinx
to represent both cosx
and sinx
in the same formula. So, how did this +
turn into a -
?
Here is where everything will come together and make even more sense. Letβs go back a bit and have a quick recap. It will be helpful.
Constructing any function from sine
and cosine
:
We can also approach the same thing from the perspective that zarek911
provided (merging cosx
and sinx
using an additional dimension with i
, so that they wonβt overlap):
If we try to explain the above formula in basic English, it would go like this (very roughly):
Multiply some frequencies with cosx
and sinx
, and sum intermediate results to form a complex wave.
The above formula is also equal to:
$f(x) = \int_{-\infty}^{\infty} A(\omega) e^{i\omega x}$Did you notice something? We are doing the opposite of Fourier Transform. We are composing a complex wave from basic waves. Fourier Transform decomposes a complex wave into basic waves, and inverse Fourier Transform composes waves into a complex one. We just figured out the inverse Fourier Transform!
Although the name implies that Fourier Transform is more intuitive than inverse Fourier Transform
, I believe it is not the case. Making a cocktail is simpler to understand than figuring out the ingredients of a cocktail.
Alright! We just need to inverse the inverse Fourier Transform
thenβ¦ And we should have a normal Fourier Transform
. Letβs go! And it will be very easy actually.
Here is a great analogy Iβm stealing from here:
Say, you have 5$ banknotes, and someone asks you to put 3 of them together. Now you have, 3x5=15$ as a pile.
In this analogy:
cosΟt + isinΟt
(the base components)A(Ο)
(the amplitudes / amount)f(x)
(the complex wave)What we have done is inverse Fourier Transform
. To compose waves, we are multiplying waves and coefficients (amplitudes) together.
Referring to the primitive version of our formula again:
$f(x) = \int_{-\infty}^{\infty} A(\omega) \cos\omega x + B(\omega) \sin \omega x$We multiplied A(Ο)
(amplitude) with cosΟx
(wave), and we summed up these multiplications (the same goes for sine
).
To decompose waves, we will divide the complex wave by basic waves and get the amplitudes (coefficients).
Say, there is a pile of 15$ dollars, and you need to figure out how many 5$ banknotes are there potentially. What would you do? You divide 15
by 5
, and get 3
as a result.
Recall, when we were figuring out the first versions of our formula:
$A(\omega) = \int f(x)\sin(\omega x)dx$we multiplied f(x)
by sin(x)
, because we needed the signs to be the same. And I mentioned back then, division
would have worked just fine. Now, if you think of Fourier Transform (we are decomposing waves), and connect the pieces with the analogy I provided above, division
makes a lot more sense than multiplication
. So:
Yeah. This formula would also do.
So, why did I provide you the multiplication version in the beginning? Why did I bother explaining multiplication is also working?
Because, the important part was to detect if the signs were matching amongst two different waves. It was NOT that βdivision works because it makes sense to divide the complex function by the basic wavesβ. If that was the case, multiplication wouldnβt work.
Imagine this:
so, it would be like this:
again, take a look at our latest formula for inverse Fourier Transform
:
we are multiplying the amplitudes A(Ο)
, with basic waves e^iΟt
. Now, we got to divide the complex wave f(x)
with our basic waves e^iΟt
.
this is equal to:
$A(\omega) = \int_{-\infty}^{\infty} f(x)e^{-i\omega x}dx$and of course, this is equal to:
$A(\omega) = \int_{-\infty}^{\infty} f(x)(\cos(\omega x) - i\sin(\omega x))dx$There you goβ¦ Here is your -
instead of +
:)
Weβve done it! That was it for the Fourier Transform.
OUTCOME:
To be able to know how much of a candidate
wave there is inside complex
wave, we have to divide complex
by candidate
:
Which is equal to:
$A(\omega) = \int_{-\infty}^{\infty} f(x)e^{-i\omega x}dx$image taken from: NTI Audio
Remember from this image, Fourier transform is going from the time-domain to the frequency-domain. Letβs change the letters in our notation accordingly, so that it will make more sense.
This:
$A(\omega) = \int_{-\infty}^{\infty} f(x)e^{-i\omega x}dx$is now this:
$\hat{g}(f) = \int_{-\infty}^{\infty} g(t)e^{-ift}dt$The changes:
x
β t
(since we are representing time with this letter)Ο
β f
(since we are representing frequency with this letter)f
β g
(since now, we have another f
(frequency), so letβs change this)A
β Δ
(to make it related to the new g
)The formula is still the same, the intuitions are the same. We just changed letters :)
We could also add 2Ο
to our formula. Why? Because of three reasons:
As humans, we like the most basic things to be equal to 1
. So, the most basic period should be 1
. How do we achieve 1
with cosine
and sine
? With 2Ο
of course! Letβs see the differences between sinx
and sinΟx
below from Wolfram Alpha:
Basically, we have three good reasons to do this, and not a good one against it as far as I know. So, letβs do it!
Ladies and gentlemen, I present you the final form of our Fourier Transform!
$\hat{g}(f) = \int_{-\infty}^{\infty} g(t)e^{-i2\pi ft}dt$Weβve done it! That was it for the Fourier Transform. Fortunately (or unfortunately) we still have more to do. First, we need to come up with a discrete version of the Fourier Transform.
But for simplicity, I will use the version without 2Ο
. We will see in the next part, there are even better arguments to include 2Ο
when we are dealing with both DFT and FFT.
OUTCOME:
Introducing 2Ο
to our formula is another small addition (yet not that important):
To have more meaningful letters and to follow the convention, we can also denote our formula as:
$\hat{g}(f) = \int_{-\infty}^{\infty} g(t)e^{-i2\pi ft}dt$In the real world, electronic sensors do not capture values continuously. What I mean is, they have precision. For example, a sensorβs precision could be milliseconds, so it is recording data for each millisecond. This means, our data at hand is discrete, not continuous. We cannot zoom in indefinitely; there is a precision limit.
Discrete Fourier Transform is the same as Fourier Transform but allows us to compute the same stuff on discrete data instead of continuous data (for example, using summation of data points instead of taking the integral of a continuous function).
Itβs much more realistic. We want to analyze a complex wave function. We only have some data points that these complex wave generated. We will input these data points to DFT, and it will output the ingredients of the complex function.
So, how do we turn our continuous function into a discrete one?
First, we turn our integral
into a sum
. The integral is the continuous form of the $\sum$ symbol. One adds all points, the other one adds discrete points.
will turn into:
$\hat{g}(f) = \sum_{t = -\infty}^{\infty} g(t)e^{-ift}$But, the ranges do not make sense now. We donβt have infinities in discrete form (or in real life). We will have a finite point of data to evaluate. So, our range will be, from the first point, till the last point. Let n
denote a point, and let N
be the total amount of data points we have.
Previously, we were interpreting f
(frequency) and t
(time) continuously. Well, guess what, we canβt now.
image taken from: geeksforgeeks
This image represents the output of DFT. Just like FFT, we have 2 outputs:
In the frequency-magnitude plot, we have 9 discrete points (0,1,2,3,4,5,6,7,8) for frequency in this example. The letter k
will denote these points. Be careful not to confuse k
and n
.
k
is the frequency we are trying to find out (for example, say there is sin2x
as a component. 2
here is the k
).n
is the data point we have available for the function we are analyzing.n
and k
are bounded by N
(the total amount of data points).n
is bounded by N
because n
is the data point, and we have N
data points in total (duh).k
is also bounded by N
because we cannot inspect a frequency that is bigger than the amount of our data points. There is a theorem about this that I will not go into details of (NyquistβShannon sampling theorem), but if you are familiar with polynomials, this should make sense. N+1 points required to specify an N degree polynomial.Simply, f
(continuous frequency) becomes k
(discrete frequency). Nothing much is happening here, just a change of letter. We could have kept the old letter, but letβs just follow the convention. When we use f
, we should understand a continuous frequency; whereas when we use k
, we should understand a frequency bin (as presented in the above image).
We cannot represent t
(all the points possible, or time) in a discrete setting. Instead of infinite points, we have finite points (which are denoted by n
).
We need to be compact, and select the minimum amount of points, and use these points in our formula.
Recall our basis vectors (sine
and cosine
). Their period is 2Ο
. And we have N
points in total. So, dividing 2Ο
into N
points would make the most sense for evaluating our sine
and cosine
functions (previously, we were evaluating them at all points).
Notice that we actually also changed g(t)
into g(n2Ο/N)
. But to be compact, we denoted it with the nth
element of the series g
. So, we are consistent :)
Well, there you go:
OUTCOME:
We had this:
$\hat{g}(f) = \int_{-\infty}^{\infty} g(t)e^{-ift}dt$We turned the integral into a sum:
$\hat{g}(f) = \int_{-\infty}^{\infty} g(t)e^{-ift}dt$We turned infinite bounds into finite bounds:
$\hat{g}(f) = \sum_{n=1}^{N} g(t)e^{-ift}$Changed the letter f
to k
to denote discrete frequency:
We canβt have continuous time. So, changed t
to 2Ο/N
(where 2Ο
is the frequency period, and N
is the total amount of data points, effectively becoming points in time):
Denoted g(n2Ο/N)
more compactly (due to convention) as nth
element of series g
:
When it comes to math, especially in the form of a video about math, people tend to take it as a fact.
We are taught not to believe everything we see on the internet. However, the scope of skepticism is often narrowed down to not trusting comments and social content. Scientific papers or videos, on the other hand, are often unquestionably accepted.
In this video, it is explained:
f
β k/N
n
β t
In the comments, people express their gratitude towards the individual who created the video.
However, Iβve spent many hours trying to understand the reasons for these replacements:
f
β k/N
n
β t
This explanation also contradicted some of the approaches presented in other resources (for example: Oxford Robotics - Signal Processing Lecture 7).
In short, this video caused me to lose many hours, especially as a newbie to the concept. Fortunately, I questioned this part along with every other thing he said.
I believe most of the audience in this video simply accepted the above replacement as a fact. If I were to ask them:
f
β k/N
?f
with k/N
in the exponent, but not in the X(f)
part?They probably wouldnβt have a good answer. They did not grasp the intuition; they merely memorized it. I, too, fell into this trap when I first saw this video.
So, even for me, it was easy to be lazy and succumb to this trap. I believe this is a great example of why you should question even the scientific information you encounter on the internet.
Now that Iβve fulfilled my social duty as well, letβs go back to the Fourier Transform!