How to Extract the Fourier Transform with Python

Valerio Velardo - The Sound of AI

43K views

Published: 30 Jul 2024
Learn how to extract the Fourier Transform from an audio file with Python and Numpy. I also visualise and compare the magnitude spectra of the same note played on different musical instruments.
Code:
github.com/musikalkemist/Audi...
Join The Sound Of AI Slack community:
valeriovelardo.com/the-sound-...
Interested in hiring me as a consultant/freelancer?
valeriovelardo.com/
Follow Valerio on Facebook:
/ thesoundofai
Connect with Valerio on Linkedin:
/ valeriovelardo
Follow Valerio on Twitter:
/ musikalkemist
Science

Comments • 62

@asfandiyar5829 8 months ago ⁺²
Really cool stuff! Can't wait to finish the series.
@qingyuliu1176 1 year ago
Good video!!! I've learned a lot from this video and written a lot of notes and code these days. As an EE student, it helps me better understand many courses, such as Signals & Systems and DSP.
@MrDari88 3 years ago ⁺¹⁸
Amazing material amazingly explained. You are doing great, Valerio. Thanks for making audio processing fun and easy.
@ValerioVelardoTheSoundofAI 3 years ago ⁺¹
I'm happy you like the series!
@kalagaarun9638 3 years ago ⁺¹²
really a life saver... thanks for uploading this sir 👏🏻
keep posting great stuff like these 🙌🏻
@ValerioVelardoTheSoundofAI 3 years ago ⁺²
Thank you Kalaga!
@i_am-ki_m 2 years ago
So intuitive, potential but since basics and nicely video. Go ahead!
@wisteriahu4551 2 years ago ⁺²
I am doing my IB Math final IA on this and had no idea how to code, this really clear and extremely well explained video literally saved my life, thank you so much
@shivenlak 1 year ago
Do you have any tips, I am doing my IA on this right now. Just starting, confused, please help!
@dimitrijemitic497 3 years ago ⁺²
Hi Valerio, great video as always, you described the topic in an interesting and intuitive way... it would be great if you could explain Mel Spectrograms and MFCCs the same way you explained the Fourier Transform :D
@ValerioVelardoTheSoundofAI 3 years ago
Thank you! I'm planning on doing that after I've tackled the Short-Time Fourier Transform over the next couple of videos.
@likestomeasurestuff3554 2 years ago
Thank you, very helpful!
@hoomansaadatmand829 1 month ago
perfect video. great explanation.
@jimcrowjoe451 2 years ago ⁺¹
Thanks Valerio. The frequency is accurate but how do you get the correct amplitude when doing the FFT?
@omidbagheri9159 2 years ago
Thanks for your valuable training. I really appreciate it.
I have a question about environmental sound processing. I want to separate signals from the environment and recognize what we heard.
Can you show me a roadmap for finding a solution?
Thanks
@user-co6pu8zv3v 1 year ago
Thank you!
@chahinezhigoun1078 3 years ago ⁺¹
Thanks for the video
@ValerioVelardoTheSoundofAI 3 years ago
You're welcome!
@michaelswanson3162 3 years ago
Can't you normalize the output of the FFT so that the magnitude spectrum levels match the time signal? Something like this:
Xmag = 2 * np.abs(ft) / N # normalization by 2/N so that the magnitude spectrum shows the estimated amplitudes of the input signal where I think N = length of signal in samples.
I saw this in a different video, maybe you can elaborate? thanks! really helpful videos
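A minimal sketch of the 2/N scaling suggested above, using a synthetic sine wave rather than the audio from the video. The sample rate, tone frequency and amplitude are assumptions chosen so the tone falls exactly on an FFT bin; with real recordings, spectral leakage means the peak will usually read somewhat lower than the true amplitude, and the factor of 2 does not apply to the DC bin.

```python
import numpy as np

sr = 1000                       # assumed sample rate in Hz
N = 1000                        # number of samples (1 second of signal)
t = np.arange(N) / sr
amplitude, freq = 0.7, 50       # hypothetical test tone
signal = amplitude * np.sin(2 * np.pi * freq * t)

ft = np.fft.fft(signal)
x_mag = 2 * np.abs(ft) / N      # 2/N scaling gives a one-sided amplitude estimate

# Only look at the first half of the spectrum (below the Nyquist frequency).
peak_bin = int(np.argmax(x_mag[: N // 2]))
peak_freq = peak_bin * sr / N   # convert the bin index to Hz
peak_amplitude = x_mag[peak_bin]
```

Here `peak_freq` comes out at 50 Hz and `peak_amplitude` at roughly 0.7, matching the time-domain amplitude of the test tone.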
@AminKiany 3 years ago
Hallo Valerio,
Thanks for the useful material. I was trying to follow your code step by step. At 7':25'' you were explaining the dimensions of the Short-Time Fourier Transform. The first dimension is the number of frequency bins = (frame size)/2 + 1, which is in line with what librosa returns. However, for the second dimension I got (174943 - 2048)/512 + 1 = 339, which is not equal to the 342 in the output of the librosa STFT (here 174943 is the number of samples, 2048 is the frame size, and 512 is the hop size). Do you have any comment on this discrepancy? Or am I missing something here?
Thank you in advance
@prabhuramnagarajan1893 1 year ago
174943 / 512 + 1, rounded down, equals 342.
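The gap between the two counts is most likely explained by librosa's default `center=True` setting in `librosa.stft`, which pads the signal by half a frame on each side before framing. The sketch below reproduces both counts with plain arithmetic. The helper `num_frames` is hypothetical and written only for this illustration; it is not a librosa function.

```python
def num_frames(n_samples, frame_size, hop_size, center):
    """Count STFT frames. With center=True, mimic padding of
    frame_size // 2 samples on both sides of the signal."""
    if center:
        n_samples += 2 * (frame_size // 2)
    return 1 + (n_samples - frame_size) // hop_size


n_samples, frame_size, hop_size = 174943, 2048, 512

frames_unpadded = num_frames(n_samples, frame_size, hop_size, center=False)
frames_centered = num_frames(n_samples, frame_size, hop_size, center=True)
```

Without padding the count is 338 (the 339 above comes from rounding up instead of down); with centering it is 342, the number of columns reported for the librosa output.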
@godfather_1994 3 years ago
How can I compare two spectra, one from a word that I've recorded and one from sound extracted from a video, to see how many times the word I said appears in the video? I think coherence is the solution.
@Underscore_1234 2 months ago
Definitely useful, I was familiar with the theory but didn't know the libraries. I'm gonna check out your video on spectrograms! (You've got a new subscriber here!)
Although, in theory (if I remember well), the Fourier transform phase is related to the phase shift of each sinusoid. Do you know any application where it is used, or is it always tossed away? (For instance, is the magnitude distribution enough to recognize an instrument, or does the phase help?)
@ValerioVelardoTheSoundofAI 2 months ago
Magnitude is enough for analysis; phase is necessary for audio generation.
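A small sketch of why phase matters when going back to audio, using random noise as a stand-in signal (an assumption; any real-valued signal would do). The inverse FFT recovers the original samples only when the phase is kept; with the magnitude alone, the result is a different waveform.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.standard_normal(256)       # stand-in for an audio excerpt

ft = np.fft.fft(signal)
magnitude = np.abs(ft)
phase = np.angle(ft)

# Rebuild the complex spectrum from magnitude and phase, then invert it.
with_phase = np.fft.ifft(magnitude * np.exp(1j * phase)).real
# Discard the phase (treat every component as having zero phase).
without_phase = np.fft.ifft(magnitude).real

error_with_phase = np.max(np.abs(signal - with_phase))
error_without_phase = np.max(np.abs(signal - without_phase))
```

`error_with_phase` is at the level of floating-point rounding, while `error_without_phase` is large, so the magnitude spectrum on its own does not determine the waveform.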
@letsplaionline 3 years ago
Hi Valerio,
Personally I find your work very helpful, so I wanna thank you very much for all that you've been doing.
I have a question about librosa.load. I noticed you didn't explicitly set the sampling rate to 44100, which is the sr of the audio files used in the notebook, while the default value for sr in the function is 22050. Did you do this on purpose because it makes no difference, or did you just not pay attention to it?
@ValerioVelardoTheSoundofAI 3 years ago
Good catch Imece :) 22050Hz is a reasonable default to use in audio/music processing, so I didn't bother specifying a custom sampling rate. You should definitely treat this value as a hyper-parameter, when you're optimising ML algorithms.
@nezardasan5015 3 years ago ⁺¹
Hallo Valerio, thank you very much for your useful work, it's really great. My question is: why do you use np.fft.fft for the Fourier transform in numpy, and not np.fft just once?
@zero4433 3 years ago
He explains this at 8:26
@alpcnar5877 2 years ago
How do you determine the frequency axis, np.linspace(0, sr, len(magnitude_spectrum))? How can you know this is right? np.linspace(1000, 4000, len(magnitude_spectrum)) gives the same. Why do we start at 0 and end at sr?
@canernm 3 years ago
Thank you for the videos. Something confuses me though, and I'd really appreciate some advice: librosa.load() returns an array, which I suppose is the audio signal. However, is this a digital signal? Has it already undergone sampling and quantization? Thanks!
@ValerioVelardoTheSoundofAI 3 years ago ⁺²
Yes, with librosa.load() you get back the waveform and the sample rate.
The starting audio file (wav or mp3) is already sampled and quantised. In other words the file has already undergone the process of sampling / quantisation.
When using librosa.load() it is possible to re-sample the signal. If you don't pass any sample rate, the signal will be converted to 22050Hz.
@canernm 3 years ago
@@ValerioVelardoTheSoundofAI thanks a lot. I realize now that what I said was a bit nonsensical. Since we work with a computer, the signal would of course be digital! Thanks a lot
@ajayshriram9186 2 years ago
Does anyone know, at 7:30, what the number 4 signifies (in 'violin_c4')?
The same goes for the number 5 in 'piano_c5'.
I am really sorry to ask, I am a complete beginner at coding.
@saurabhdeshmukh2182 8 months ago
Valerio, can you please explain:
why is len(magnitude_spectrum_violin) equal to 59,772?
I thought it should be equal to the sample rate, because there should be one complex number for every frequency.
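A short sketch of where that number comes from: `np.fft.fft` returns one complex value per input sample, so the spectrum length equals the number of samples in the signal, not the sample rate. The all-zeros array is a placeholder with the length quoted above, and the 22050 Hz sample rate is an assumption based on the librosa default mentioned elsewhere in this thread.

```python
import numpy as np

sr = 22050                       # assumed sample rate (librosa default)
n_samples = 59772                # signal length quoted in the comment
signal = np.zeros(n_samples)     # placeholder for the loaded audio

ft = np.fft.fft(signal)
n_bins = len(ft)                 # one complex number per input sample

bin_spacing_hz = sr / n_bins     # distance between neighbouring bins in Hz
duration_s = n_samples / sr      # length of the audio in seconds
```

Under these assumptions the audio lasts about 2.7 seconds and the bins are spaced about 0.37 Hz apart. The spectrum would have exactly `sr` bins only for a signal exactly one second long.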
@hoomansaadatmand829 1 month ago
I am reading about the FFT to figure out whether it works for my thesis. I have about 2 hours of temperature history from metal 3D printers (DED). I was hoping to use the FFT to predict the final residual stress. Do you think the FFT could be relevant to my topic? Thanks
@alpcnar5877 2 years ago
Hey, N is not a power of 2? Can you help?
@fardalakter4395 10 months ago
Sir, or anyone, I have a question. At around 9:00, 59772 is not a power of 2, for which the FFT would be more efficient. Is that OK? Why don't we use the DFT?
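On the power-of-2 question: the FFT is an algorithm for computing the DFT, so the two give the same numbers. `np.fft.fft` accepts any input length; a power of 2 tends to be fastest but is not required. The sketch below checks this on a length-300 signal against a direct evaluation of the DFT sum. The `naive_dft` helper is written only for this illustration and is far too slow for real audio.

```python
import numpy as np


def naive_dft(x):
    """Direct O(N^2) evaluation of the DFT definition."""
    n = np.arange(len(x))
    k = n.reshape(-1, 1)
    return np.exp(-2j * np.pi * k * n / len(x)) @ x


rng = np.random.default_rng(0)
x = rng.standard_normal(300)          # 300 is not a power of 2

# Largest absolute difference between the FFT and the direct DFT.
max_difference = np.max(np.abs(np.fft.fft(x) - naive_dft(x)))
```

`max_difference` is at the level of floating-point rounding error, so the non-power-of-2 length affects speed, not correctness.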
@frederiksidenius 1 year ago
Thanks a lot for the great videos!
They are well-made and very informative, however, I can’t help noticing that your frequency axis is wrong. It’s a minor inaccuracy but the way you define frequency with ‘np.linspace(0, sr, len(magnitude_spectrum))’ is wrong. The frequency resolution should be ‘sr/len(magnitude_spectrum)’ which you can achieve for example either with ‘np.linspace(0, sr, len(magnitude_spectrum) + 1)[:-1]’ or ‘np.linspace(0, sr - sr/len(magnitude_spectrum), len(magnitude_spectrum))’. In short, the DFT returns N frequency bins from 0 to N-1 and therefore no bin is equal to the sample rate.
As I said it is a minor error, however, this would be the accurate way to do it.
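A sketch comparing the frequency axes discussed above, on a tiny spectrum of 8 bins so the values are easy to read. The sample rate is an assumption. `np.linspace(..., endpoint=False)` is a compact way to get the `sr / N` spacing, and `np.fft.rfftfreq` returns the same values for the bins up to the Nyquist frequency.

```python
import numpy as np

sr = 22050      # assumed sample rate in Hz
n = 8           # tiny spectrum length, for readability

# Axis used in the video: the last bin lands on sr, spacing is sr / (n - 1).
freq_endpoint = np.linspace(0, sr, n)

# Corrected axis: spacing is sr / n and no bin is equal to sr.
freq_correct = np.linspace(0, sr, n, endpoint=False)

# NumPy's helper for the non-negative half of the spectrum.
freq_rfft = np.fft.rfftfreq(n, d=1 / sr)
```

With many thousands of bins the difference between `sr / (n - 1)` and `sr / n` is tiny, which is why the plots in the video still look right.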
@ektabajaj1683 3 years ago ⁺¹
Valerio, can you please explain how to extract fundamental frequency taking the help of fourier transform using python i.e. I mean how to check the peak values....
@ValerioVelardoTheSoundofAI 3 years ago ⁺¹
I'll definitely cover pitch detection in the future. Stay tuned :)
@ektabajaj1683 3 years ago
@@ValerioVelardoTheSoundofAI thank you.
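Until pitch detection is covered, here is a rough sketch of the peak-picking idea from the question, on a synthetic tone with two weaker harmonics. This is not the method from the series: all signal parameters are assumptions, and on real instruments the strongest peak can be a harmonic rather than the fundamental, so treat this as a first approximation only.

```python
import numpy as np

sr = 8000                               # assumed sample rate in Hz
t = np.arange(sr) / sr                  # one second of samples
f0 = 220                                # hypothetical fundamental in Hz

# Fundamental plus two weaker harmonics.
signal = (
    1.00 * np.sin(2 * np.pi * f0 * t)
    + 0.50 * np.sin(2 * np.pi * 2 * f0 * t)
    + 0.25 * np.sin(2 * np.pi * 3 * f0 * t)
)

magnitude = np.abs(np.fft.rfft(signal))             # non-negative frequencies only
frequencies = np.fft.rfftfreq(len(signal), d=1 / sr)

# The frequency of the largest peak is taken as the fundamental.
estimated_f0 = frequencies[np.argmax(magnitude)]
```

For this synthetic signal `estimated_f0` is 220 Hz, because the fundamental is the strongest component and sits exactly on a bin.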
@user-ul1oc9gl9d 3 months ago
Hi, please explain the meaning of the notation next to the formulae in the notes. That would make them easier to understand.
@hardypatel4665 3 years ago ⁺³
Sir, why does the magnitude spectrum mirror after the Nyquist frequency?
@ValerioVelardoTheSoundofAI 3 years ago ⁺¹
This is a bit of a tricky topic that warrants some time to explain. I suggest you check out this resource, which provides a thorough explanation: www-elsa.physik.uni-bonn.de/~dieckman/DFT/DFT.html In a nutshell, the complex numbers on the right side of the spectrum are complex conjugates of those on the left side. This produces the typical mirror symmetry we see in the examples.
@hardypatel4665 3 years ago
@@ValerioVelardoTheSoundofAI Thank you Sir :)
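The conjugate symmetry described in the reply can be checked numerically. The sketch below uses a short random real-valued signal (an assumption; the property holds for any real input) and verifies that bin N - k is the complex conjugate of bin k, which is why the magnitudes mirror around the Nyquist frequency.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)            # any real-valued signal
ft = np.fft.fft(x)

# ft[1:] holds bins 1..N-1; reversing it pairs bin k with bin N-k.
is_conjugate_symmetric = np.allclose(ft[1:][::-1], np.conj(ft[1:]))

# Conjugates have equal magnitude, so the magnitude spectrum is mirrored.
magnitude = np.abs(ft)
is_magnitude_mirrored = np.allclose(magnitude[1:], magnitude[1:][::-1])
```

Both checks evaluate to `True`. For real signals the upper half of the spectrum therefore carries no extra information, which is why `np.fft.rfft` returns only the first half.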
@piasroy3629 3 years ago
What are frequency bins? Is a bin a range, or a single frequency, with the bins equally distributed across the (0, sample_rate) range?
@ValerioVelardoTheSoundofAI 3 years ago
It's a range of frequencies.
@piasroy3629 3 years ago
@@ValerioVelardoTheSoundofAI so the size of each frequency bin is len(magnitude_spectrum)?
@tuffCOOKIEanimations 2 years ago ⁺¹
Can you show how to do this on signal express?
@rhwood1154 2 years ago ⁺¹
OMG yes that would be amazing!
@Bigman74066 3 years ago
I would have liked it if you'd manually calculated the first few discrete Fourier samples. Just to get a feel for the algorithm...
@ValerioVelardoTheSoundofAI 3 years ago
I think I do something similar in previous videos.
@Bigman74066 3 years ago
@@ValerioVelardoTheSoundofAI Can't remember you did... Anyway, just saying... Excellent work
@ValerioVelardoTheSoundofAI 3 years ago
@@Bigman74066 I thought I did -- but honestly I can't remember :)
@1waveOrg 3 months ago
DAPZZ DUUD
Freakin’ cleannn
#Swypeddddd🤓
Heheh ;^P
@Tamara26613 3 years ago
! 💖
@ShortenDavid 2 years ago
Lol @ 0:48
@samirelzein1978 3 years ago
Man, just spare us all the typing and copy-pasting and simply comment on the existing code! You wanna make it as short as possible, not add boring stretches... I am sure the content is good, but I am leaving less than 5 mins in!
@ValerioVelardoTheSoundofAI 3 years ago ⁺³
Thank you for the feedback.
Pedagogy-wise, I find there's more value for the learner in typing line-by-line than commenting on already-written code.
@SaucyLimit 2 years ago
I disagree with this. Hearing what we're doing explained as each line is typed is super helpful. The pace was good.
@yunfan7034 1 year ago
Hi
Does anyone know, when using numpy.fft.fft to get the Fourier frequencies,
how numpy.fft.fft knows the sampling rate in order to calculate the frequency (Hertz) range?
For example, if I have 10 data points:
1) for 10-second-long audio, the sampling rate is 1 sample/sec, and the frequency (Hertz) range would be 0, 1/10, 2/10 ... 10/10 Hertz
2) for 5-second-long audio, the sampling rate is 2 samples/sec, and the frequency (Hertz) range would be 0, 1/5, 2/5, 3/5 ... 10/5 Hertz
How does numpy.fft.fft know the sample rate to calculate the frequency (Hertz) range?
I checked the documentation and did not find any parameter for a default sampling rate.
Thank you
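A sketch for the question above: `np.fft.fft` never sees the sample rate. It only returns values indexed by bin number, and the mapping from bin index to Hertz is applied afterwards, for example with `np.fft.fftfreq`, whose `d` argument is the sample spacing in seconds. The 10-point array below is placeholder data matching the two scenarios in the comment.

```python
import numpy as np

n = 10
x = np.arange(n, dtype=float)          # placeholder data, 10 points

spectrum = np.fft.fft(x)               # no sample rate involved here

# Scenario 1: 10 points over 10 s, so samples are 1.0 s apart (1 Hz).
freqs_1hz = np.fft.fftfreq(n, d=1.0)
# Scenario 2: 10 points over 5 s, so samples are 0.5 s apart (2 Hz).
freqs_2hz = np.fft.fftfreq(n, d=0.5)
```

The same `spectrum` is read against a different axis in each scenario: bin 1 is 0.1 Hz in the first case and 0.2 Hz in the second. `fftfreq` reports the upper half of the bins as negative frequencies, so no bin reaches the sample rate itself.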
