Thank you for this very helpful explanation. Is it possible to compare the formants of two voices, recorded in very different situations, and establish whether the two individuals are related to each other? I have read some academic articles that explain the heritability of the vocal apparatus and, therefore, of some aspects of the voice.
It is true that related people have voices that sound similar, and that no doubt has to do with similarly shaped vocal tracts, and therefore similar formants. However, I doubt that the shape of two people's vocal tracts alone could be enough to establish a relationship, let alone two people's characteristic formants, which can be manipulated by the speaker. An interesting thought though! Perhaps an AI will be able to identify relationships between people through their voices some day...
@@seanthomasmartin2184 thank you for this answer!
mindblown
Perfect explanation. Being an engineer, I really love learning to sing by understanding it at a technical level.
Great! Which sound did you use in your synthesizer?
Excellent Demo and Explanation !!
Thanks for the explanation with graphs and physics.
This was a super useful explanation! Very clearly demonstrated and in many different ways. This was a concept I could hear, but I did not understand how it worked or how to manipulate it.
That's so cool 🤓🤓🤓🤓🤓
Excellent! Thank you.
Dude made a banger tutorial and then left. Anyway, super well explained, thanks for this video!!
<3
ooooh so that's how synths like Delay Lama replicate human voice. Seems so obvious now
time to create vowels that will never exist, thanks for the video my brother
this is the best video
Perfect, thanks!
This explanation was eye opening. Thanks :)
Thank you
This video is a load of vowellshit! ;-)
Dude, whispering = passing noise through formant filters is mindblowing, and it is a perfect way to drive home the point that the formant is independent of the pitch of the sound. Thank you so much for this video!
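To make the "whispering is noise through formant filters" idea concrete, here is a minimal sketch in plain Python. It is not the method used in the video. The two-pole resonator design, the 16 kHz sample rate, and the formant centres and bandwidths for an "ah"-like vowel are all illustrative assumptions, and the names `resonator_coeffs`, `resonator_gain`, and `whisper` are hypothetical.

```python
import cmath
import math
import random

SAMPLE_RATE = 16000  # Hz; an assumed value for this sketch


def resonator_coeffs(center_hz, bandwidth_hz, sr=SAMPLE_RATE):
    """Feedback coefficients (a1, a2) of a two-pole resonator."""
    r = math.exp(-math.pi * bandwidth_hz / sr)   # pole radius sets the bandwidth
    theta = 2 * math.pi * center_hz / sr         # pole angle sets the centre
    return -2 * r * math.cos(theta), r * r


def resonator_gain(freq_hz, center_hz, bandwidth_hz, sr=SAMPLE_RATE):
    """Magnitude response of the resonator at freq_hz."""
    a1, a2 = resonator_coeffs(center_hz, bandwidth_hz, sr)
    z_inv = cmath.exp(-1j * 2 * math.pi * freq_hz / sr)
    return 1.0 / abs(1 + a1 * z_inv + a2 * z_inv * z_inv)


def whisper(formants, n_samples, sr=SAMPLE_RATE, seed=0):
    """White noise (no pitch) passed through parallel formant resonators."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    out = [0.0] * n_samples
    for center_hz, bandwidth_hz in formants:
        a1, a2 = resonator_coeffs(center_hz, bandwidth_hz, sr)
        y1 = y2 = 0.0
        for i, x in enumerate(noise):
            y = x - a1 * y1 - a2 * y2  # y[n] = x[n] - a1*y[n-1] - a2*y[n-2]
            out[i] += y
            y2, y1 = y1, y
    return out


# Assumed (centre Hz, bandwidth Hz) pairs for an "ah"-like vowel.
AH_FORMANTS = [(700, 100), (1200, 120)]
samples = whisper(AH_FORMANTS, SAMPLE_RATE)  # one second of unnormalised audio
```

The source here is noise, so there is no fundamental frequency at all, yet the filters still impose a vowel-like spectral shape. Swapping the noise for a pitched source would change the pitch without moving the resonances.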
I have played around a lot with sound and spectra using Audacity. Now I will start adding observations with Sonic Visualiser as well.
Amazing explanation man! Thank you for this
incredible video thank you
I was so frustrated trying to find a video that explained formants, and no video I found gave an actual definition or explanation of what a formant was. Even if others could define it, I need to see the actual science behind it. I need to understand how it applies to actual speech, not just in theory. In the literal first 10 seconds, you defined it, AND explained it, something NO ONE ELSE could do. Thanks from a speech therapy and audiology student!
Very happy to have been so helpful for you :)
Soo disappointed with only one video, you have got to do more
What would you like to see a video on?
did not expect sudden comedy gold towards the end. also thanks. now i know how to make a homunculus inside ableton's analog synth.
2:33 and 3:55 show that the regions around 100 and 200 Hz do not change even though different vowels are uttered. This implies that they should not be taken as "formants". These regions are due to the source rather than the vocal tract. My question of interest is, "How do we pick F1 from all these peaks?" To me it seems that we will first have to manually analyze spectrograms of several different vowels and note the peaks that do not change across all of them; those peaks are NOT formants. Second, we manually analyze spectrograms of the same vowel and check for the peaks that do not change across all of them (excluding those found during the first analysis). The unchanging peaks in the second analysis must be F1, F2 and so on (for that vowel alone).
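The first step of the procedure proposed in this comment can be sketched roughly as follows. This only illustrates the commenter's idea and is not an established formant-tracking method. The peak frequencies are made-up toy numbers rather than measurements from the video, and `shared_peaks`, `candidate_formants`, and the 30 Hz tolerance are hypothetical choices.

```python
def shared_peaks(peak_lists, tol_hz=30.0):
    """Peaks from the first list that appear (within tol_hz) in every other list."""
    first, rest = peak_lists[0], peak_lists[1:]
    return [
        p for p in first
        if all(any(abs(p - q) <= tol_hz for q in other) for other in rest)
    ]


def candidate_formants(peaks, non_formant, tol_hz=30.0):
    """Peaks left over after discarding those attributed to the source."""
    return [p for p in peaks if all(abs(p - s) > tol_hz for s in non_formant)]


# Toy spectral peaks in Hz for three different vowels (invented for illustration).
vowel_peaks = {
    "a": [100, 200, 750, 1200],
    "i": [105, 195, 300, 2300],
    "u": [98, 205, 350, 800],
}

# Peaks that stay put across different vowels are treated as source-related.
source_peaks = shared_peaks(list(vowel_peaks.values()))

# What remains for one vowel is a candidate set for F1, F2, and so on.
i_candidates = candidate_formants(vowel_peaks["i"], source_peaks)
```

With these toy numbers, the peaks near 100 and 200 Hz are flagged as shared, and the remaining peaks for "i" are kept as candidates. The second step in the comment (comparing several recordings of the same vowel) could reuse `shared_peaks` on the candidate lists.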
Thanks from Empalme, Sonora, México.
Very good explanation! I’ve been singing for quite a while but finally trying to understand this concept and this helped a lot. Thank you!
very well taught
Incredible video. Thank you! 🙏
So cool, you explained it so well and interestingly
Which two applications did you use in the video for synthesising sound? I like your voice!
Oh my god thank you so much! Finally I understood Formants
I thought that formants related to accent.
Interesting, I could see how they're related to accent. The placement of the formants gives each vowel its characteristic sound, and if one accent's vowels sound different from another's, you could analyze that using formants!
Formants are natural 'chords' (frequencies) that the voice makes. Like piano chords, *formant frequencies* have specific and consistent intervals between them that help identify them regardless of where they fall on the spectrogram.
I think you're confusing formants with overtones/harmonics. There is a set of intervals between overtones (the harmonic series) that stays the same no matter where they fall on the spectrogram, like you said. Formants are ranges of frequencies where overtones are amplified, not the overtones themselves.
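A small sketch may help separate the two ideas in this reply: harmonics are integer multiples of the pitch, while a formant is a fixed band that boosts whichever harmonics happen to land in it. The function names, the 700 Hz formant centre, and the 200 Hz bandwidth are illustrative assumptions, not values from the video.

```python
def harmonics(f0, count):
    """First `count` harmonics of a fundamental f0 (Hz): f0, 2*f0, 3*f0, ..."""
    return [f0 * k for k in range(1, count + 1)]


def boosted(f0, formant_hz, bandwidth_hz, max_hz=4000):
    """Harmonics of f0 that fall inside a formant band centred on formant_hz."""
    lo = formant_hz - bandwidth_hz / 2
    hi = formant_hz + bandwidth_hz / 2
    return [h for h in harmonics(f0, int(max_hz // f0)) if lo <= h <= hi]


# Same assumed formant band (600-800 Hz), two different pitches.
low_voice = boosted(100, 700, 200)   # harmonics of a 100 Hz fundamental
high_voice = boosted(150, 700, 200)  # harmonics of a 150 Hz fundamental
```

Changing the pitch moves every harmonic, so a different set of overtones gets amplified, but the band doing the amplifying stays where it is. The ratios between harmonics (2:1, 3:2, and so on) are the part that stays constant with pitch, which is the "chord-like" property the earlier comment attributed to formants.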
I see people finding this video useful in different domains, this is fascinating. I found it mind opening also, thank you a lot!
Awesome explanation
mf dropped the best video on speech acoustics then fucking disappeared
Lmao its true tho, my b. I do want to make more- what would you like a video on if you could choose?
I'm not the only one to say this, but absolutely the best tutorial on YouTube! A lot of the time YouTubers try to come off as these all-knowing Gods and the important info gets lost in their explanation. This was just great.
best vid about formant
Really good video
Brilliant explanation, thanks
Great video!
Thanks
Excellent explanation. I've wanted to get that Daft Punk formant sound (that was apparently done with a Digitech Bass Synth Wah). I will give EQ automation a try, and maybe get to learn what frequencies make various vowels in the process.
Thanks! I would recommend a vocoder for the Daft Punk voice, which is effectively EQ automation, but you don't have to do it manually.
@@seanthomasmartin2184 Yes, they used vocoders a lot and are certainly associated with that sound. The sound I meant to refer to was the bassline, such as in ruclips.net/video/D8K90hX4PrE/видео.html, which has that "Yai, yai" sound. I'm pretty sure that's a formant going from Y to A to I. Yes, programming the automation will be painstakingly slow and there are faster ways (such as the bass synth wah pedal I think they used), but doing it manually somehow sounds fun and interesting to me. Your video definitely gave me some insight into this.
@@tjn0110 Ah! Now I know what sound you're talking about, and you're totally right! I've chased that sound for a while myself. I had some success using a very resonant bandpass filter and then a bitcrusher (I think at about 2 kHz). Now that I think about it, that bandpass filter is very formant-like... Also, I totally get wanting to do it manually; you learn a lot doing things that way. Good luck!
so a vowel sound is actually a chord produced by your throat...sort of??
Yes sort of lol... all sounds in nature are combinations of different frequencies, and in that way all sounds are like chords. In this analogy, vowel sounds would be like balancing the notes of the chord in different ways.
great video, thanks!
Perfect video. Bravo.
Is your frequency curve set to 0? Or is it set to a slope, like most are? If it is, you'd be better off turning that off to see the true amplitude of the formants relative to the rest of the spectrum.
Dude, you explained it so well, and I laughed so, so hard at the end when you kept explaining and demonstrating things with the vocoder on
Great demonstration