Gaussian Naive Bayes, Clearly Explained!!!
- Published: 5 Jul 2024
- Gaussian Naive Bayes takes care of all your Naive Bayes needs when your training data are continuous. If that sounds fancy, don't sweat it! This StatQuest will clear up all your doubts in a jiffy!
NOTE: This StatQuest assumes that you are already familiar with...
Multinomial Naive Bayes: • Naive Bayes, Clearly E...
The Log Function: • Logs (logarithms), Cle...
The Normal Distribution: • The Normal Distributio...
The difference between Probability and Likelihood: • Probability is not Lik...
Cross Validation: • Machine Learning Funda...
For a complete index of all the StatQuest videos, check out:
statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Buying my book, The StatQuest Illustrated Guide to Machine Learning:
PDF - statquest.gumroad.com/l/wvtmc
Paperback - www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - www.amazon.com/dp/B09ZG79HXC
Patreon: / statquest
...or...
RUclips Membership: / @statquest
...a cool StatQuest t-shirt or sweatshirt:
shop.spreadshirt.com/statques...
...buying one or two of my songs (or go large and get a whole album!)
joshuastarmer.bandcamp.com/
...or just donating to StatQuest!
www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/ joshuastarmer
0:00 Awesome song and introduction
1:00 Creating Gaussian distributions from Training Data
2:34 Classification example
4:46 Underflow and Log() function
7:27 Some variables have more say than others
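The workflow in the chapters above (fit one normal distribution per feature per class, then classify with log-scale scores to avoid underflow) can be sketched with scikit-learn's GaussianNB. This is a hypothetical illustration; the numbers below are made up, not the values from the video:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Made-up training data: grams of popcorn, mL of soda pop, grams of candy
X = np.array([
    [24.3, 750.7,   0.2],  # loves Troll 2
    [28.2, 533.2,  50.5],  # loves Troll 2
    [ 2.1, 120.5,  90.7],  # does not love Troll 2
    [ 4.8, 110.9, 102.3],  # does not love Troll 2
])
y = np.array(["loves", "loves", "does not love", "does not love"])

# Fitting estimates a mean and variance for each feature, per class
model = GaussianNB().fit(X, y)

# Classify a new person: 20 g popcorn, 500 mL soda pop, 25 g candy
new_person = np.array([[20.0, 500.0, 25.0]])
print(model.predict(new_person))            # the predicted class
print(model.predict_log_proba(new_person))  # log scale avoids underflow
```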
Corrections:
3:42 I said 10 grams of popcorn, but I should have said 20 grams of popcorn given that they love Troll 2.
#statquest #naivebayes
NOTE: This StatQuest is sponsored by JADBIO. Just Add Data, and their automatic machine learning algorithms will do all of the work for you. For more details, see: bit.ly/3bxtheb BAM!
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
website not working?
@@phildegreat Thanks! The site is back up.
8:15 There's a minor error in the slide 'help use decide' .
You really are a great teacher. Wish I could meet you in person some day.
4 weeks back I had no idea what machine learning was, but your videos have really made a difference in my life. They are all so clearly explained and fun to watch. I just got a job, and I mentioned some of the things I learned from your channel. I am grateful for your contribution to my life.
Happy to help!
Congratulations!!
That is a HUGE help my friend, congrats.. !!
I'm at the point where my syllabus does not require me to look into all of this, but I'm just having too much fun learning with you. I'm glad I took this course and found your videos.
Hooray! :)
Following your channel for over 6 months now sir, your explanations are truly amazing..
Thank you very much! :)
My little knowledge about machine learning could not be derived without your tutorials. Thank you very much
Glad I could help!
This is by far my favorite educational RUclips channel.
Everything is explained in a simple, practical and fun way.
The videos are full of positive vibes just from the beginning with the silly song entry. I love the catch phrases.
Statquest is addictive!
Thank you very much! :)
I have watched over 2-3 hours of lecture about Gaussian Naive Bayes. Now is when I feel my understanding is complete.
Hooray!
Thank you for the prompt response. I’m fairly new to Stats. But this video prompted me to do a lot more research and I’m finally confident on how you got to the result. Thank you for your videos. They are so helpful
Glad it was helpful!
Thank you Josh. You deserve all the praise. I have been struggling with a lot of the concepts in traditional classic textbooks, as they tend to "jump" quite a lot. Your channel brings all of them to life vividly. This is my go-to reference source now.
Awesome! I'm glad my videos are helpful.
This is crazy. I went to school for Applied Mathematics, and it never crossed my mind that what I learned was machine learning. As ChatGPT came into the limelight, I started looking into it, and almost everything I've seen so far is basically what I learned before, just in a different context. My mind is blown that I assumed ML was something unattainable for me when it turns out I've been doing it for years.
bam!
WOOOOOOW. I watched every video of yours recommended in the description of this video, and now this video. Everything makes much more sense now. It helped me a lot to understand the Gaussian Naive Bayes algorithm implemented and available in scikit-learn for machine learning applications. Just awesome. Thank you!!!
Wow, thanks!
Hi, Josh.
Thank you so much for all the exceptional content from your channel.
Your work is amazing.
I'm a professor in Brazil of Computer Science and ML and your videos have been supporting me a lot.
You're an inspiration for me.
Best.
Muito obrigado! (Thank you very much!)
I am a beginner in the Machine Learning field, and your channel has helped me a lot. I've gone through almost all the videos; you have a very nice way of explaining. I really appreciate you making these videos and helping everyone. You just saved me... Thank you very much...
Thank you very much! :)
If I recall the names of all the best educators on RUclips, yours always comes first! You are a flawless genius!
Thank you! 😃
It's amazing! Thank you so much !
Our professor left us to self-teach Gaussian naive Bayes, and I absolutely didn't understand her slides with many, many math equations. Thanks again for your vivid videos!!
Glad it was helpful!
These videos are amazing !!! Truly a survival pack for my DS class👍
Bam! :)
This series is helping me so much with my dissertation, thank you!!
Awesome, and good luck with your dissertation!
this was the best explanation I've ever seen in my life (I'm not even a native English speaker; I'm Brazilian lol)
Muito obrigado! (Thank you very much!) :)
amazing knowledge with incredible communication skills.. the world would change if every student had such a great teacher
Thank you!
This video on Gaussian Naive Bayes has been very well explained. Thanks a lot.😊
Most welcome 😊
This channel has helped me so much during my studies 🎉
Happy to hear that!
Your videos and voice make ML and statistics fun to learn. :)
Glad you like them!
Sir, this playlist is a one-stop solution for quick interview preparations. Thanks a lot sir.
Good luck with your interviews! :)
One of the best channels for learners that the world can offer..
Thank you!
The contents are excellent, and I also love your intro quite a lot (it's super impressive to me) btw. Thanks for doing this in the first place; as a beginner, some concepts are literally hard to understand, but after watching your videos things are a lot better than before. Thanks :)
I'm glad my videos are helpful! :)
Thank you, You have made the theory concrete and visible!
Thanks!
Thank you for another excellent Statquest !~
Bam! :)
Literally the best video ever on this.
Thank you!
You have really helped me a lot. Thanks, Sir. May you prosper and keep helping students who can't afford paid content :)
Thank you! :)
Damn, your videos are so good at explaining complicated ideas!! Like holy shoot, I am going to use this multiple-predictors idea to figure out the ending of Inception. Was it a dream, or was it not a dream!
BAM! :)
Your videos are more helpful than my Machine Learning lectures were. Man, you are Gigachad of Machine Learning
Wow, thanks!
Josh, I love your videos. I've been following your channel for a while. Your videos are absolutely great!
Would you consider covering more of Bayesian statistics in the future?
I'll keep it in mind.
This is the only lecture that makes me feel not stupid...
:)
Great video! If people were willing to spend time on videos like this rather than TikTok, the world would be a much better place.
Thank you very much! :)
How do people come up with these crazy ideas? it's amazing, thanks a lot for another fantastic video
Thank you again!
Thank you Josh for another great video! Also, this (and other vids) makes me think I should watch Troll 2, just to tick that box.
Ha! Let me know what you think!
Thanks for the video !! it was very helpful and easy to understand
Glad it was helpful!
You explained it much more clearly than my lecturer did in the ML lecture.
Thanks!
Hey Josh, I hope you are having a wonderful day. I was searching for a video on "Gaussian mixture models" on your channel but couldn't find one. I have a request for that video, since the concept is a bit complicated elsewhere.
Also, btw, your videos enabled me to get one of the highest scores on a test conducted recently at my college. All thanks to you Josh, you are awesome
Thanks! I'll keep that topic in mind.
Hey Josh, Thank you for making these amazing videos. Please make a video on the "Bayesian Networks" too.
I'll keep it in mind.
The demarcation of topics in the seek bar is useful and helpful. Nice addition.
Glad you liked it. It's a new feature that RUclips just rolled out so I've spent the past day (and will spend the next few days) adding it to my videos.
@@statquest We really appreciate all your dedication into the channel!
It's 100% awesomeness :)
@@anitapallenberg690 Hooray! Thank you! :)
Your video just helped me a lot !
Glad it helped!
Another great tutorial, thank you!
Thanks!
Thanks for the great video!
I would just like to point out that in my opinion if you are talking about log() when the base is e, it is easier (and more correct) to write ln().
In statistics, programming and machine learning, "ln()" is written "log()", so I'm just following the conventions used in the field.
i promise i will join the membership and buy your products when i get a job... BAM!!!
Hooray! Thank you very much for your support!
I'm having a great time watching your videos ❤️
Thanks!
BAM! thanks, Josh! It would be amazing if you can make a StatQuest concerning A/B testing :)
It's on the to-do list. :)
So great, this video is so helpful
Glad it was helpful!
😅😅😅😅It's the "Shameless Self Promotion" for me... Thank you very much for this channel. Your videos are gold. The way you just know how to explain these hard concepts in a way that 5-year-olds can understand... To think that I just discovered this goldmine this week.
God bless you😇
Thank you very much! :)
These gloriously weird examples are really what's needed to understand a concept
Thanks!
Thanks so much, Sir, for the very valuable information
Thanks! :)
Why the fuck does this video make it look so easy and make 100 percent sense?
Love the explanation, BAM!
BAM! :)
The world needs more Joshuas!
Thanks! :)
This channel should have 2.74M subscribers instead of 274K.
One day I hope that happens! :)
well the little intro made me cry laugh. I don't know why... awesome
bam!
Great style of teaching & also thank you so much for such a great video (Note : I have bought your book "The StatQuest illustrated guide to machine learning") 😃
Thank you so much for supporting StatQuest!
Awesome as always
Thanks again! :)
Love your channel
Thanks!
Best video i have ever seen
:)
Great video!
Thanks!
In the Stats playlist, we used the notation P(Data | Model) for probability and L(Model | Data) for likelihood;
here we are writing the likelihood as L(popcorn=20 | Loves), which I guess is L(Data | Model);
Unfortunately the notation is somewhat flexible and inconsistent - not just in my videos, but in the field in general. The important thing is to know that likelihoods are always the y-axis values, and probabilities are the areas.
@@statquest understood; somewhere in the playlist you mentioned that likelihood is relative probability, and I guess that neatly summarizes how likelihood and probability relate
I just had the exact same question when I started writing the expression in my notebook. I am more acquainted with the L(Model | Data) notation.
Super awesome, thank you. Useful for my Intro to Artificial Intelligence course.
Glad it was helpful!
can't wait for your channel to go BAAM! worldwide!!
Me too!!
Thanks for the awesome video..
You bet!
These videos are extremely valuable, thank you for sharing them. I feel that they really help to illuminate the material.
Quick question though: where do you get the different probabilities, like for popcorn, soda pop, and candy? How do we calculate those in this context? Do you use the soda a person drinks and divide it by the total soda, and same with popcorn, and candy?
What time point are you asking about (in minutes and seconds). The only probabilities we use in this video are if someone loves or doesn't love troll 2. Everything else is a likelihood, which is just a y-axis coordinate.
Your videos are really great!! My prof made it way harder!!
Thanks!
awesome stuff for real
Thank you!
Thanks so much!
You're welcome!
thank you for ur service T.T
Thanks dude
:)
Great stuff : )
Thanks!
Amazing video! Thank you so much!
One question, What if distribution of candy or other feature does not follow normal distribution?
Just use whatever distribution is appropriate, and you can mix and match distributions for different variables.
Thanks for this super clear explanation. Why would we prefer this method for classification over a gradient boosting algorithm? When we have too few samples?
With relatively small datasets it's simple and fast and super lightweight.
Thanks for the awesome explanation. But I have a question: can GNB be used for sentiment analysis?
Presumably you could use GNB, but I also know that normal NB (aka multinomial naive bayes) is used for sentiment analysis.
Hi, Josh. Thanks for this clear explanation. Since this Naive Bayes could be applied to Gaussian distribution, I guess it could also be applied to other distributions like Poisson distribution, right? Then a question is: how to determine the distribution of a feature? I believe this will be quite important to build a reasonable model.
Thanks again for the nice video.
One day (hopefully not too long from now), I'm going to cover the different distributions, and that should help people decide which distributions to use with their data.
Thank you Josh, your videos are amazing! How can I buy study guides from StatQuest?
See: statquest.gumroad.com/
dude you are awesome
Thank you!
Hey Josh, thanks for making such amazing videos. Keep up the work. I just have a quick question if you don't mind.
I can't understand how you got the likelihood, e.g. L(soda = 500 | LOVES). How are you calculating that value?
We plugged the mean and standard deviation of soda pop for people that loved Troll 2 into the equation for a normal curve and then determined the y-axis coordinate where the x-axis value = 500.
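As a sketch of that calculation in Python (the mean and standard deviation below are made-up stand-ins, not the values from the video):

```python
import math

def normal_likelihood(x, mean, sd):
    """The y-axis value of the normal curve with the given mean and sd at x."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

# Hypothetical mean and sd of soda pop for people who love Troll 2
likelihood = normal_likelihood(500, mean=500.5, sd=100)
print(likelihood)  # L(soda pop = 500 | Loves): a y-axis coordinate, not an area
```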
Looks like I have to check out the quests before getting to this one😂
:)
Excellent explanation. Any NLP series coming up ? Struggling to find good resources.
I'm working on Neural Networks right now.
@@statquest it's going to be BAM!!
+5000 for using an example as obscure and as obscene as Troll 2.
:)
Hi - another great explanation!
I wonder what the result would be if you normalised the probabilities of the 3 values.
- Would it affect the outcome of the example in this video?
- Which areas of values are affected: different outcomes with non-normalised and normalised distributions (=probability or likelihood here)?
Interesting questions! You should try it out and see what you get.
@@statquest Hi, that only makes sense with real data. Without it, just juggling equations and abstract parameters, the thing is not 'visual' enough, IMO. Though I could run through the calculations with e.g. 2x scale, 10x scale, and 100x scale... maybe when I have a few free hours.
Hello! Does it matter if the data in one of the columns (say popcorn) is not normally distributed? Or should the assumption be that we will have a large enough sample size to use the central limit theorem?
Thanks for all of your videos! I love them and can’t wait for your book to be delivered (just ordered it yesterday).
It doesn't matter how the data are distributed. As long as we can calculate the likelihoods, we are good to go. BAM! :) And thank you so much for supporting StatQuest!!! TRIPLE BAM!!! :)
Troll 2 is an awesome classic, and should not be up for debate. =)
Ha! :)
Looks like it also works when both multinomial and Gaussian predictors exist in the prediction dataset.
Yes, you are correct. And thanks for supporting StatQuest!
Thnx sir 😊
:)
I'm a simple man. I watch StatQuests at night, leave a like, and go chat about it with ChatGPT. That's it.
bam! :)
thanks for creating this helpful video! is your sample data available somewhere? would love to calculate things by hand for practice!
Thanks! Unfortunately, the raw data is not available :(
Great work! At 8:11, how can we use cross validation with Gaussian Naive Bayes? I have watched the cross validation video, but I still can't figure out how to employ cross validation to know that candy makes the best classifier.
To apply cross validation, we divide the training data into different groups. Then we use all of the groups, minus 1, to create a Gaussian Naive Bayes model, and use that model to make predictions on the last group. Then we repeat, each time using a different group to test the model.
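That divide-train-test-repeat procedure is what scikit-learn's cross_val_score automates; here is a hypothetical sketch with made-up data (the means and spreads below are invented for illustration):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Made-up data: one column of candy measurements for two classes
X = np.vstack([rng.normal(25, 5, size=(20, 1)),    # loves Troll 2
               rng.normal(90, 10, size=(20, 1))])  # does not love Troll 2
y = np.array(["loves"] * 20 + ["does not love"] * 20)

# 5-fold cross validation: train on 4 groups, test on the held-out group, repeat
scores = cross_val_score(GaussianNB(), X, y, cv=5)
print(scores.mean())  # average accuracy across the 5 held-out groups
```

To compare popcorn, soda pop, and candy, you would run this once per feature (or feature subset) and keep whichever gives the best cross-validated score.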
Hi, Josh. Troll 2 is a good movie... Thanks
bam!
Can you talk about Kernel estimation in the future?? Bam!
I will consider it.
A nice video on the Gaussian Naive Bayes classification model. Well done! But I have a quick question for you, Josh. I only understand that the limit of ln(x) as x approaches 0 is negative infinity. How is the natural log of a really small number very close to zero equal to -115 and -33.6, as in the case of L(candy=25 | Loves Troll 2) and L(popcorn=20 | Does Not Love Troll 2), respectively? What was used to determine these values?
log(1.1*10^-50) = -115 and log(2.5*10^-15) = -33.6
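Those two values check out with the natural log (which is what log() means here); a quick sketch:

```python
import math

# The likelihoods themselves are tiny, but their natural logs are ordinary numbers
print(round(math.log(1.1e-50), 1))  # -115.0
print(round(math.log(2.5e-15), 1))  # -33.6
```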
Thank you, bam 🔥🔥
:)
Thanks for these great videos! Quick question: In other resources the likelihood is actually the probability of the data given the hypothesis rather than the likelihood of the data given the hypothesis. Which one would be correct, or is it fine to use either?
Generally speaking, you use the likelihoods of the data. However, we can normalize them to be probabilities. This does not offer any advantages and takes longer to do, so people usually omit that step and just use the likelihoods.
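Normalizing is just dividing each class's score by the sum over all classes; a sketch with made-up log-scale scores (not the values from the video):

```python
import math

# Made-up log-scale scores for two classes (log prior + sum of log likelihoods)
log_scores = {"loves": -124.0, "does not love": -48.0}

# Exponentiate and divide by the total to get probabilities;
# subtracting the max first keeps exp() from underflowing
m = max(log_scores.values())
exp_scores = {k: math.exp(v - m) for k, v in log_scores.items()}
total = sum(exp_scores.values())
probs = {k: v / total for k, v in exp_scores.items()}
print(probs)  # sums to 1; the winning class is unchanged by normalization
```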
You always get the like after the intro song hahaha
Bam! Thank you very much! :)
Hi
Your video is amazing!!! I have a quick question. When you said to use cross validation to help us decide which thing (popcorn, soda pop, or candy) to use, I think the training data part can "only" help decide the prior probability, and then we use the testing data to do the confusion-matrix comparisons, all of the above conditioned on each scenario, right? For example, we would have three confusion matrices, from popcorn, soda pop, and candy, based on the test data. What do you think?
That sounds about right.
Could you please make a video on Time Series Analysis (Arima model)?
One day I'll do that.
I love you bro !
Thanks!
The shameless self-promotion got me lol, you're so funny
Thanks! BAM! :)