This could be by far the best explanation I have seen for the EM algorithm. The way you have connected the intuition to the mathematical explanation is so, so commendable!!!! Thank you so much for your efforts
Glad it was helpful!
@@ritvikmath I confirm, thank you for helping lost students
The world needs to see this. Thanks Ritvik, I honestly have utmost respect and love for the amount of hard work you put in your videos. Cheers :)
It would take a lot of time to develop these intuitions on your own.
Thank you for the high-quality content that you have produced over the past few years. Most of the time, it really did help me get the intuition and understanding of what was going on with the theoretical concepts I was seeing in my courses.
Once again, thank you!
Your channel and way of teaching is so amazing!! Very inviting, inclusive, and friendly. Thank you so much for such good vibes 💕
Brilliant explanation. I especially appreciate you first providing the intuition for the method in the verbal explanation of the E and M steps. I struggled with seeing the math first in other lectures until seeing your video. Thanks for posting this.
Thank you Ritvik for simplifying the EM algorithm like this. This is the best video I have seen so far.
That's a really great way to look at EM. I'm an engineering graduate but new to ML, and the build-up explanation before dropping into the maths is excellent. Thanks!
Thank you! By far the best channel for providing clear explanations to fairly complex problems.
I have an exam tomorrow and this video was the thing I needed. I can't thank you enough dude.
Your explanations are soooo clear! really appreciate the effort you put into your videos. Thank youu!!
Yes, please go on with the proof; that will be an interesting topic. I went through Andrew Ng's video a couple of times, but I couldn't understand it better than here!! You're a rock star at simplifying complex concepts!!
Your videos are unreal: simple explanations of complex problems, it's insane.
Your understanding and explanation of such a complicated concept are impeccable.
Incredible explanation! Was trying to understand the intuition behind EM for a long time! Thanks for the video! Keep Going!!
Glad it helped!
Much better explanation than what I normally see. I would also be interested in seeing you go through the proof.
I had a jolt of excitement when I saw you had decided to do a video on this topic. It's something I've had to revisit time and time again, always understanding the intuition, but always getting lost in the formulas. Your post did a great job at helping to explain the intuition. I did struggle a bit with your unconventional likelihood notation, though. That did throw me off a little bit, but I understand why you had to have it that way and quickly adjusted. The care you took in explaining why there is mu and mu0 just shows why you are a fantastic teacher.
Thanks for the very clear explanation! A follow-up video on how the EM algorithm can be used in Gaussian mixture models or Bayesian networks would be awesome!
It's 4am and I saw this video and had to watch... really great explanation bro... you're a natural teacher... thanks for this... subscribed
It would take me two more lives to be able to explain it this well to someone, kudos! Great job buddy!
Wow, thanks!
Awesome explanation. I'd like to extend yours with my intuition regarding the E-step: the first term, p(x|mu0), is the probability of x occurring under the chosen mu0, and the second term, the log-likelihood, measures how probable the data (including x) is under the candidate mu, and we want both to be high. We want a choice with high probability from every aspect; that's why we multiply them together. The multiplication weighs them against each other: if either one is small, the result will be small. It can be high only if both are high.
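For anyone who wants the formula behind this intuition, here is a sketch of the E-step objective in the video's notation (assuming the setup of observed points 1 and 2 and a missing point x):

$$E\big[\log L(\mu \mid 1, 2, x)\,\big|\,\mu_0\big] = \int_{-\infty}^{\infty} p(x \mid \mu_0)\,\log L(\mu \mid 1, 2, x)\,dx$$

so p(x|mu0) plays the role of the weight and the log-likelihood is the quantity being weighted, exactly as described above.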
Thanks for the additional input!
Thanks! You explained such a complicated subject so clearly!!!!
thanks!
By far the best explanation, amazing.
Holy, I can't believe how good this video was :) thank you so much
YES! I have quiz on this NEXT WEEK!
Can't express how happy I am after seeing your videos. Thanks a lot!
You are just amazing! What would be super useful is an EM video based on your "Maximum likelihood" one.
Thank you for all the work you put into your videos to make lives like mine easier. Cheers man!
Thanks so much for this great explanation! I would definitely be interested in the proof. It would be great if you could do a video on Gaussian mixture models as well and how they are solved using the EM algorithm.
Although there is more to fully understanding it, I was able to grasp the concept because of your video!
Broke down the most complicated algorithm in the simplest terms. Wow!
Interesting way of looking at the EM problem. Thank you.
Thanks a lot! That is a great explanation!!! I was struggling with EM for a long time!! :))
I'd be grateful if you also talked about the proof of convergence!
Excellent explanations!
I thank GOD I found your channel. A big thanks to YouTube and to you!!
Thank you Ritvik, the best videos are on this channel.
Very interesting way of teaching, thank you from TUNISIA
Most welcome!
Ngl my favorite rapper-turned-algorithm
Thanks for your explanation.
I think my main mental knot was wondering why you alter mu instead of looking for the best guess for x to pin down the unknown value. Realizing that x doesn't change, and that the power of the algorithm lies in finding the optimal solution for the learner without caring about the actual value of x, was what I needed for it all to make sense.
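To make that concrete, here is a minimal Python sketch of the iteration (my own reconstruction, assuming the video's toy setup: observed points 1 and 2, one missing point x, and a Gaussian with known unit variance, where the M-step collapses to averaging in E[x | mu0] = mu0):

```python
# Toy EM: observed data {1, 2}, one missing point x,
# Gaussian likelihood with unknown mean mu and known variance 1.
mu = -1.0  # arbitrary initial guess for the mean
for step in range(20):
    expected_x = mu                # E-step: E[x | mu0] = mu0
    mu = (1 + 2 + expected_x) / 3  # M-step: maximize expected log-likelihood
    print(f"step {step + 1}: mu = {mu:.6f}")
# mu converges to 1.5; the true value of x is never needed
```

The missing x is only ever integrated out; the parameter mu is the thing that moves.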
this helped so much, thank you a lot!!
Excellent explanation!!
Lovely, that's very intuitive. Thank you so much.
Amazing explanation!
Great video!
Ritvik, you are doing a great job, thanks
Would love to see a proof video! Keep up the great work!
Very compelling ... Brilliant
Thank you so much for your explanation, it helps me a lot
Very well explained
What an amazing channel, honestly
Got this crystal clear. Thanks a lot!
Thank you for explaining very clearly
So clear -- wow!
The only explanation you need for understanding the EM algorithm; proper chad explanation!
Really nice explanation! Thank you!
Glad it was helpful!
Awesome! Best explanation of the EM algorithm for a beginner!
Glad it was helpful!
Excellent. Thank you so much! 👍
thanks!
Can't wait to see the proof
amazing, thanks for such a clear explanation :)
Great teacher❤
Great video! Thank you so much
Glad it was helpful!
This explanation is amazing for really getting the concept
Amazing, thank you for that !
Glad you liked it!
The best. Thanks man
really great explanation! thank you :-)
Absolutely fantastic. I agree w/ other comments... The DS world needs to see this. Thank you.
Glad you enjoyed it!
THANK YOU. You're literally saving my ML undergraduate course
awesome!
Thank you so much BRO
This is explained so well
great man, ultra great
Thank you Ritvik for your explanation! Does it work only for the normal distribution, or can we apply it to other kinds of distributions?
Great videos. Got it in one go! Could you do Gaussian Mixture Models? Thanks.
I'm interested in the proof!
Thanks so much Ritvik! Your videos are amazing... do you have a playlist for machine learning to connect the dots in ML concepts? I see a playlist for data science but not for machine learning.
Thanks
Thanks for the great lecture. One question if I may: at 2:20, why did you put best guess 1 here instead of a random draw from your known distribution?
Thank you so much for these videos!
One question: how do you estimate and maximize the integral in practice? That was the elephant in the room for me...
Can you do the proof too please
Thank you so much for these amazing vids.
Would you kindly provide any MATLAB code that illustrates these concepts?
oh my god. this was so helpful
Awesome!
Thanks a lot for the easy-to-understand intuition of the EM algorithm. Would you explain the coin-flip example along with your formulation steps ② and ③?
thank you !!!
Amazing.
Thank you! Cheers!
OH MY GOD THANK YOU
A worked example of the final process would be invaluable.
On step 2, what does the dx do at the end of that equation?
Thanks for the video! What was not clear to me is whether we calculate E(LL|mu) for all values of mu, over which we then take the argmax in step 3?
Hi Ritvik, thank you very much for awesome videos. Could you please make some videos on SQL?
thanks! and please check out my full SQL playlist here:
ruclips.net/p/PLvcbYUQ5t0UFAZGthysGOAtZqLl3otZ-k
@@ritvikmath Awesome! Thanks a lot... Could you please add SQL with window functions to the playlist, if possible?
Is the EM algorithm the best algorithm to use for some specific problems (compared, for example, to gradient descent)?
Example with Python coming anytime?
Could you please do the derivation or intuition for EM for clustering? It is described in many textbooks, but not in such a cool way. 😅
Please make a proof video
Can you explain the EM algorithm in terms of compositional data, please?
The expression for the expectation seems similar to Bayes' theorem, where we have a prior belief (P(x|mu)) and a likelihood, and we multiply the two to get the posterior. Is this the same concept?
Thanks for the great video! One question: if you have (1+2+x)/3 = x, then you have a closed-form solution, so why do you still need a numerical approach?
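For what it's worth, that fixed point does solve in closed form, and it matches where the iterations end up:

$$\frac{1+2+x}{3} = x \;\Rightarrow\; 3 + x = 3x \;\Rightarrow\; x = \frac{3}{2}$$

Presumably the iterative view is shown because it is what generalizes to models where the M-step has no closed-form solution.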
What if x is high dimensional? How would the integral change?
How would the problem change if we didn't know the variance either?
Great explanation. However, the way you have written it, there is no difference between the likelihood function and the probability function. I think for clarity you should swap x, 1, 2 and \mu. Also, you should use ';' instead of '|' so that the likelihood function is not confused with a conditional probability.
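For readers following this point, the two notations refer to the same value, just read as functions of different arguments:

$$L(\mu;\, x_1, x_2, x) = p(x_1, x_2, x \mid \mu)$$

where the left-hand side is treated as a function of \mu with the data held fixed; the semicolon makes that reading explicit.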
Sir, I know that in the E-step we estimate the unknown x, but you are calculating the likelihood. How are these connected?
Nice to see the theorem guaranteeing convergence for sequences that are increasing and bounded being used to prove this. I do have a more pragmatic question, which is how somebody would go about finding the argmax in the M-step. Would gradient descent be used on the expectation of the log-likelihood function (I would imagine in this case the expected log-likelihood would have to be concave for this to work) to find the argmax?
Yep, you can use any optimization method. For Gaussian mixture models there are explicit formulas for the M-step, obtained in the usual way by setting the gradient of the expected log-likelihood to zero.
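For concreteness, these are the standard closed-form M-step updates for a Gaussian mixture (textbook results, not from the video), with responsibilities $\gamma_{ik} = p(z_i = k \mid x_i)$ coming from the E-step:

$$\pi_k = \frac{1}{N}\sum_{i=1}^{N}\gamma_{ik}, \qquad \mu_k = \frac{\sum_{i=1}^{N}\gamma_{ik}\, x_i}{\sum_{i=1}^{N}\gamma_{ik}}, \qquad \Sigma_k = \frac{\sum_{i=1}^{N}\gamma_{ik}\,(x_i-\mu_k)(x_i-\mu_k)^{\top}}{\sum_{i=1}^{N}\gamma_{ik}}$$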
Why x = -1 and not 0? I'm kind of confused about his first guess on the first question.
Something does not seem right to me in the E-step.
I think the likelihood should be written given the latent variable, which is x in this case, but you have written it given mu...
I'm confused.
Also, I don't understand how to solve the M-step.
When I write it down in this case, I cannot update x at all 🤦🤦🤦
I only update mu 🤦🤦
I'm completely confused
Sir, I only know basic statistics. Where should I start watching your videos? Is there any order to them? I am not familiar with the concepts you are mentioning.