I got my PhD back in the 2000s. I wish material like this had existed back then. I struggled so much with a lot of basic concepts. I had to study material written in the most extreme mathematical notation without gaining any intuition. You have done an amazing job explaining something I had a really hard time grasping. Well done.
Yeah, PhD-level material is almost always used in, or linked to, understanding some machine learning theory. I'm doing it now.
I thought exactly the same!
There is no content like this in Spanish; we have to resort to videos in English.
Because of people like you, we are all able to go deep into artificial intelligence and to simplify what we are watching right now.
Great teaching, Nerd. Where are you based now? I am a doctor from India and need your academic help.
I am 60 years old. I am learning how to simplify a complex subject. You are a great teacher! God bless you.
Thanks a lot sir 🙏
I am very flattered.
didn't ask
@@BhargavSripada and we certainly didn't ask for your opinion
This might be the fastest I've gone from never having used a concept to totally grokking it. Very well explained.
Thank you for making these videos! I am starting my master's thesis on Markov Chains and these videos help me get a solid introduction to the material. Much easier to understand the books now!
That is great. How is it going so far? I want to apply it to predict system reliability. Not sure how complicated it is. Could you advise, please?
Better explained than in the three-plus AI university classes I have gone through. Simple, efficient. Thank you.
This series of videos has been incredibly useful for me to write my Masters thesis. Thank you so much.
A very rare topic explained in the most precise manner. Very helpful 👍
Thanks for the videos. I am majoring in Data Science, and videos like this obviously sometimes help enormously compared to reading texts. Very intuitive and visual. I don't think I will forget the weather signs you showed us today.
That's the goal 😉
@@NormalizedNerd Can you briefly explain the steps involved in finding the probability of a sunny day? I really don't understand.
After going through your Markov chains series, you, my friend, got yourself a new subscriber! Great work. Your channel deserves to grow!
I'm just going through the Markov chain playlist, and because of its quality I'm going to subscribe to this channel. Great material!
Dude, your videos are severely underappreciated. I love the animations and the basic-but-complete style you used when talking about abstract and complex topics. I just discovered this channel, and I will recommend it to everyone, including beginners, because even they will be able to understand it, as will advanced people.
Thanks a lot for appreciating!!
What a channel. I have never come across any Data Science channel like yours. You are doing fantastic work. Love your videos and am going through them ❤
Amazing job. This really helps if you are preparing for interviews, and want a quick revision. Thank you for doing this.
You're very welcome!
Loved it! I am looking forward to (maybe) seeing a video on the Markov Chain Monte Carlo (MCMC) algorithms. Best regards!
Many thanks for the beautiful visualization and summarization of the Markov model. Understanding it was effortless. It may require a little revision, but it is easily comprehensible. 🙂
Thank you, I really appreciate your work. Watching this video made me realize that my professor in class is not a good teacher.
Great! Just great! I really don't understand why most professors at colleges hate explaining things this way. They always choose the standard "let's be super formal and use super formal mathematical notation." Yes, it is important to learn the formal mathematics, but why not combine the formal and informal approaches and put them together in a textbook?
Thanks!
Excellent! Skipped to the 4th part of this magnificent Markov series. Took roughly 3 hours to verify things at moments and convince myself. HIT MOVIE!!
A really well-laid-out video. Looking forward to watching more.
Thanks! Keep supporting :D
Don't stop what you are doing! It's amazing.
Thanks!! :D
From the accent, you are a Bengali, but are you from ISI? GREAT video, keep going.
@@asaha9479 Yup, I'm a Bong... but I don't study at ISI.
Hey, thanks a lot for making these!
One suggestion if you don't mind: you could avoid using red and green (especially those particular shades you used) as contrasting colors, given that they're close to indistinguishable to about 8% of males. Basically any other combination is easier to tell apart, e.g. either of those colors with blue.
Just a minor quibble, the videos are otherwise very good!
Thanks a lot for pointing this out. Definitely will keep this in mind.
Hello author from the past! Your video is really helpful! Thank you!
Super entertaining videos helping me with my Oxford master's thesis. Study night or movie night? Plus he has an awesome accent :-)
Thanks a lot mate! :D :D
Thank you for the clarity of the explanation. Why did you neglect the denominator P(Y)? How can we calculate it? I assume that the correct arg max should take the denominator P(Y) into consideration.
We are taking the argmax of a function with the Xs as the variables, so P(Y) doesn't matter because of the argmax... you can refer to Bayes' theorem for maximum likelihood; they always do the same thing.
You explain things nicely. I would request that you make videos on advanced stochastic processes like semi-Markov processes, martingales, etc.
Amazing!! It really helps me understand the logic behind that scary HMM Python code. Thank you.
Wonderful video. Amazing explanation. Please explain why P(Y) is neglected? Or is it considered to be 1?
The arg max is computed by varying X, so we can neglect P(Y): it does not vary between candidates and will not change the final result.
Awesome explanation! Loved the way you explained the math used for the calculations!
Thanks a lot! :)
Nice video! But what happens to P(Y) at 8:30 in the final formula? Why does it disappear?
We want to find the X_1, ..., X_n that gives us the maximum value. Note that P(Y) does not depend on the Xs and is therefore a constant. A constant does not change the Xs that give us the maximum.
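To see this concretely, here's a tiny Python sketch (the scores are made-up toy values standing in for P(Y|X)·P(X), not the video's numbers): dividing every candidate's score by the same constant P(Y) never changes which candidate wins the argmax.

```python
# Made-up joint scores P(Y|X) * P(X) for three candidate hidden sequences X.
scores = {"sunny,sunny": 0.12, "sunny,rainy": 0.03, "rainy,sunny": 0.08}
p_y = sum(scores.values())  # P(Y): the same constant for every candidate X

best_joint = max(scores, key=scores.get)                     # argmax of P(Y|X) * P(X)
best_posterior = max(scores, key=lambda x: scores[x] / p_y)  # argmax of P(X|Y)
assert best_joint == best_posterior                          # identical winner
```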
Superb explanations! That shows how in-depth your knowledge is!
Simply wonderful. Keep up your excellent work. Really really well done!
Very helpful video as well as the rest of the Markov series. Wish you luck from Vietnam!
Thanks a lot mate :)
Thank you for the vid; it is probably the most understandable arg max explanation available around. But something remains unclear to me. You said at 09:30 that one Markov property is that X_i "depends only on X_{i-1}", but the Markov property I know is the opposite: X_i is independent of X_{i-1} (the future does not depend on the past, just on the current state). Where am I missing the point?
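(For reference, the standard statement of the first-order Markov property is P(X_i | X_1, ..., X_{i-1}) = P(X_i | X_{i-1}): given the current state X_{i-1}, the next state X_i is independent of all earlier states, not of X_{i-1} itself. So "depends only on X_{i-1}" and "the future depends only on the current state" are the same claim.)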
Really cool explanation! Can you also explain why P(Y) is ignored?
For two reasons...
1. It's often hard to compute P(Y).
2. To maximize that expression we only need to maximize the numerator (which depends on X). Note that P(Y) doesn't depend on X.
@@NormalizedNerd Thank you! I had the same concern and your explanation makes sense!
@@NormalizedNerd 1. Can't we just compute P(Y) as P(Y|X1) × P(X1) + P(Y|X2) × P(X2) + P(Y|X3) × P(X3)?
2. True, I agree. Since you didn't say it in the video, I was just wondering where P(Y) disappeared to, and didn't stop to think that the max was actually over X.
Oh look, a coffee button! Thanks!
Thanks so much!!
@@NormalizedNerd No, thank you, sir. This has been very informative and accessible. I will be checking out your other content in the future!
Beautiful! Thank you.
Question: in the final formula, "arg max(over X) Prod P(Yi | Xi) P(Xi | Xi-1)" ...
We have a product term P(X1 | X0) that assumes there is an X0 value. However, there is no X0. Don't we need to replace this term with a different expression that does not rely on X0?
Hi, please, I just want to know the tool you create those examples with. It's urgent, save me!
Very nice and clear. Thank you.
This was the perfect video for where I'm currently at. I learned about Markov chains last year, and just finally got a good grasp on Bayes' theorem (I struggled through Prob & Stats years ago). Thanks so much! Keep it up!
That's amazing! :D
Great videos, keep it up! :)
It would be nice to have a video about MCMC (Markov Chain Monte Carlo) and the Metropolis-Hastings algorithm.
Great suggestion.
@@NormalizedNerd I'm on the edge of subscribing. A video on MCMC would convince me to subscribe and never leave!
@@nonconsensualopinion 😂😂...Let's see what comes next
Very well explained ! 👏👏
Thanks a lot! These types of videos are amazing and help me understand, in a good way, the concepts that are in the books. It boosts my interest in this area. It helps me a lot in doing my project! You make these kinds of videos for almost all concepts! 😍
Great explanation, super helpful
Thanks for the explanation! You went through the math of how to simplify arg max P(X = X_1, ..., X_n | Y = Y_1, ..., Y_n), but how do you actually compute the argmax once you've done the simplification? There must be a better way than a brute-force search through all combinations of values for X_1, ..., X_n, right?
Thanks a ton, I wish my professors from Monash Uni taught this way.
I really like the visuals and your presentation with the content. Good work!
Glad you like them!
Explained so simply. Well done. Helped me a lot.
Great to hear!
At 8:25, P(X) does not look right to me, as we should have P(X) = P(X_1) · Π_{i=2 to n} P(X_i | X_{i-1}).
please keep updating~ you are doing an amazing job~~.
Sure I will
Hi... thanks for the wonderful videos on Markov chains. I just want to know: how do you define the transition and emission probabilities? And what should we do about unknown state probabilities? Regards
Dear sir,
Your explanation was very clear and understandable. It is full of mathematics: matrices, probability, and so on. I'm from a science background without maths. I needed this for bioinformatics, but it is difficult to relate the nitrogenous bases to these matrices and formulae. Will you explain it in a simpler way? It would be very helpful, sir 🥺
"Hello People From The Future!" that was very thoughtful
Finally, some video helped.
How do we calculate the stationary distribution of the first state? I watched your previous videos but still can't calculate it! Thanks for answering!
Great video! Just a little lost on where you get the prob(sad or happy | weather) values, which I think are the emission probabilities? Thanks!
Can you explain the calculation of P(X|Y), the last step of the video, when you substitute in the products of P(Y_i|X_i) and P(X_i|X_{i-1})? Where does the P(Y) in the denominator go? Thanks
Howdy!
I think of it like this: P(Y) is a constant, and doesn't affect which sequence X has the highest probability of occurring. In other words: since every term gets divided by P(Y), we can just ignore it.
Perhaps he could've made that a little clearer in the video.
Cheers!
I got a little confused with the two HMM videos. I thought the second video would solve the argmax expression presented at the end of the first one, but the algorithm that solves this expression is the Viterbi algorithm, not the Forward algorithm from the second video. Just a heads-up to those who got a little lost like me.
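For anyone curious, here is a minimal sketch of that Viterbi algorithm in Python/NumPy. Every number, state name, and observation encoding below is a toy value made up for illustration, not the video's actual matrices:

```python
import numpy as np

# Toy HMM: hidden weather states, observed moods (all numbers made up).
states = ["sunny", "rainy"]
A = np.array([[0.8, 0.2],   # transition probabilities P(X_i | X_{i-1})
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],   # emission probabilities P(Y_i | X_i); columns: happy, sad
              [0.3, 0.7]])
pi = np.array([0.6, 0.4])   # initial distribution P(X_1)
obs = [0, 1, 1]             # observed sequence: happy, sad, sad

n, k = len(obs), len(states)
delta = np.zeros((n, k))           # delta[t, s]: prob of the best path ending in s at t
psi = np.zeros((n, k), dtype=int)  # back-pointers to recover that path

delta[0] = pi * B[:, obs[0]]
for t in range(1, n):
    for s in range(k):
        cand = delta[t - 1] * A[:, s]   # best way to arrive in state s at time t
        psi[t, s] = np.argmax(cand)
        delta[t, s] = cand[psi[t, s]] * B[s, obs[t]]

# Backtrack: O(n * k^2) overall, instead of brute-forcing all k^n sequences.
path = [int(np.argmax(delta[-1]))]
for t in range(n - 1, 0, -1):
    path.append(int(psi[t, path[-1]]))
path.reverse()
print([states[s] for s in path])   # e.g. ['sunny', 'rainy', 'rainy']
```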
This is gold. Thanks for uploading. Very helpful
You're welcome :D
Nice explanation. Thanks.
excellent explanation!
Super, super good. Awesome work, bro.
You have a knack for teaching! The explanation was clear and the example along with the emojis was so cute! Thank you!!!
Hi, an excellent and very easy way of explaining complex mathematical terms. Did you make these slides in PowerPoint or some other tool? Regards,
Superb explanation
Is this related to the Viterbi algorithm? Could you make a video on that?
Very well explained!
Best videos! Hope to see more videos about Markov chains! Thank you.
Sure thing!
Where can I find the software you are using for the mathematical visualization? Please advise.
It's an open-source Python library called Manim.
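For the several people asking about the tool: a minimal hello-world in the community edition of Manim looks roughly like this (the original 3b1b version of the library, which this channel may be using, has a slightly different API):

```python
from manim import Scene, Square, Create

# Render from the command line with:  manim -pql scene.py Demo
class Demo(Scene):
    def construct(self):
        # Animate drawing a square onto the canvas.
        self.play(Create(Square()))
```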
Fantastic series! I am just wondering, is there a more efficient way to calculate the maximum than trying all possible permutations? Thanks!
Can someone please explain where the P(Y) in the denominator went when the products were substituted into Bayes' theorem?
Can I use the equation to compute the maximum joint probability distribution, or is code required to do that?
This was good! Thank you.
not me studying for my final in 6 hrs
Same here :)
Bro you are a king
Thank you well explained
thank you for your videos! please continue the great work!
Thanks, will do!
Hi Normalised Nerd, and everyone reading this, I have a question, please: can I use an HMM the other way round, to find the mood sequence given the weather sequence? How?
Something that looks so hard, and you make it understandable even for a 4-year-old kid.
Thanks for the video. Is it possible to access the Python code you wrote for this problem?
We need more videos on Markov chains
You deserve my subscription thanks a lot!
:D :D
Thanks for the great video. What would the hidden-state sequence data and the observed sequence data be in a speech-to-text use case?
very nice explanation thank you
You should make more videos, you are awesome
As soon as the emojis left, the video went over my head.
hey you from the past....good video
excellent video!
Amazing video bro! you rock!
Thanks bro!
How do you get the transition probabilities?
Please do more examples for hidden Markov chains.
Great Video, Thanks!
You're welcome!
Thank you for sharing. Could you please explain how to implement an HMM for measuring earnings quality? Need your help 🙏🙏
I don't understand how you got the values of the stationary distribution. I'm trying to calculate it myself and getting totally different numbers.
Can you post your stationary distribution here?
@@NormalizedNerd (0.429 0.536 1)
I got it. I forgot to scale the values :D. Sorry.
This video is good, but it left me hanging. I was expecting you to calculate the probability at the end.
I've been going through these videos and doing the calculations by hand to make sure I understand the math correctly. When I tried to calculate the left eigenvector ([0.218, 0.273, 0.509]) using the method described in the first video (Markov Chains Clearly Explained Part 1), I got a different result ([0.168, 0.273, 0.559]), and I'm wondering if I missed a step. Here's what I did: starting with [0 0 1], meaning it is sunny, pi0·A = [0.0 0.3 0.7], pi1·A = [0.12 0.27 0.61], pi2·A = [0.168, 0.273, 0.559]. It's interesting that the second element matches. If anyone might help me understand where I went wrong, I'd greatly appreciate it!
you must start the probability of states as [0 1 0].....you will get the same values
@@sushobhitrathore2555 - lol, why [0 1 0]? Sunny is at the last position. And even if we go your way, we don't get the right result. The result of pi2·A with your method is [0.252, 0.272, 0.476].
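If anyone hits the same discrepancy: two or three multiplications is most likely just not enough; the vector in the comment above is still converging toward the stationary distribution. A quick NumPy sketch of the idea (the transition matrix below is a made-up placeholder, not necessarily the video's):

```python
import numpy as np

# Placeholder 3-state transition matrix (rows sum to 1) -- substitute the video's.
A = np.array([[0.2, 0.6, 0.2],
              [0.3, 0.0, 0.7],
              [0.5, 0.0, 0.5]])

# Power iteration: pi <- pi @ A converges to the stationary distribution
# (the left eigenvector for eigenvalue 1) from any starting distribution.
pi = np.array([0.0, 0.0, 1.0])  # e.g. start from "definitely sunny"
for _ in range(100):            # a few steps is rarely enough; iterate until stable
    pi = pi @ A
print(pi)

# Cross-check via the left eigenvector for eigenvalue 1, rescaled to sum to 1
# (skipping this rescaling gives unnormalized values, as noted above).
vals, vecs = np.linalg.eig(A.T)
v = np.real(vecs[:, np.argmax(np.real(vals))])
print(v / v.sum())
```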
Really good explanation ! Thank you!
Glad it was helpful!
Clearly explained. Superb!
Glad you liked it
Hello!!!! Please explain conditional random fields :( thank you.
How did you compute the stationary distribution for the example, which is not irreducible? The sunny-to-rainy probability is zero!