My dude, I don't often need your teachings, but when I do you are able to single-handedly overshadow most of my past professors.
I've watched a good chunk of your videos over the past 4 years, and there wasn't a single one from which I didn't gain some new view, even if small, on the topic.
Keep up the great work.
I had two different university professors explain MCMC, but I didn't quite get it until watching your video! Best explanation ever!
I don't know what it is, but I really like this guy. He clearly knows his stuff and is articulate too. Great presentation, thank you.
Thank you so much. I'm a scientist myself and have used some MCMC packages blindly. Now, applying what I have been doing to every step of this video made me understand the full concept super clearly.
Fantastic! Note the lack of cuts and edits - this guy knows his stuff.
You have a gift for explaining things. Every question that pops into my head gets immediately answered.
Thanks!
I had been reading a 37-page paper for two hours without understanding a thing, and you've made it clear in 12 minutes! Amazing job, many thanks.
I gotta say your videos have been super helpful for a stats subject I took last semester (which involved time series, ARIMA models, stationarity, etc.), and now MCMC came out with perfect timing. You have such a gift for explaining the intuition behind statistical concepts, and I'm looking forward to future videos from you. Your channel is a treasure!
Glad I could help!
Does anyone have Python code that uses MCMC to predict closing prices? Could I have it? Thanks.
Hi Ritvik, your explanations are great in many ways. One of the best things is they are very logically coherent, leaving no gaps that require the listener to figure out. Please do keep up the splendid work. This is a major good deed for so many.
Thanks a ton!
Exactly. Was about to write the same thing!
I'm very impressed by how clear the explanation is.
Really excellent series of videos - been scratching my head over sampling methods for ages, but you explain it so succinctly and clearly it is finally making sense. Thanks for these!
Glad to help!
Thanks for making this video. I finally came across one that explains MCMC in plain words without dumping math formulas. I hope other videos and articles follow this approach.
Your channel is so underrated, you are making absolutely sick content!
wow!! The continuity in the explanation is just phenomenal, thanks a ton!
Finally someone explained why we need the Markov chain. Thank you!
You're an awesome professor. I have finally understood MCMC and Metropolis Hastings thanks to you
You, Sir, are a brilliant instructor...I am awed. Thank you!
I've watched a load of your videos in the last 4 or 5 days.
They are absolutely brilliant!!
Without your video, I think I would never have understood the key idea behind MCMC! Thanks for the good work...
At the very end it took me a second watch to realize that of course the sum of all probabilities for x given y would be 1, and thus you would get p(y) on the right-hand side (so obvious when you type it out :') ). Once again a great video. I think you really hit a sweet spot where people with basic math skills can benefit from your succinct yet in-depth explanations.
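For anyone who wants that step written out (using T(y|x) for the transition probability, which is my notation and may not match the video's): detailed balance says p(x) T(y|x) = p(y) T(x|y) for every pair x, y. Summing both sides over x gives sum_x p(x) T(y|x) = p(y) sum_x T(x|y) = p(y), since the transition probabilities out of y sum to 1. The left-hand side is exactly the probability of landing on y after one step, so p is left unchanged, i.e. stationary.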
I found this series on MCMC really helpful for my project! Thank you for your very kind support in giving good content.
Great to hear!
This video has significantly improved my base understanding of MCMC, thank you so much
Dude! That was the clearest explanation of MCMC I've ever heard. Thanks!
One of my favorite guys. Has a great knack for knowing the right balance of intuition and rigor/formal definitions.
The interpretation throughout this entire series is very helpful for understanding these topics. Could you please make a video on Bayesian regression using MCMC?
Great video! Much clearer than anything else I've seen or read about MCMC.
You are a great presenter; it is very easy to follow you. I like the clean logic of how you build up the reasoning step by step. Thank you.
A complementary note on why the detailed balance condition is valid if a distribution is stationary: it follows from Bayes' rule.
Recall the equation P(a|b) = P(b|a) p(a) / p(b).
With some rearrangement we get p(b) P(a|b) = p(a) P(b|a).
If the chain is stationary, p(a) and p(b) are constant, so the equation holds; we call it the detailed balance condition.
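If it helps, here is a minimal numerical sketch of that condition (a toy 3-state Metropolis-Hastings chain of my own making, not something from the video), checking both detailed balance and the stationarity it implies:

    import numpy as np

    # Toy 3-state target distribution we want the chain to leave invariant.
    p = np.array([0.2, 0.5, 0.3])
    n = len(p)

    # Metropolis-Hastings transition matrix with a uniform proposal over the states.
    T = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                T[i, j] = (1.0 / n) * min(1.0, p[j] / p[i])  # propose j, accept with prob min(1, p[j]/p[i])
        T[i, i] = 1.0 - T[i].sum()  # stay put whenever the proposal is rejected

    # Detailed balance: p(i) T(i -> j) == p(j) T(j -> i) for every pair of states.
    flow = p[:, None] * T
    print(np.allclose(flow, flow.T))  # True

    # Stationarity follows: one step of the chain leaves p unchanged.
    print(np.allclose(p @ T, p))      # True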
I have never seen such an in-depth explanation of MCMC! Thanks a lot, bro.
Do you have any Python code that uses MCMC to predict closing prices? Could I have it? Thanks.
This is going to be super helpful for a future interview :) Thanks!
I expect that by watching this video, my uptake of this material is so much better than with any textbook alone. YT and presenters like ritvikmath are the way to learn new STEM stuff for sure. Much faster and easier this way. It's like when they finally translated the Bible from Latin to English, and now I don't need to suffer with the Latin version any more. haha
Hey all; here is the Markov Chain Stationary Distribution Video Link: ruclips.net/video/4sXiCxZDrTU/видео.html
Awesome! Looking forward to more on McMC.
More to come!
Thanks for sharing. I'm beginning to love learning.
Very nice way of introducing the topic. It might be worth pointing out that the detailed balance equations are a sufficient condition for stationarity (reversible chain), but not a necessary condition.
Great stuff. I'll be running through all your videos.
This video just saved my day.
You're so welcome!
REQUEST: Please organize this playlist in sequential / logical order. Example: the first video of this playlist is Markov Chains (MCMC), which refers to a previous video for accept-reject sampling, but that video is 13th in this playlist. So it's like watching things in random order.
I like the way you teach. Thanks for these videos.
Thanks, very informative! I really like the way you explain things.
Thanks for explaining beautifully.
I love your videos and you really simplify concepts. My only comment is that sometimes I get confused or don't know the applications for a concept.
Shit… good stuff! I've just gone through 4 of your videos instead of going to pick up dinner. Bravo sir!
That's a very clear explanation. Thank you bro
Fantastic. Are you just going through Chris Bishop's book and making videos to help us out? I'm reading it at the moment and keep finding content on your channel. It really is quite helpful in providing intuition for a very dense subject.
Very clear description. Thank you!
You are great teacher! Deep respect!
this is an amazing explanation!
That clears everything, thank you.
Hey your videos are the best!
Yin! Thanks :D
Thank you for making this video! Your explanation is superb and easy to follow. Much appreciated!!
Craving it more than a new Netflix series!
Thanks for this, really enjoyed your explanation.
Fantastic explanation! Now I got all the intuition I need to work through the formulas in our lecture :)
I'm speechless; your presenting style and explanatory power are insane!!! Thank you so much. I'm just getting into this stuff and the reading is tricky.
Liked, subbed, etc. 👍👌😁
This video is awesome, thank you!!!
This guy is really fantastic
I wish Ian Goodfellow's book explained MCMC like you do. And I wish my professors back in university could teach and give intuition like this video does. I would have been much more interested in stats and data science if it had been taught properly.
Bro, you're literally saving lives here, thx.
So the Monte Carlo part refers to the eventual sampling from the stationary Markov Chain? I kind of missed where it comes in, except for the board title.
The Monte Carlo part refers to simulating steps through the Markov chain. So we design a Markov chain with some transition probabilities, and then we start at some x0 and step from one state to the next, which is the Monte Carlo part.
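A minimal sketch of that stepping, with a toy transition matrix of my own (nothing from the video):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy 3-state chain: T[i, j] = P(next state = j | current state = i).
    T = np.array([[0.5, 0.3, 0.2],
                  [0.1, 0.6, 0.3],
                  [0.2, 0.3, 0.5]])

    x = 0                      # start at some x0
    samples = []
    for _ in range(10000):     # each iteration randomly draws the next state: the Monte Carlo part
        x = rng.choice(3, p=T[x])
        samples.append(x)

    # After many steps the visit frequencies approximate the stationary distribution.
    print(np.bincount(samples) / len(samples))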
exceptional content!
Great lectures! Awesome!
Thank you so much for this video. This is really helpful for my undergraduate research work. One thing I'm finding difficult to understand is why we use "thinning" in MCMC. From what I have read so far, it aims to reduce autocorrelation, but why? Please tell me your thoughts on this. I appreciate it a lot. TIA
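Not an authoritative answer, but a minimal sketch of what thinning does mechanically: successive MCMC draws are correlated, and keeping only every k-th draw trades away samples for lower autocorrelation. Here a toy AR(1) series (my own example) stands in for raw chain output:

    import numpy as np

    rng = np.random.default_rng(1)

    # Toy, highly autocorrelated series standing in for raw MCMC output.
    n, rho = 100_000, 0.95
    chain = np.zeros(n)
    for t in range(1, n):
        chain[t] = rho * chain[t - 1] + rng.normal()

    def lag1_autocorr(x):
        return np.corrcoef(x[:-1], x[1:])[0, 1]

    thinned = chain[::20]             # thinning: keep every 20th draw
    print(lag1_autocorr(chain))       # roughly 0.95
    print(lag1_autocorr(thinned))     # noticeably smaller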
Thank you, this helped me a lot
I get a philosophy from this. The objective is actually to design the appropriate transition probabilities. It's like building workout and healthy eating habits if you want to reach a body goal.
Perfect analogy!
it's fun to stay at the mcmc
I'm using this playlist as support material for CS229 in 2025.
Brilliant. One word.
At 6:55 you say "The probability that x_B is any of these x's on this line is exactly the probability p(x)." What does this mean? It sounds like you're saying that for any number x on the line, the probability that x_B = x is p(x). But the possible values of the Markov chain form a countable set, so for any x that's not in this countable set (which is almost all points on the line) x doesn't equal any x_B. I think by "any of these x's on this line" you mean just the x values that occur in the Markov chain.
How do we know the p(x) that should be the steady state of our MC? I think p(x) is the black box that we do not know and want to sample from in order to find it. If we have p(x), what obstacle prevents us from sampling from it directly? This is a little confusing for me in all the sampling videos on RUclips.
Thank you! Very helpful for me.
You're welcome!
Amazing !
I like your concepts. Do you have any references (books) for citation, in case I want to include your formulae in my presentation?
How exactly should the end of the burn-in be detected and decided by an iterative algorithm? It is a random variable that is being monitored, so it keeps jumping around (you can't just check whether it goes flat compared to prior values), and you don't have the true value to compare against either, because otherwise you'd already have your goal in hand at the very beginning.
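There is no single agreed-upon rule; one common practical idea (not from the video, just a sketch) is to run several chains from very different starting points and treat everything before their summaries agree as burn-in, which is the intuition behind the Gelman-Rubin diagnostic. A crude toy version:

    import numpy as np

    rng = np.random.default_rng(2)

    def run_chain(x0, n=5000):
        """Toy random-walk Metropolis chain targeting a standard normal."""
        x, out = x0, []
        for _ in range(n):
            prop = x + rng.normal()
            if np.log(rng.random()) < 0.5 * (x**2 - prop**2):  # accept/reject step
                x = prop
            out.append(x)
        return np.array(out)

    # Two chains started far apart; once their running means agree,
    # treat everything before that point as burn-in.
    a, b = run_chain(-10.0), run_chain(+10.0)
    running_mean = lambda c: np.cumsum(c) / np.arange(1, len(c) + 1)
    burn_in = int(np.argmax(np.abs(running_mean(a) - running_mean(b)) < 0.5))
    print(burn_in)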
To be clear, the following comment is in no way a criticism; rather, it's a line of thinking about how I can use this tool on a project. Can you also demonstrate a powerful application or two of this powerful method, on real data from a business, institution, or science dataset? Is this machinery intended for making better simulations? Such as...? Compared against a baseline that does not use it, how much better is the answer to the problem? In this vein, some excellent-looking software frameworks to help use MCMC were recently described very well by Kapil Sachdeva, also on RUclips, particularly PyMC3, Stan, NumPyro, and TFProb. (Sorry for YT, but I expect YT will interfere if I provide a URL in this comment linking directly to Kapil Sachdeva.)
Awesome, thanks a ton! Waiting for further videos on MCMC. Could you please do a video on Hamiltonian Monte Carlo too?
Great suggestion!
What a great video.
@ritvikmath By any chance, would you happen to have some notes presenting the topic in more depth? I have a general idea of the method but I'm having trouble wrapping my head around some methods presented in papers. If not, it's okay!
Great video! Really liked the high-level explanation to get us comfortable with the ideas behind these methods. Quick question: I'm assuming we don't know p(x), so how do we construct a stationary distribution about p(x)?
Thank you, you are always the best. I am working on Bayesian network structure learning using Gibbs sampling. Could you suggest the best book or video to help me work through this, please? Thank you.
Loved your explanation, but can you please organize, in order, the videos I need to see before watching the "Markov Chain Monte Carlo (MCMC) : Data Science Concepts" video? All the videos are scattered all over the place.
Very useful!
Glad you think so!
Really great video. A quick question though: what if I want to approximate f(x)? Currently I am using a form of MCMC to estimate the state probabilities from n samples.
You are a godsend!
you're a legend
One of the hypotheses of "rejection sampling" is that samples must be independent. But here, in MCMC, they are not independent.
I can't understand why this is still acceptable.
I'm just here because there is a gun in Destiny 2 called Monte Carlo, which in turn has a perk called Markov Chain.
I get why it was called that now.
Lol
@ritvikmath I watched the whole video, really well done. While most of it went over my head, the concept was well explained.
Could you please make a video on Sequential Monte Carlo (SMC) and Hamiltonian Monte Carlo (HMC)?
Also, could you maybe make a video on where in Data Science sampling techniques like MCMC (Gibbs, Metropolis ...) are useful? Missing data imputation? Would be highly appreciated!
So here you are saying that stationary is not to have the same probability, the same number, but to have the same p(x), which is a distribution, a function?
KING you are KING
Video on Copulas please
Question - where does the first sample come from?
Any chance of doing the EM algorithm?
love the intro
Can you do a lesson on Gaussian Copula, please?
Could you please make a video on Sequential Monte Carlo (SMC)?
Excellent teacher!
Clear. Thank you.
When are you going to do Hamiltonian MCMC? It's so hard to understand.