I signed up for an account just to say that this video is so helpful and clear!
It helped a lot in understanding statistical inference. Thank you! The contrast with minimum mean square error estimation did it for me.
This is the best video on understanding the basics of Bayesian inference.
Excellent video!
Thanks for sharing this with us
Greetings from Chile
Thanks for the video. Just want to clarify: the "likelihood function" you mentioned should actually be likelihood times prior, which can be written p(x|lambda)*p(lambda). Only the first factor is the likelihood. 😂
Ty man, missed class due to illness and this cleared up a lot 🙏
Thank you, your lesson is very helpful and interesting!!!
Great explanations
Thanks, it's very intuitive!
Thanks a lot!
You are a godsend. Thanks so much!
Hi Dr Zhang, Great job in explaining! The best explanation so far.
I was just wondering whether there might be an error at 6:34 regarding the 'S'.
Because S should be P(X) = Sum_n( P(X | lambda = lambda_n) ) = P(X|lambda_1) + P(X|lambda_2) + ... + P(X|lambda_n)
But in the video, S = L(lambda) = Sum_n( P(X | lambda = lambda_n) * P(lambda_n) ) = P(X|lambda_1)*P(lambda_1) + P(X|lambda_2)*P(lambda_2) + ... + P(X|lambda_n)*P(lambda_n)
Thanks in advance!
It is correct in the video. Because of Bayes' theorem there must be a P(X) in the denominator, and we use the law of total probability to write it as a sum over the conditional probabilities P(X | lambda_n), since the lambda_n partition the entire space. Each term then needs its weight P(lambda_n) in the probability space. In the end this gives us P(X).
Excellent explanation! The best I have watched so far! I'd appreciate it if you could give some more explanation of MCMC; I'm still struggling with it...
Well explained. Thank you!
Crystal clear explanation, thank you!
very clear and helpful explanation! Many thanks for this video!
The only part I don't understand is in Step 4, where the denominator is the sum over all lambda_n ("S"). Shouldn't the denominator be the observed samples/data (a sum over the x's), per Bayes' theorem?
Overall this was great and helpful. Thanks! I am tripped up, though, by a couple of details. Lambda is a continuous variable between 1 and 6, and one thousand values of lambda are captured in the histogram. At 5:35 you show that each of these one thousand values has a probability of one fifth, which is not intuitive or compelling. At the same point, on the bottom, you have 50 included in the multiplication. Why? At 6:39, what is it about Bayesian models that makes the probability of lambda_n in the posterior distribution proportional to the likelihood of lambda_n? Again, not intuitive. And finally, why should the proportionality factor be the reciprocal of the sum of the 1000 likelihoods, as opposed to anything else? Any pointers or clarifications will be greatly appreciated.
One fifth is the density of lambda when it is uniformly distributed from 1 to 6. dx is a tiny increment you use to convert the density function into a probability; it cancels when you take the ratio of L(lambda_n) to S. The extended form of Bayes' theorem can be found at
en.wikipedia.org/wiki/Bayes%27_theorem#Extended_form
For the last question: the sum of the 1000 likelihoods can be understood as our knowledge of the total likelihood over all possible lambda, and the ratio gives the probability of each particular lambda. Because lambda is continuous, dx cancels, and so you end up with the pdf.
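A minimal sketch of that dx cancellation on a grid (made-up data, Poisson likelihood assumed for concreteness; the grid values, not the video's sampled ones, are hypothetical): the uniform prior contributes the same factor (1/5)*dx to every term, so it drops out of the ratio L(lambda_n)/S.

```python
import math

def likelihood(lam, xs):
    # Poisson likelihood of the observations xs given rate lam
    return math.prod(math.exp(-lam) * lam**x / math.factorial(x) for x in xs)

xs = [4, 6, 5]                                  # hypothetical observed data
n = 1000
dx = (6 - 1) / n                                # grid spacing over [1, 6]
lams = [1 + (i + 0.5) * dx for i in range(n)]   # 1000 grid values of lambda

L = [likelihood(lam, xs) for lam in lams]
S = sum(L)

# Every term carries the same prior factor (1/5)*dx, so it cancels:
# posterior mass at lambda_n is just L(lambda_n) / S.
posterior = [Ln / S for Ln in L]
pdf = [p / dx for p in posterior]               # divide by dx to recover a density
```

The `posterior` list sums to one by construction, and dividing by `dx` turns the per-gridpoint masses back into the posterior pdf.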
Why is the density 1/5? That's like saying the probability of each number when you roll a die is 1/5...
Besides, it is not quite the same, because here the probability of lambda less than or equal to 1 is 0.
Hi, your video is excellent. Could you please share the slides you used in the video?
@5:02 Why is prob(lambda = lambda_n) = 1/(6-1)??? Why not prob(lambda = lambda_n) = 1/6, since it's uniformly distributed over 6 values instead of 5???
Exactly the same caught my eye... however, 1/6 is not correct either: if lambda ~ U(1,6), then prob(lambda = lambda_n) = 0!!! U(a,b) is a continuous probability distribution, after all.
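A quick way to see the distinction (a minimal sketch, not from the video, with an arbitrary grid size of 1000): the continuous Uniform(1,6) has density 1/(6-1) = 0.2 everywhere on [1,6], while any single value only picks up probability through a small interval of width dx.

```python
# Density of Uniform(a, b) is 1/(b - a) on [a, b]; point probabilities are 0.
a, b = 1, 6
density = 1 / (b - a)        # 0.2 for every lambda in [1, 6]

dx = (b - a) / 1000          # width of one small grid cell
point_mass = density * dx    # probability of landing in that cell; -> 0 as dx -> 0
```

So 1/5 is a density, not a probability of any particular lambda_n, which is why the dx factor cancels later rather than mattering on its own.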
Nice explanation
Thank you so much! This saved so much of my time.
what does lambda actually mean?
Thanks for the great example. Could you please make an easy example on MCMC?
Reza53 Thank you for commenting. It will be uploaded soon!
Thanks Ray. That's very clear!
Great video. You should make more!
Thanks! Very clear!
Hello, please, may I have an example of Bayes estimation?
Thank you very much, this is the best explanation ever. What kind of book should I read on this topic ?
As far as I understand, in Bayesian statistics you do not talk about confidence intervals but about credible intervals. Even though the interpretation is similar, they have fundamental differences that reflect the different views of probability in the frequentist vs. Bayesian interpretations.
Further, I do not understand the point of sampling 1000 values of lambda. Could you not just compute the likelihood for linearly spaced values of lambda between 1 and 6?
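Linearly spaced values do work for a one-dimensional grid like this one. A minimal sketch (made-up data, Poisson likelihood assumed for concreteness; not the video's code) that also reads off a central 95% credible interval from the gridded posterior:

```python
import math

def likelihood(lam, xs):
    # Poisson likelihood of the observations xs given rate lam
    return math.prod(math.exp(-lam) * lam**x / math.factorial(x) for x in xs)

xs = [2, 4, 3, 3]                                # hypothetical observed data
n = 1000
lams = [1 + 5 * i / (n - 1) for i in range(n)]   # linearly spaced, no sampling needed

L = [likelihood(lam, xs) for lam in lams]
S = sum(L)
post = [Ln / S for Ln in L]                      # normalized posterior mass on the grid

# Central 95% credible interval: walk the posterior CDF to the 2.5% and 97.5% points
cdf, lo, hi = 0.0, None, None
for lam, p in zip(lams, post):
    cdf += p
    if lo is None and cdf >= 0.025:
        lo = lam
    if hi is None and cdf >= 0.975:
        hi = lam
```

Random sampling of lambda mainly matters in higher dimensions, where a regular grid becomes infeasible; in 1-D the deterministic grid is simpler and reproducible.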
Crystal clear!
Great explanation, thanks!!!
Ty
Can you share your code for Bayesian inference?
Thanks, it helps a lot.
This is great
Can you send this code?
thx bro, it helps
Is the PDF of this presentation available somewhere? I am not so fond of watching videos for learning; I prefer reading on my own.
mute it and pause lol
good one mate !
It is good and really helpful. If possible, please give an example with a gamma prior or a beta prior, with coverage probability. Please tell us how we can code the same in MATLAB.
What is pdf?????
Learn how to use google ;) -> probability density function
@@gesuchter Thank you, sarcastic son of a bitch, it's something I've already done, given your less-than-timely help.
@@sergiocastrocarrasco4797 Take a breath, I understand your outrage! You'll learn it, I trust in you
👍
👍