I signed up for this YouTube account just to say that this video is so helpful and clear!
This is the best video on understanding the basics of Bayesian inference
It helped a lot to understand statistical inference. Thank you! The contrast against minimum square error did it for me
Ty man, missed class due to illness and this cleared up a lot 🙏
Excellent video!
Thanks for sharing with us
Greetings from Chile
Thank you, your lesson is very helpful and interesting!!!
very clear and helpful explanation! Many thanks for this video!
Thank you so much! This saved so much of my time.
Hi Dr Zhang, Great job in explaining! The best explanation so far.
I was just wondering whether there might be an error at 6:34 regarding the 'S'.
Because S should be P[X] = Sum_to_n( P[ X | lambda = lambda_n ] ) = P(X| lambda_1) + P(X| lambda_2) + .... + P(X| lambda_n)
But in the video, S = L(lambda) = Sum_to_n( P[ X | lambda = lambda_n ] * P(lambda_n) ) = P(X| lambda_1)*P(lambda_1) + P(X| lambda_2)*P(lambda_2) + .... + P(X| lambda_n)*P(lambda_n)
Thanks in advance!
It is correct in the video. Bayes' theorem needs P(X) in the denominator, and we use the law of total probability to write it as a sum of the conditional probabilities P(X | lambda_n) over a partition of your entire space. Each term then has to be weighted by how much probability its lambda_n carries, P(lambda_n). In the end that weighted sum gives us P(X).
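For reference, the reply above can be written out as a worked equation. This is only a sketch of the standard discrete form of Bayes' theorem; the index N (the number of lambda values) is notation introduced here and is not taken from the video.

```latex
P(X) = \sum_{n=1}^{N} P(X \mid \lambda_n)\, P(\lambda_n),
\qquad
P(\lambda_n \mid X) = \frac{P(X \mid \lambda_n)\, P(\lambda_n)}{P(X)}
```

The unweighted sum P(X| lambda_1) + ... + P(X| lambda_n) is not a probability of X in general, because it ignores how likely each lambda_n is to begin with.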
Thanks for the video. Just wanna clarify, the likelihood function you mentioned actually should be the likelihood*prior instead. Can be noted as p(x|lambda)*p(lambda). Only first part is the likelihood. 😂
You are a godsend. Thanks so much!
Thanks, it's very intuitive!
Excellent explanation! The best I've watched so far! I'd appreciate it if you could give some more explanation of MCMC, I'm still struggling with it...
@5:02 why is prob(lambda = lambda_n) = 1 / (6-1) ??? Why not prob(lambda = lambda_n) = 1 / 6, since it's uniformly distributed over 6 values instead of 5 values???
Exactly the same caught my eye... however 1/6 is not correct either, if lambda ~ U(1,6) then prob(lambda = lambda_n) = 0!!! U(a,b) is a continuous probability distribution after all.
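To spell out the point about U(1,6) being continuous, here is a short worked equation. It assumes only the standard definition of the uniform density; the bounds a and b are symbols introduced here for illustration, with 1 <= a <= b <= 6.

```latex
f(\lambda) = \frac{1}{6-1} = \frac{1}{5} \quad (1 \le \lambda \le 6),
\qquad
\int_{1}^{6} \frac{1}{5}\, d\lambda = 1,
\qquad
P(a \le \lambda \le b) = \frac{b-a}{5},
\qquad
P(\lambda = \lambda_n) = 0
```

So 1/5 is a density (probability per unit length of an interval of width 5), not the probability of any single value.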
Great explanations
Crystal clear explanation, thank you!
Well explained. Thank you!
Thanks Ray. That's very clear!
Great video. You should make more!
Overall this was great and helpful. Thanks! I am tripped up though by a couple of details. Lambda is a continuous variable between 1 and 6. One thousand values for lambda are captured in the histogram. At 5:35 you show that each of these one thousand values has a probability of one fifth. This is not intuitive or compelling. At the same point on the bottom you have 50 included in the multiplication. Why? At 6:39 what is it about Bayesian models that leads to the probability of lambda n in the posterior distribution being proportional to the likelihood of lambda n? Again not intuitive. And finally why should its proportionality factor be tied to the reciprocal of the sum of the 1000 likelihoods as opposed to anything else? Any pointers or clarifications will be greatly appreciated.
One fifth is the density of lambda when it is uniformly distributed from 1 to 6. dx is a tiny increment you use to convert the density into a probability. It cancels when you calculate the ratio of L(lambda_n) to S. The Bayesian formulation is described at
en.wikipedia.org/wiki/Bayes%27_theorem#Extended_form
For the last question, the sum of the 1000 likelihoods can be understood as our existing knowledge of the total likelihood over all possible lambda. The proportion gives you the probability of each particular lambda. Because lambda is continuous, dx cancels and so you have the pdf.
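A minimal Python sketch of the cancellation described above may help. Everything specific in it is an assumption for illustration: the thread does not say which likelihood the video uses, so a Poisson model with 50 made-up observations is swapped in, and the names `x`, `lam`, and `log_likelihood` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 counts from a Poisson model with true rate 3.0 (assumption).
x = rng.poisson(lam=3.0, size=50)

# Draw 1000 candidate values of lambda from the prior U(1, 6).
lam = rng.uniform(1.0, 6.0, size=1000)
prior_density = 1.0 / (6.0 - 1.0)  # the constant 1/5 for every draw


def log_likelihood(lam, x):
    # Poisson log-likelihood, dropping the term that does not depend on lambda.
    return np.sum(x) * np.log(lam) - x.size * lam


log_l = log_likelihood(lam, x)
lik = np.exp(log_l - log_l.max())  # rescale before exponentiating to avoid underflow

# Weight of each lambda_n with the prior density included in numerator and sum S ...
w_with_prior = lik * prior_density / np.sum(lik * prior_density)
# ... and without it: a constant prior density cancels in the ratio.
w_without_prior = lik / np.sum(lik)

posterior_mean = np.sum(w_with_prior * lam)
print(np.allclose(w_with_prior, w_without_prior), posterior_mean)
```

Dividing by the sum S is what forces the weights to add up to 1, which is why that particular proportionality factor is used rather than any other.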
Why is the density 1/5? That's like saying the probability of each number when you roll a dice is 1/5...
Between, that is not the same because here the probability of less than or equal to 1 is 0.
Nice explanation
Thanks for the great example. Could you please make an easy example on MCMC?
Reza53 Thank you for commenting. It will be uploaded soon!
Great explanation, thanks!!!
Hi, your video is excellent. Could you please share the slides you used in the video?
Thank you very much, this is the best explanation ever. What kind of book should I read on this topic ?
Thanks! Very clear!
As far as I understand, you do not talk about confidence intervals in Bayesian statistics, but credible intervals. Even though the interpretation is similar, they have fundamental differences that reflect the different views of probability in the frequentist vs Bayesian interpretations.
Further I do not understand the point of sampling 1000 values of lambda. Could you not just compute the likelihood for linearly spaced values of lambda between 1 and 6?
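The linearly spaced alternative suggested here can be sketched as a grid approximation. This is a hedged illustration, not the video's code: the Poisson likelihood, the 50 simulated observations, and the names `grid`, `d_lam`, and `posterior_pdf` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 counts from a Poisson model with true rate 3.0 (assumption).
x = rng.poisson(lam=3.0, size=50)

# Evenly spaced lambda values on [1, 6] instead of random draws from the prior.
grid = np.linspace(1.0, 6.0, 1000)
d_lam = grid[1] - grid[0]

# Poisson log-likelihood up to a constant that does not depend on lambda.
log_l = np.sum(x) * np.log(grid) - x.size * grid

# Likelihood times the U(1, 6) prior density 1/5, rescaled to avoid underflow.
unnorm = np.exp(log_l - log_l.max()) * (1.0 / 5.0)

# Normalize with a Riemann sum so the result is a density over lambda.
posterior_pdf = unnorm / (np.sum(unnorm) * d_lam)
grid_mean = np.sum(grid * posterior_pdf) * d_lam
print(grid_mean)
```

For a single parameter on a bounded interval the grid works fine; random sampling becomes more attractive as the number of parameters grows, since a grid needs exponentially many points.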
The only part I don't understand is in Step 4, where the denominator is the sum of all lambda ns ("S") - shouldn't the denominator be observed samples / data (sum of xs) per Bayes' Theorem?
Thanks, it helps a lot.
It is good and really helpful. If possible, please give example with gamma prior or beta prior with coverage probability. Please tell how we can code same in matlab
good one mate !
Hello, please, I want an example of Bayes estimation.
Is the PDF of this presentation available somewhere? I am not so fond of watching videos for learning, I prefer reading it by myself
mute it and pause lol
Thanks a lot!
what does lambda actually mean?
Can you give me the code you used for Bayesian inference?
thx bro, it helps
crystal clear !
This is great
Can you send this code?
What is pdf?????
Learn how to use google ;) -> probability density function
@@gesuchter thank you sarcastic son of a bitch, it's something I've already done
because you didn't help in time
@@sergiocastrocarrasco4797 Take a breath, I understand your outrage! You'll learn it, I trust in you
Ty
👍
👍