Posterior Predictive Distribution - Proper Bayesian Treatment!

  • Published: 23 Oct 2024

Comments • 37

  • @kapilkhurana4343
    @kapilkhurana4343 1 day ago

    Kapil Sachdeva ji, thanks very much for clearing my doubt about the Bayesian equation inherently using the marginal distribution.
    You really are a great teacher 🎉❤

  • @sanjaythorat
    @sanjaythorat 1 year ago +3

    This series has a wealth of knowledge on Bayesian stats/regression. It's worth spending time wrapping your head around it.

  • @rajibkhan9749
    @rajibkhan9749 1 year ago +1

    Thanks. This helps a lot. Please keep doing such great tutorials.

  • @thachnnguyen
    @thachnnguyen 9 months ago +1

    Unbelievable how clear this is. I wish the book would just paraphrase _exactly_ what you explain here. Why do books have to be formal and obfuscating? Why not just say what you say here?

    • @KapilSachdeva
      @KapilSachdeva  8 months ago

      🙏 Thanks for the kind words

  • @sibyjoseplathottam4828
    @sibyjoseplathottam4828 3 years ago +3

    This series is one of the best explanations I have seen on Bayesian regression. Thank you!

  • @stefanvasilev8948
    @stefanvasilev8948 1 year ago +1

    Thank you so much Kapil! That was very helpful!

  • @rajns8643
    @rajns8643 1 year ago +1

    This is a gem among gems. I finally understand the motivation behind Bayesian inference. Thank you so much for these amazing tutorials! I am really grateful to you for sharing your expertise in this field!

  • @roy1660
    @roy1660 3 years ago +1

    Wonderful explanation 🙏🙏🙏🙏

  • @adesiph.d.journal461
    @adesiph.d.journal461 3 years ago +1

    This lecture is amazing! A quick question around 19:50: when looking to use the product and sum rules to expand p(t',W|x,X,t), I expanded it as p(W|t',x,X,t)p(t'|x,X,t). The intuition behind this is as follows: start with p(t',W); this expands to p(W|t')p(t'); now, applying the condition over x,X,t, it becomes p(W|t',x,X,t)p(t'|x,X,t). Intuitively I understand that W is independent of t' and x, so those two are removed, which makes it p(W|X,t); similarly, t' depends on W and x and is independent of the dataset, so those are removed, resulting in p(t'|W,x). I just wanted to confirm whether this intuition is right, especially where I am writing p(t'|x,X,t) as p(t'|x,W). Is introducing W as a condition over t' based on intuition alone the correct reasoning, or am I missing something?

    • @KapilSachdeva
      @KapilSachdeva  3 years ago +3

      If you think in terms of first expanding p(t',W) to p(W)p(t'|W) instead of p(W|t')p(t'), you will see that the rest of your conditioning and adjustments make more sense.
      Here is how I generally think about expanding joint probabilities.
      Given two events, A and B, the joint probability is p(A,B).
      According to the product rule, it can expand to either p(A)p(B|A) or p(B)p(A|B).
      We read these as follows … for the first possibility, i.e. p(A)p(B|A), we say: I have the probability of A, and since A has already occurred I must take it as a condition on B in the second term.
      When we apply this kind of thinking to the joint probability p(t',W), we can argue that W occurred, or was determined, before the new data t', and hence the conditional probability should be for t'.
      Now you can apply your conditions over x,X,t, and you then do not have to argue that W is independent of t'. (See the expansion written out below.)
      (Writing this comment on a phone, hopefully not making mistakes with notation ;) )
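      In symbols, the expansion described above is roughly (a sketch in the video's notation, using the two independence assumptions discussed in this thread: W does not depend on the new input x given the training data, and t' depends only on x and W):

          p(t', W | x, X, t) = p(W | x, X, t) p(t' | W, x, X, t)
                             = p(W | X, t) p(t' | x, W)

      so the integrand of the posterior predictive is the likelihood of the new point weighted by the posterior over W.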

    • @adesiph.d.journal461
      @adesiph.d.journal461 3 years ago +1

      @@KapilSachdeva Ahh, I see, this makes sense. Thank you so much! After a lot of effort I managed to finish the first chapter. Heading to the next one!

    • @KapilSachdeva
      @KapilSachdeva  3 years ago +2

      🙏 I really admire that you want to understand all the details. Don't be disheartened if at moments things appear hard; keep at it and it will all definitely pay off later. Good luck!

    • @adesiph.d.journal461
      @adesiph.d.journal461 3 years ago

      @@KapilSachdeva Thank you sir!

  • @azkahariz
    @azkahariz 11 months ago

    Thank you very much. Excellent presentation of the material. Very simple, with clear images that are easy to remember.
    I have a question at 19:57, when it says:
    p(t,w'|x,x',t') = p(t|x,w',x',t')p(w'|x',t')
    However, if I break down the probability distribution p(t,w'|x,x',t'), I get:
    p(t,w'|x,x',t') = p(t|x,w',x',t')p(w'|x,x',t')
    Should p(w'|x',t') and p(w'|x,x',t') be equal?

  • @yeo2octave27
    @yeo2octave27 3 years ago +1

    Thank you, this series is really amazing, especially with the intuitive explanations. I just wanted to clarify something at 18:51. When you refer to the posterior probability, it means the likelihood distribution, right? As the posterior is weighted by the likelihood distribution to give the posterior predictive distribution.

    • @KapilSachdeva
      @KapilSachdeva  3 years ago +1

      The weighting is done with the help of the posterior distribution. The second term in the integral (expectation) is the distribution whose values you use to weigh. When I say "posterior probability", it means the probability of a w sampled from the posterior distribution.
      In the expectation formulation, the first term is a function of the random variable. The second term is the probability distribution from which we get a sample/instance of the random variable. This sample/instance is passed as a parameter to the function of the random variable (i.e. the first term), and we also obtain the probability of this sample. We then multiply the output of the function of the RV by the probability of the sample.
      If there is still some confusion, please watch the first few minutes of my tutorial on importance sampling (ruclips.net/video/ivBtpzHcvpg/видео.html) in which I explain the expectation formulation conventions in a bit more detail. (The expectation form is also written out below.)
      Hope this helps.
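      In symbols (a sketch using the notation from this series, with w the weights, (X, t) the training data, and x the new input):

          p(t' | x, X, t) = ∫ p(t' | x, w) p(w | X, t) dw
                          = E_{w ~ p(w|X,t)} [ p(t' | x, w) ]
                          ≈ (1/S) Σ_{s=1..S} p(t' | x, w_s),   with w_s drawn from p(w | X, t)

      i.e. the likelihood of the new point is averaged over weight values, each weighted by its posterior probability.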

    • @yeo2octave27
      @yeo2octave27 3 years ago +1

      @@KapilSachdeva Thank you for the explanation, that was clear :)

    • @KapilSachdeva
      @KapilSachdeva  3 years ago

      🙏

  • @yogeshdhingra4070
    @yogeshdhingra4070 2 years ago

    I am trying to relate what you said at 18:47 in this way: let's say we have three independent variables X1, X2, X3 and one dependent variable Y, so we will have three different weight parameters (w1, w2, w3) corresponding to X1, X2, X3. Each of the weight parameters will have its own distribution. Now, for a new observation i, in order to find Y for the ith observation, we take each value of (w1, w2, w3) and pass it to the likelihood function, one combination at a time, and repeat this process until we have passed all possible values of w1, w2, w3 to the likelihood function; finally we add everything up to arrive at the posterior predictive distribution. Is my understanding correct? Also, will the likelihood function always be the sum-of-squared-errors function?

    • @KapilSachdeva
      @KapilSachdeva  2 years ago

      Your understanding is correct. Good job!
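      A minimal numerical sketch of the process described in this exchange (all numbers and names here are hypothetical; it assumes a Gaussian posterior over (w1, w2, w3) and a Gaussian likelihood, and uses sampling from the posterior in place of an exhaustive sweep over all weight values):

        import numpy as np

        # Hypothetical toy setup: 3 features, Gaussian posterior over (w1, w2, w3),
        # Gaussian likelihood with assumed known noise precision beta.
        rng = np.random.default_rng(0)
        beta = 25.0                                   # assumed noise precision
        m_N = np.array([0.3, -1.2, 0.8])              # assumed posterior mean of (w1, w2, w3)
        S_N = np.diag([0.05, 0.02, 0.04])             # assumed posterior covariance

        x_new = np.array([1.0, 0.5, -0.7])            # new observation (X1, X2, X3)

        # Draw many weight vectors from the posterior and push each one through
        # the likelihood's mean function w . x_new.
        w_samples = rng.multivariate_normal(m_N, S_N, size=10_000)
        mu_samples = w_samples @ x_new                # one predictive mean per weight sample

        # Adding the likelihood's Gaussian noise turns these into draws from the
        # posterior predictive distribution of Y for this observation.
        y_samples = mu_samples + rng.normal(0.0, 1.0 / np.sqrt(beta), size=mu_samples.shape)

        print("predictive mean:", y_samples.mean())
        print("predictive std :", y_samples.std())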

  • @Maciek17PL
    @Maciek17PL 1 year ago

    Just to make sure: if the right-hand side of the equation is a definite integral from -inf to inf, would I get the expected value (a single point estimate) for the predicted value, but if it's an indefinite integral, would I get a PDF of t?

    • @KapilSachdeva
      @KapilSachdeva  1 year ago

      Sorry not sure if I understand your question. Could you please rephrase/elaborate?

    • @Maciek17PL
      @Maciek17PL 1 year ago

      @@KapilSachdeva My question is: what exactly is the posterior predictive, a point estimate or a probability density function? Because when you talk about it as marginalization the integral is indefinite, and when you talk about it from the expected-value perspective it's a definite integral (deducing that from the R subscript).

    • @KapilSachdeva
      @KapilSachdeva  1 year ago +1

      The posterior predictive is a distribution.
      In Bayesian philosophy, we do not work with a single statistic (mean, mode, etc.). You construct this posterior predictive distribution and then use samples (with their respective probabilities) from it to get the expected value.
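      One way to see the distinction (a sketch): the integral over w runs over the whole weight space, while t' is left free, so the result is a density in t'; collapsing it to a single number is a further, separate integral:

          p(t' | x, X, t) = ∫ p(t' | x, w) p(w | X, t) dw      (the posterior predictive: a PDF over t')
          E[t' | x, X, t] = ∫ t' p(t' | x, X, t) dt'           (a point summary obtained from that PDF)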

    • @Maciek17PL
      @Maciek17PL 1 year ago

      @@KapilSachdeva thank you for answering

  • @solymanhoseinpor825
    @solymanhoseinpor825 2 years ago

    Where can I see the previous parts 1 to 4? Thank you.

    • @KapilSachdeva
      @KapilSachdeva  2 years ago +1

      ruclips.net/p/PLivJwLo9VCUISiuiRsbm5xalMbIwOHOOn

  • @fruityramen
    @fruityramen 2 years ago

    Do you have a video showing the derivation of equation (1.69) from (1.68) at 22:34?

    • @KapilSachdeva
      @KapilSachdeva  2 years ago +1

      I do not have a video for that, but you might be able to find the steps somewhere on the internet. I have a vague recollection of having seen the simplification of the two normal pdfs somewhere.
      Can you wait a few days? I will try to see if I can find it over the weekend.
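      For reference, the step from (1.68) to (1.69) rests on the standard Gaussian marginalization identity; a sketch in the book's later (chapter 3) notation, where φ(x) is the basis-function vector, m_N and S_N are the posterior mean and covariance of w, and β is the noise precision:

          ∫ N(t | w^T φ(x), 1/β) N(w | m_N, S_N) dw = N(t | m_N^T φ(x), σ_N²(x)),
          where σ_N²(x) = 1/β + φ(x)^T S_N φ(x)

      i.e. the predictive mean is the posterior-mean prediction, and the predictive variance adds the observation noise to the uncertainty coming from w.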

    • @fruityramen
      @fruityramen 2 years ago

      @@KapilSachdeva sure no problem!