The Surprisingly Effective Magic of Partial Pooling

  • Published: Sep 9, 2024

Comments • 54

  • @gergerger53
    @gergerger53 2 years ago +15

    I am so glad I found your channel a few months ago. Doing a PhD in lockdown meant that 2 years of interesting chats with friends and colleagues in the coffee breaks or overhearing new methodologies was completely wiped out. Your videos, especially in the style of this one, really fill that gap for me and provide so many points and good informal explanations of things I'd expect would be happening had universities been open. It feels like when you take a break and have an academic chat with someone and you ask someone to explain something (but you have the added benefit of visualisations). Top stuff, man! Plus: cats!

    • @ritvikmath
      @ritvikmath  2 years ago +4

      I feel your struggle. I had to do most of grad school remotely and it really made me appreciate the value of an in-person education

    • @jaeheehwang5269
      @jaeheehwang5269 1 year ago

      I also felt like this video resembles a casual explanation given by a peer in a University grad school setting. Loved it!

  • @RisetotheEquation
    @RisetotheEquation 2 years ago +4

    Now that is a cute cat!

  • @lashlarue7924
    @lashlarue7924 1 year ago

    I love the way you say, "Bayesian". I say it like that too. 👍

  • @jurian0101
    @jurian0101 2 years ago +2

    Hi, this is famously how the IMDb rating system works: Bayesianly. When a film has only a few ratings, its score is weighted so that it can't deviate very much from the average of all movie ratings. Only once the film gathers many more ratings, so that some wisdom of the crowd can be relied on, is that weight lessened. So a few angry haters or crazy fans can't easily make a movie rated the worst or best of all time.
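
A minimal sketch of the kind of weighting this comment describes, assuming a formula of the form (v / (v + m)) * R + (m / (v + m)) * C, where R is the film's own mean, v its number of votes, C the mean over all films, and m a number of "phantom" votes placed at C. The function name and every number here are hypothetical and chosen for illustration; they are not IMDb's actual parameters.

```python
def weighted_rating(film_mean, num_votes, prior_votes, global_mean):
    """Shrink a film's mean rating toward the global mean.

    prior_votes behaves like a number of phantom votes cast at global_mean,
    so films with few real votes stay close to the global mean.
    """
    weight = num_votes / (num_votes + prior_votes)
    return weight * film_mean + (1 - weight) * global_mean


# Hypothetical values, for illustration only.
global_mean = 7.0
prior_votes = 100

# Two films that both average a perfect 10, with very different vote counts.
few = weighted_rating(10.0, num_votes=5, prior_votes=prior_votes, global_mean=global_mean)
many = weighted_rating(10.0, num_votes=10_000, prior_votes=prior_votes, global_mean=global_mean)

print(f"5 votes:      {few:.2f}")
print(f"10,000 votes: {many:.2f}")
```

The film with 5 votes stays near the global mean, while the film with 10,000 votes keeps almost all of its own score.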

    • @ritvikmath
      @ritvikmath  2 years ago +1

      Wow! I didn't know. Thanks!

  • @ahmedsalem8697
    @ahmedsalem8697 2 years ago +8

    This is really one of the best videos I have ever watched and enjoyed in a looong time. The idea, your explanation, and bringing Bayesian and regularization topics here are really awesome! Thanks a lot for that!

  • @jacobmoore8734
    @jacobmoore8734 2 years ago

    Thank you Mr. Bayesian Sensation for bringing the mathematical heat!

  • @gopsda
    @gopsda 7 months ago

    Ritvik, Thanks for the video. The way you covered the partial pooling concept using regularization appeals to me.
    When we compute the weighted average in partial pooling, we can also leave the shop's own samples out of the pooled average (kind of a leave-one-out method). That way we only allow in information from outside the sample.
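
A rough sketch of the leave-one-out variant suggested here: each shop is shrunk toward the mean of the other shops' samples only. The weight alpha = n / (n + k), the constant k, the function name, and the temperatures are all assumptions made for illustration, not values from the video.

```python
def partial_pool_loo(samples_by_shop, k=5.0):
    """Shrink each shop's mean toward the mean of all OTHER shops' samples.

    Requires at least two shops. k is a hypothetical tuning constant:
    larger k pulls small shops harder toward the outside mean.
    """
    estimates = {}
    for shop, samples in samples_by_shop.items():
        # Pool every sample that does not belong to the current shop.
        others = [
            x
            for other_shop, xs in samples_by_shop.items()
            if other_shop != shop
            for x in xs
        ]
        shop_mean = sum(samples) / len(samples)
        others_mean = sum(others) / len(others)
        alpha = len(samples) / (len(samples) + k)
        estimates[shop] = alpha * shop_mean + (1 - alpha) * others_mean
    return estimates


# Hypothetical temperature readings; shop A contains one cold outlier.
data = {
    "A": [60, 62, 58, 60, 10],
    "B": [61, 59, 60, 60, 60, 60, 60, 60, 60, 60],
    "C": [58, 62, 60],
}
estimates = partial_pool_loo(data)
for shop, value in estimates.items():
    print(shop, round(value, 2))
```

Because shop A's outlier no longer contributes to the value A is pulled toward, its estimate moves further back toward the other shops than it would with an ordinary pooled mean.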

  • @lucasm4299
    @lucasm4299 2 years ago +1

    We covered this in my undergraduate Bayesian stats class. Pretty cool 👍

  • @jiachengli9061
    @jiachengli9061 2 years ago +3

    This concept is so similar to one of my research topics, called "small area estimation" in statistics. The basic idea is also to borrow information from other areas when the sample sizes are relatively small. And btw, you are the best YouTuber I have ever seen! You convey ideas and concepts in a way that's easy to follow, and I have already recommended you to many of my friends! Keep going!

    • @ritvikmath
      @ritvikmath  2 years ago

      Thank you for your kind words! And it's awesome to hear this is related to your work! This idea of borrowing information is still relatively new to me so it's amazing to hear when others have been using it as well

  • @jasdeepsinghgrover2470
    @jasdeepsinghgrover2470 2 years ago +2

    Next step.. Bayesian AB testing :-)

  • @BrianMburu326
    @BrianMburu326 2 years ago

    Credibility theory plays with the concept. Love the oranges BTW.

  • @carlosdesantiago1356
    @carlosdesantiago1356 2 years ago

    This is my new favorite channel!

  • @Kat-gp6gj
    @Kat-gp6gj 1 year ago

    Beautiful calico kitty

  • @agkol92
    @agkol92 2 years ago

    Love your style of explaining. Thanks a lot.

  • @ResilientFighter
    @ResilientFighter 2 years ago

    As always a huge fan of your videos and teaching style

  • @BreezeTalk
    @BreezeTalk 2 years ago

    Big fan of yours, recently got into my honours in stats and data science and my term just began. thanks to you i understood time series much better. thank you ritvik

  • @andrashorvath2411
    @andrashorvath2411 1 year ago

    Great content. Alpha could also depend on the standard error of the mean. That way the uncertainty would be captured much better than by considering the sample size alone. Thanks for your videos.
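
One way this suggestion could look, as a sketch only: a precision-weighted alpha = tau^2 / (tau^2 + se^2), where se is the shop's standard error of the mean and tau is an assumed spread of true means across shops. The function name, the value of tau, and the sample readings are hypothetical, and this is a standard normal-model weighting rather than anything confirmed by the video.

```python
import statistics


def precision_alpha(samples, between_shop_sd):
    """Weight on the shop's own mean, based on its standard error.

    alpha = tau^2 / (tau^2 + se^2), with tau = between_shop_sd (an assumed
    value here) and se = sample standard deviation / sqrt(n).
    Tight samples give a small se and an alpha close to 1.
    """
    se = statistics.stdev(samples) / len(samples) ** 0.5
    return between_shop_sd**2 / (between_shop_sd**2 + se**2)


# Same sample size, same mean, very different spread (hypothetical readings).
tight = [60, 61, 59, 60, 60]
noisy = [60, 80, 40, 70, 50]
tau = 5.0  # assumed spread of true shop means

alpha_tight = precision_alpha(tight, tau)
alpha_noisy = precision_alpha(noisy, tau)
print(f"tight shop alpha: {alpha_tight:.3f}")
print(f"noisy shop alpha: {alpha_noisy:.3f}")
```

Both shops have 5 samples, so a purely size-based alpha would treat them identically; here the consistent shop keeps nearly all of its own mean while the noisy one is pulled toward the pooled value.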

  • @junkbingo4482
    @junkbingo4482 2 years ago

    To fix the problem of skewness, you can use the median as well. Really, everything comes down to: what is my goal, what tools are available to achieve it, and what are the pros and cons of each tool?

    • @junkbingo4482
      @junkbingo4482 2 years ago

      I meant skewness and the influence of a data point on its mean.

  • @yuckbutyup
    @yuckbutyup 2 years ago

    very helpful!

  • @C0DEWARR10R
    @C0DEWARR10R 2 years ago

    This is basically like buying insurance. You pay a tolerable premium (mildly inaccurate mean every time there is no outlier in the sample which is more often the case) in the hope that you don't lose your job that rare day when an outlier does come calling. It may be effective in various situations, but it certainly doesn't come for free.

  • @jessicatran5467
    @jessicatran5467 2 years ago +1

    need to try some of that oat milk

  • @ScottSummerill
    @ScottSummerill 2 years ago +8

    YES, very cool. That said, I have never understood using the mean when you could use the median. I get that the average is better recognized but like you said it's susceptible to outliers. Both are measures of central tendency. Why the mean?

    • @Darkev77
      @Darkev77 2 years ago +2

      I guess for well-behaved distributions the difference between the median (most common) and mean (best representation) is quite subtle, but if you assume a skewed distribution where most sample points lie at the ends of a distribution with a few samples in the middle, then the difference between the mean and the median gets more drastic.

    • @andrashorvath2411
      @andrashorvath2411 1 year ago

      In the long run, the mean gives the best prediction of the expected value, differing the least in total from future values, provided there are no outliers or measurement errors in the data. If there are, that's a different question: they need to be removed beforehand, or other measures should be used.
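
A small worked example of the trade-off in this thread, using made-up temperatures. It relies on two standard facts: the mean is the single value that minimizes total squared error, and the median minimizes total absolute error. The helper names are hypothetical.

```python
import statistics

# Hypothetical readings from one shop, including one cold outlier.
temps = [60, 61, 59, 60, 10]

mean = statistics.mean(temps)
median = statistics.median(temps)


def squared_error(center):
    """Total squared distance from the samples to a candidate center."""
    return sum((x - center) ** 2 for x in temps)


def absolute_error(center):
    """Total absolute distance from the samples to a candidate center."""
    return sum(abs(x - center) for x in temps)


print("mean:", mean, "median:", median)
print("squared error  at mean / median:", squared_error(mean), squared_error(median))
print("absolute error at mean / median:", absolute_error(mean), absolute_error(median))
```

The single outlier drags the mean a long way from the typical reading while the median barely moves, which is the sense in which the median is more robust; the mean still wins if squared error is the criterion you care about.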

  • @ChocolateMilkCultLeader
    @ChocolateMilkCultLeader 2 years ago

    Another gem

  • @pypypy4228
    @pypypy4228 1 year ago

    Awesome video and concept! Would bootstrapping (sampling with replacement) from a group of 5 samples be another option for tackling the outlier problem?

  • @anarok9674
    @anarok9674 2 years ago +3

    Hi @ritvikmath, thanks for a great video. I have a question - is there some mathematical way to compute Alpha based on sample sizes, or is it a tunable hyperparameter?

  • @Droobilicious
    @Droobilicious 1 year ago

    Great video mate. Got a subscribe from me.

  • @prentonc
    @prentonc 2 years ago

    Hey, brilliant videos. Any chance we’ll get a series on causal modelling?

  • @jessicatran5467
    @jessicatran5467 2 years ago

    YOUR CAT!!!!

  • @Darkev77
    @Darkev77 2 years ago +4

    But by this technique, aren't we implicitly assuming a priori that these different systems (coffee shops) are somewhat related/dependent on each other, i.e., we're assuming that they collectively behave in a similar way (they usually serve hot coffee rather than cold)?

    • @ritvikmath
      @ritvikmath  2 years ago +4

      Great point and two follow ups:
      - with larger sample sizes per cafe, the pull towards the pooled average gets weaker and weaker so we gradually relax that similarity assumption between cafes
      - with small sample sizes we are indeed making an assumption that the temperatures at the various shops are close. This very well could be wrong but the logic is that since all our coffee shops are in the same city and in the same food & drink category, we assume the temperatures cannot be wildly different
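
The two points above can be sketched in a few lines. This assumes a weight of the form alpha = n / (n + k), which is one common choice and not necessarily the one used in the video; the constant k, the function name, and the temperatures are hypothetical.

```python
def partial_pool(samples_by_shop, k=5.0):
    """Return {shop: (alpha, estimate)} using partial pooling.

    estimate = alpha * shop_mean + (1 - alpha) * pooled_mean,
    with alpha = n / (n + k). k is an assumed tuning constant.
    """
    all_samples = [x for xs in samples_by_shop.values() for x in xs]
    pooled_mean = sum(all_samples) / len(all_samples)

    results = {}
    for shop, xs in samples_by_shop.items():
        n = len(xs)
        alpha = n / (n + k)  # grows toward 1 as the shop gets more samples
        shop_mean = sum(xs) / n
        results[shop] = (alpha, alpha * shop_mean + (1 - alpha) * pooled_mean)
    return results


# Hypothetical readings: a small shop with one cold cup, and a large shop.
data = {
    "small": [10, 60, 60, 60, 60],
    "large": [60] * 45,
}
results = partial_pool(data)
for shop, (alpha, estimate) in results.items():
    print(f"{shop}: alpha={alpha:.2f}, estimate={estimate:.2f}")
```

The small shop is pulled noticeably toward the pooled mean, while the large shop's estimate stays very close to its own mean, so the similarity assumption matters less and less as samples accumulate.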

    • @user-sl6gn1ss8p
      @user-sl6gn1ss8p 2 years ago

      @ritvikmath couldn't you incorporate the fact that maybe a coffee shop is really an outlier by taking into account how much its own samples agree?
      Say a coffee shop has all 5 of its samples between two and four standard deviations from the global average. That likely means the shop itself is different and the global average shouldn't get as much weight.

    • @ritvikmath
      @ritvikmath  2 years ago +2

      that seems like a good idea for medium sample sizes! (I'm thinking like 5 - 10)

    • @Darkev77
      @Darkev77 2 years ago

      @ritvikmath thanks so much for the clarification!

    • @Darkev77
      @Darkev77 2 years ago

      @user-sl6gn1ss8p or why don't we consider the intra-distribution (within-class) standard deviation? If it's comparatively higher for some coffee shop "X", then we reduce its alpha (weighted average) value accordingly; if it's comparatively lower, then we're more confident in that coffee shop and hence make its alpha bigger (less dependent on the pooled mean).

  • @sayeedmuratbekov887
    @sayeedmuratbekov887 2 years ago

    Cat is adorable! What is its name?

  • @newwaylw
    @newwaylw 1 year ago

    So why don't we identify the outliers in each group? E.g. ignore values that are x standard deviations away from the mean of each group?
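
A quick sketch of what this rule does on a 5-sample group, with made-up temperatures and a hypothetical helper name. One known limitation applies here: with n samples, no point can sit more than (n - 1) / sqrt(n) sample standard deviations from the sample mean, which is about 1.79 for n = 5, so a "2 standard deviations" cutoff can never fire on a group this small.

```python
import statistics


def flag_outliers(samples, num_sd=2.0):
    """Return the samples lying more than num_sd sample SDs from the mean."""
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)
    if sd == 0:
        return []
    return [x for x in samples if abs(x - mean) > num_sd * sd]


# Hypothetical readings: four identical cups and one very cold one.
temps = [60, 60, 60, 60, 10]

# The cold cup inflates the standard deviation it is being judged against.
z_cold = abs(10 - statistics.mean(temps)) / statistics.stdev(temps)
print(f"z-score of the cold cup: {z_cold:.3f}")

flagged_at_2 = flag_outliers(temps, num_sd=2.0)
flagged_at_1_5 = flag_outliers(temps, num_sd=1.5)
print("flagged at 2.0 SD:", flagged_at_2)
print("flagged at 1.5 SD:", flagged_at_1_5)
```

So the cutoff has to be chosen with the group size in mind, which is part of why borrowing information from the other groups can be attractive when each group is tiny.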

  • @Septumsempra8818
    @Septumsempra8818 1 year ago

    Can we get some code for this???
    🇿🇼🇿🇦

  • @badrelhamzaoui6314
    @badrelhamzaoui6314 2 years ago

    Hi, I want to ask you about research I'm doing on the efficiency of financial markets. The Ljung-Box residual check for my ARIMA model gave p-value = 0.02. What is the correct interpretation for the efficiency of the market?

  • @gravious
    @gravious 2 years ago

    Would this be a good way of estimating an overall average with a goal to finding outliers that could live outside of the mean average of the individual shop? Or a shop that as a whole would be an outlier within the larger set?

  • @shyft09
    @shyft09 2 years ago +1

    Seems kind of dangerous to me, wouldn't it artificially smooth everything out? What if one of these cafes really did serve cold coffee, all 5 coffees were consistently cold, not just one outlier - then those results would also be mixed in to the average of the other cafes making that one cafe look less unusual than it is in reality. Could you weight the pooling based not just on the sample size of the one cafe, but also the spread of the results from that one cafe maybe? (I.e. small sample size bad, but a tighter range good)

    • @gopsda
      @gopsda 7 months ago

      I had the same thought. What if we apply a 'leave one out' policy while computing the regularized mean?
      (cut and paste from my previous comment)
      When we compute the weighted average in partial pooling, we can also leave the shop's own samples out of the pooled average (kind of a leave-one-out method). That way we only allow in information from outside the sample.

  • @MyMy-tv7fd
    @MyMy-tv7fd 2 years ago

    overall, not really, empirical data is data, outliers and all