Lecture21 (Data2Decision) Leverage in Regression

  • Published: 12 Dec 2024

Comments • 14

  • @looollol7910
    @looollol7910 5 years ago +2

    Thank you so much!!! You are very clear and helpful!!

  • @zikviewsdotcom5827
    @zikviewsdotcom5827 6 years ago +2

    Count Dooku is the best statistics teacher

  • @muonneutrino
    @muonneutrino 4 years ago

    Great lectures! Very clear. Too bad I never learned outlier detection in regression models, even though I studied for a B.Sc. in computer engineering. I have some questions and hope you can answer them. 1) Why should residuals be normally distributed? 2) In the Williams graph, why do we use twice the mean as a threshold? I would expect to see a multiple of stdev(lev * n/p). I watched the following lecture and saw that you calculated Cook's distance as well, but you didn't use it for filtering outliers, or did I miss it? Thank you so much for this quality content!

    • @muonneutrino
      @muonneutrino 4 years ago

      Oh, sorry, you have a dedicated lecture about the distribution of residuals. So it's pretty much empirical, as I understood it.

    • @chrismack783
      @chrismack783 4 years ago +1

      1) Residuals are often non-normally distributed, but sometimes they are normal. You should always check whether the assumption of normality makes a difference in your statistical analysis. 2) The choice of twice the average leverage as a threshold is arbitrary, but it is a convenient rule of thumb.
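A minimal sketch of the rule of thumb in the reply above, assuming a straight-line fit with an intercept (so p = 2 fitted parameters and the average leverage is p/n). The function names and the data are made up for illustration and are not taken from the lecture.

```python
def leverages(x):
    """Leverage h_ii of each point for a straight-line fit y = b0 + b1*x."""
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    # For a line with intercept: h_ii = 1/n + (x_i - x_mean)^2 / Sxx
    return [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]


def high_leverage_points(x, p=2, factor=2.0):
    """Indices whose leverage exceeds factor * (p / n).

    p / n is the average leverage; factor=2.0 is the arbitrary but
    convenient threshold mentioned in the reply.
    """
    h = leverages(x)
    threshold = factor * p / len(x)
    return [i for i, hi in enumerate(h) if hi > threshold]


# Made-up data: the last x value sits far from the others.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
h = leverages(x)
flagged = high_leverage_points(x)
print("leverages:", [round(hi, 3) for hi in h])
print("flagged indices:", flagged)
```

The leverages always sum to p, which is why p/n is the average and "twice the average" becomes 2p/n. Changing `factor` shows how sensitive the flagged set is to the chosen cutoff.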

    • @muonneutrino
      @muonneutrino 4 years ago

      Chris Mack, thank you, Chris! Just now I saw that you have experience in the semiconductor industry :) What a coincidence! I'm analyzing correlation in CD measured on the wafer in different fields. It looks like the CD distribution is normal. At least, sometimes the Jarque-Bera test confirms it and sometimes not. Sometimes Shapiro-Wilk confirms it, sometimes not. Thank you so much for your great lectures! They are very helpful!
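For reference, a minimal sketch of the Jarque-Bera statistic mentioned above, written with the standard library only. It uses the textbook definition, JB = n/6 * (S^2 + (K - 3)^2 / 4), with an asymptotic chi-square(2) p-value; the function name and the simulated samples are assumptions for illustration, not the commenter's data. Because the p-value is only asymptotic and Shapiro-Wilk is built on a different statistic, the two tests need not agree on modest sample sizes.

```python
import math
import random


def jarque_bera(data):
    """Return (JB statistic, asymptotic p-value) for a sample."""
    n = len(data)
    mean = sum(data) / n
    # Central moments (biased, divided by n), as in the usual JB definition.
    m2 = sum((v - mean) ** 2 for v in data) / n
    m3 = sum((v - mean) ** 3 for v in data) / n
    m4 = sum((v - mean) ** 4 for v in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of chi-square with 2 degrees of freedom is exp(-x/2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


rng = random.Random(0)  # fixed seed so the run is repeatable
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]

jb_normal, p_normal = jarque_bera(normal_sample)
jb_skewed, p_skewed = jarque_bera(skewed_sample)
print("normal sample: JB = %.3f, p = %.3f" % (jb_normal, p_normal))
print("skewed sample: JB = %.3f, p = %.3g" % (jb_skewed, p_skewed))
```

Rerunning with different seeds or smaller samples is a quick way to see how often a truly normal sample gets rejected by chance.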

    • @chrismack783
      @chrismack783 4 years ago

      @muonneutrino Good luck - I've worked a lot in mapping CD across the wafer.

    • @muonneutrino
      @muonneutrino 4 years ago

      Chris Mack, very interesting. While there are many factors contributing to CD, you can tune the mask to compensate for them, or at least most of them. In your experience, is there good correlation between different fields if CD is measured at the same locations?

  • @krishnaiyer2556
    @krishnaiyer2556 3 years ago

    Sir, what is the difference between multicollinearity and leverage, versus perfect collinearity in the x variables?

  • @krishnaiyer2556
    @krishnaiyer2556 3 years ago

    Cov can be negative, so do we take absolute values?

  • @massimo8740
    @massimo8740 3 years ago

    Thanks!!!

  • @krishnaiyer2556
    @krishnaiyer2556 3 years ago

    Leverage is defined as a distance between x's, but the formula says cov(y-hat, actual) / var(y). Why so?
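The two descriptions can be connected with a short derivation. This is a sketch that assumes the formula in question is the standard hat-matrix one and that the observations are independent with constant variance $\sigma^2$; the lecture's own notation may differ.

```latex
% Fitted values are a linear combination of the observations:
\hat{y} = H y, \qquad H = X (X^{\top} X)^{-1} X^{\top}, \qquad
\hat{y}_i = \sum_{j} h_{ij}\, y_j .

% With independent observations of equal variance \sigma^2:
\operatorname{cov}(\hat{y}_i, y_i)
  = \sum_{j} h_{ij} \operatorname{cov}(y_j, y_i)
  = h_{ii}\, \sigma^2
\quad\Longrightarrow\quad
h_{ii} = \frac{\operatorname{cov}(\hat{y}_i, y_i)}{\operatorname{var}(y_i)} .

% For a straight line with intercept, the same h_{ii} is a distance in x:
h_{ii} = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j} (x_j - \bar{x})^2} .

% H is symmetric and idempotent, so this particular covariance ratio
% cannot be negative:
h_{ii} = \sum_{j} h_{ij}^2 \ge 0 .
```

Under those assumptions, the covariance form and the distance form are the same quantity: a point far from $\bar{x}$ has a fitted value that depends strongly on its own observed value.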

  • @empaulstube6947
    @empaulstube6947 4 years ago

    What is basically the difference between Standardized and Studentized residuals?

    • @ChrisMack
      @ChrisMack 4 years ago +1

      See slides 9 and 10: "standardized" is the same as "internally studentized", which is different from "externally studentized".
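The slides are not reproduced here, so the following is only a sketch based on the usual textbook definitions, assuming a straight-line fit with p = 2 parameters. The internally studentized (standardized) residual divides by a scale estimated from all points. The externally studentized residual uses a scale estimated with point i left out. Function names and data are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares line; returns (residuals, leverages)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    lev = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    return residuals, lev


def studentized_residuals(x, y, p=2):
    """Return (internally, externally) studentized residuals."""
    e, h = fit_line(x, y)
    n = len(x)
    sse = sum(ei ** 2 for ei in e)
    s2 = sse / (n - p)  # residual variance using all points
    internal, external = [], []
    for ei, hi in zip(e, h):
        internal.append(ei / math.sqrt(s2 * (1.0 - hi)))
        # Variance with point i deleted, via the standard deletion identity:
        # (n - p - 1) * s2_del = SSE - e_i^2 / (1 - h_ii)
        s2_del = (sse - ei ** 2 / (1.0 - hi)) / (n - p - 1)
        external.append(ei / math.sqrt(s2_del * (1.0 - hi)))
    return internal, external


# Made-up data: the last y value is pulled away from the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]
internal, external = studentized_residuals(x, y)
for i, (r, t) in enumerate(zip(internal, external)):
    print("point %d: internal = %6.2f, external = %6.2f" % (i, r, t))
```

For a point that really is an outlier, the external version is larger in magnitude because the outlier no longer inflates the scale estimate it is compared against.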