Regression diagnostics and analysis workflow

  • Published: 21 Aug 2024

Comments • 37

  • @BrinderSadler 2 months ago

    A very informative video that is clear and uses examples so that viewers can better follow. Thank you.

    • @mronkko 2 months ago

      You are welcome!

  • @THEPSYCHOTIC 4 months ago

    I have been trying to find workflow videos on regression analysis for a while now, and this is the first (and only) one that I found. It helped me immensely, thank you.

    • @mronkko 4 months ago +1

      You are welcome. It is surprising that very few people teach how to actually use the analyses in empirical research practice.

    • @THEPSYCHOTIC 4 months ago

      @mronkko That's true. Most videos cover only the interpretation of results or focus on, say, one part of the analysis, but no one covers the whole process in a single video with a single dataset.
      Just an idea - You could maybe consider doing a workflow series focusing on how to do analysis with different combinations of explanatory/response variables? Let's say one categorical explanatory variable, 1 exp and 1 quantitative, 2 categorical exp var, and so on. And the same logic with explanatory - quantitative vs qualitative. I'm not sure if you've done it already, but it'd be so so helpful!
      Thanks again, keep up the good work. I wish you good luck!

  • @newtonocharimenyenya2458 3 years ago +2

    A Great Piece. Simple to understand.

    • @mronkko 3 years ago

      Glad you think so!

  • @magnusjensen5867 3 years ago +2

    Best explanation I’ve come across on RUclips! Keep up the good work

    • @mronkko 3 years ago

      Glad it helped!

  • @Youtuube304s 3 months ago

    Subscribed. Very good

    • @mronkko 3 months ago +1

      You are welcome.

  • @bezaeshetu5454 2 years ago

    Thank you for the nice and clear explanation.

    • @mronkko 2 years ago

      You are welcome!

  • @whx2044 3 years ago

    Thank you for teaching!

    • @mronkko 3 years ago

      You are welcome.

  • @newtonocharimenyenya2458 3 years ago

    A very Great piece.

  • @harijha6279 1 year ago

    best explanation

    • @mronkko 1 year ago

      Good that you liked it!

  • @rutwikkadane2409 3 years ago

    Thanks for the explanation!

    • @mronkko 3 years ago

      Glad it was helpful!

  • @faemillongo6839 2 years ago

    Thanks. So clear

    • @mronkko 2 years ago

      Happy that you find it helpful. The lack of reporting that regression diagnostics were done is a big problem in published research. And this would be so easy to fix. Pay attention to your model assumptions and justify them.

  • @ltang 1 hour ago

    Around 7:49, are farmers less prestigious than the model predicted, or more? What does sitting below the y=x line mean?

    • @mronkko 1 hour ago +1

      They are more prestigious. Check the residual on the y-axis. Anything above zero is respected more than what the model predicts.

    • @ltang 55 minutes ago

      @mronkko So being below the y=x line just means that, theoretically, at that percentile we would expect the residual to be even higher? What does the theoretical percentile mean?
      Is it just based on rank?
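
A minimal sketch of how the x-axis of a normal QQ plot can be built, since the question above asks whether the theoretical percentile is based on rank. It assumes a normal reference distribution and uses simulated residuals; the plot in the video may use a different reference distribution or a different kind of residual, so treat the names and numbers here as illustrative only.

```r
# Simulated residuals stand in for the model residuals discussed in the video.
set.seed(1)
res <- rnorm(20)

n <- length(res)
ranks <- rank(res)                  # 1 = smallest residual

# Plotting positions: the share of a normal distribution expected to fall
# below the observation with a given rank. Only the rank enters here.
p <- ppoints(n)[ranks]
theoretical <- qnorm(p)             # theoretical quantiles (x-axis)
observed <- as.numeric(scale(res))  # standardized residuals (y-axis)

qq <- data.frame(rank = ranks, p = p, theoretical, observed)
qq <- qq[order(qq$rank), ]

# Positive gap: the point sits above the y = x line, so the residual is larger
# than a normal distribution would produce at that rank. Negative gap: below
# the line, so the residual is smaller than expected at that rank.
qq$gap <- qq$observed - qq$theoretical
print(head(qq, 3))
```

Under these assumptions the theoretical value depends only on the rank of the observation and the sample size, while the vertical position depends on the actual size of the residual. A point can therefore have a positive residual and still sit below the line.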

  • @statistikochspss-hjalpen8335 1 year ago

    Great video.
    My question is what to do when ln transformation doesn't help?
    Imagine a regression with only Likert scale variables (1-5): customer satisfaction as the dependent variable, and product quality and customer service as independent variables. Most customers score 4 or 5 on all the variables. Almost all of the MLR assumptions are not met. How should I approach the problem?
    I read about PLS being an alternative instead of OLS, but my coefficients are almost identical with both OLS and PLS (don't know if it's because of a fairly big dataset, n=8000).

    • @mronkko 1 year ago

      If your scales are poorly calibrated so that you get just 4s and 5s in a 1-5 scale, then I do not think that there is anything that you can do except to collect better data.
      How to approach the "almost all assumptions are not met" problem: I would start by looking at one specific assumption and what you can do about it. For example, if the relationships are not linear, then I would start thinking about using nonlinear functional forms.
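
One way to check whether a different functional form helps could look like the sketch below. The data are simulated with a logarithmic relationship chosen purely for illustration, and `curvature()` is a hypothetical helper, not something from the video or from any package.

```r
set.seed(2)
x <- runif(200, 1, 100)
y <- 3 + 2 * log(x) + rnorm(200, sd = 0.3)  # simulated: true form is logarithmic

fit_linear <- lm(y ~ x)        # misspecified straight line
fit_log    <- lm(y ~ log(x))   # candidate nonlinear functional form

# Hypothetical helper: regress the residuals on a quadratic in the fitted
# values. A large R-squared means the residuals still contain a curve,
# which suggests the functional form is wrong.
curvature <- function(fit) {
  f <- fitted(fit)
  summary(lm(resid(fit) ~ f + I(f^2)))$r.squared
}

curv <- c(linear = curvature(fit_linear), log = curvature(fit_log))
print(curv)
```

The same comparison can be done by eye with a residuals vs fitted plot for each model; the helper only puts a rough number on the leftover curvature.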

    • @statistikochspss-hjalpen8335 1 year ago

      @mronkko Thank you for taking the time to respond. The data is real and based on real customers. The satisfaction metric (dependent variable) is already well established in the industry. If I'm interpreting my normal probability plot correctly (the y axis shows percent and the x axis shows the residual), about 7% of the observations are off the line. The residuals go from minus 10 to positive 5.
      In the residuals vs fits plot, the residuals slope downwards as the fitted value increases.

    • @mronkko 1 year ago

      @statistikochspss-hjalpen8335 If the residuals slope downward, then you might have nonlinearity and you need to consider other functional forms.
      The fact that a measure is well established does not necessarily mean that the data are good. For example, if you want to assess the effect of a person's height on their weight but only measure people between 180 and 181 cm, then a normal measuring tape would not suffice because it is not precise enough. The same can happen in your data: if you have little variation in satisfaction, you might need a measure that is calibrated differently. I think I talk about measurement calibration in one of the measurement presentations, but I am not 100% sure about that.
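
The height example above can be simulated roughly as follows. Everything here is an assumption made for illustration: the height-weight relationship is invented, and rounding to the nearest centimetre stands in for the limited precision of a measuring tape.

```r
set.seed(3)
n <- 500
height_true <- runif(n, 180, 181)                     # cm, very narrow range
weight <- 0.9 * height_true - 85 + rnorm(n, sd = 5)   # invented relationship

height_tape <- round(height_true)                     # tape read to the nearest cm
print(table(height_tape))                             # how many distinct readings remain

# Share of the variance in the recorded heights that reflects true differences
precision <- cor(height_true, height_tape)^2
print(precision)

# Slope estimates and standard errors with the exact and the coarse measure
print(coef(summary(lm(weight ~ height_true)))["height_true", ])
print(coef(summary(lm(weight ~ height_tape)))["height_tape", ])
```

With so little real variation, the coarse instrument collapses almost everyone into two readings. A 1-5 scale where nearly all answers are 4 or 5 behaves in a similar way.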

  • @zwan1886 2 years ago

    In your AV plots around 15:00, isn't it showing that the women regressor doesn't add anything to the model?

    • @mronkko 2 years ago +1

      Yes, that is what the model shows. Also, the regression coefficient in the table at 2:58 shows that the effect of women is nonsignificant.
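
For readers wondering why a flat added-variable (AV) plot and a nonsignificant coefficient tell the same story, here is a sketch that builds an AV plot by hand in base R. The data are simulated; the variable names only mimic those mentioned in the thread, and this is not the Prestige dataset or the code used in the video.

```r
set.seed(4)
n <- 100
education <- rnorm(n)
income    <- 0.5 * education + rnorm(n)
women     <- rnorm(n)                        # simulated to be unrelated to the outcome
prestige  <- 2 * education + 1 * income + rnorm(n)

full <- lm(prestige ~ education + income + women)

# Both axes of the AV plot for `women` are residuals: what is left of the
# outcome, and of `women`, after removing the other predictors.
y_res <- resid(lm(prestige ~ education + income))
x_res <- resid(lm(women ~ education + income))
av_fit <- lm(y_res ~ x_res)

plot(x_res, y_res, xlab = "women | others", ylab = "prestige | others")
abline(av_fit)

# The slope of the AV line equals the coefficient of `women` in the full model.
slopes <- c(av_slope   = unname(coef(av_fit)["x_res"]),
            full_model = unname(coef(full)["women"]))
print(slopes)
```

Because the two slopes coincide, a nearly horizontal line in the AV plot corresponds to a coefficient close to zero in the regression table.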

  • @kar2194 2 years ago

    Hi, thanks for the content! At 3:09 you said you have a video on the regression coefficient. I can't find it, and I would like to check it out :)

    • @mronkko 2 years ago +1

      Good question. The videos are from a course that I run and I have organized them as RUclips playlists. This video is from the third study unit and the video that I refer to is from the second unit:
      ruclips.net/video/kKE1-iGiywk/видео.html

    • @kar2194 2 years ago

      @mronkko Thanks!

  • @auddssey 1 year ago

    I want to see the R code for the residual vs leverage plot, and how the occupation outliers appear :-)

    • @mronkko 1 year ago

      The slides are linked in the video description and contain some R code in the slide notes
      library(car)
      data(Prestige)
      reg1
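
The snippet above is cut off, so the author's actual code is not reproduced here. As a stand-in, the following base R sketch shows one way a residual vs leverage plot with labelled outliers could be drawn. It uses simulated data with made-up row names (`occupation_1`, ...) and one deliberately planted unusual observation; none of it comes from the slides.

```r
set.seed(5)
n <- 50
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
x[1] <- 6; y[1] <- 0          # planted high-leverage, poorly fitted observation
dat <- data.frame(x, y, row.names = paste0("occupation_", seq_len(n)))

fit <- lm(y ~ x, data = dat)

lev  <- hatvalues(fit)        # leverage of each observation
rstd <- rstandard(fit)        # standardized residuals
cook <- cooks.distance(fit)   # influence on the fitted coefficients

plot(lev, rstd, xlab = "Leverage", ylab = "Standardized residuals")
abline(h = 0, lty = 2)

# Label the three observations with the largest Cook's distance by row name
flag <- order(cook, decreasing = TRUE)[1:3]
text(lev[flag], rstd[flag], labels = names(cook)[flag], pos = 2)

# Base R can also draw a residuals vs leverage plot directly:
# plot(fit, which = 5)
```

With a real dataset whose row names are occupations, the labels would show which occupations combine high leverage with a large residual.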