Test Normality for ANOVA in R

Поделиться
HTML-код
  • Опубликовано: 21 авг 2024

Комментарии • 8

  • @emredunder9108
    @emredunder9108 4 месяца назад

    This is wrong. It must be checked for each group seperately.

    • @tidydata
      @tidydata  4 месяца назад

      ANOVA uses the same method as linear regressions to check normal distributions: check the normal distribution of the model residuals. Refer to this online article (not written by me): www.theanalysisfactor.com/checking-normality-anova-model/
      Added additional comment: According to that article, you could check the normality for each group as well.

    • @emredunder9108
      @emredunder9108 4 месяца назад

      @@tidydata this is just a blog post. In one point you are right, they coincide but also Anova assumes that the data come from a normally distributed population. So each group must be normal.

    • @tidydata
      @tidydata  4 месяца назад

      @@emredunder9108 Thank you for the response. You need to ask why there is a need for such assumption, namely the assumption of normality. It is needed because we need to calculate the p-value, which is based on the normality of the residuals (i.e., the bell shape of the histogram plot for the residuals). On Wikipedia (en.wikipedia.org/wiki/Analysis_of_variance), it mentions the same thing: Normality - the distributions of the residuals are normal.

    • @emredunder9108
      @emredunder9108 4 месяца назад

      @@tidydata Yes, the assumption is required for the p value calculation. But what happens if your data are not normal in one group? The means can only be normal when the data are normal, but in each group.

    • @tidydata
      @tidydata  4 месяца назад

      @@emredunder9108 Hi, if you read carefully the blog post mentioned earlier (not written by me), it answers your question (full quotes shown below):
      “The distribution of Y within each group is normally distributed.” It’s the same thing as Y|X and in this context, it’s the same as saying the residuals are normally distributed.
      The concept of a residual seems strange in an ANOVA, and often in that context, you’ll hear them called “errors” instead of “residuals.” But they’re the same thing. It’s the distance between the actual value of Y and the mean value of Y for a specific value of X. Those distances have the same distribution as the Ys within that group.