Pooled-Variance t Tests and Confidence Intervals: An Example

  • Published: Oct 23, 2024

Comments • 30

  • @guillaumegiroux9425
    @guillaumegiroux9425 6 years ago +10

    Dear M. Jbstatistics,
    I've been struggling with a problem I had no hope of solving, and your resources helped me solve it.
    I wanted to tell you:
    A BIG THANK YOU
    You're the boss! Big thanks

  • @ege9069
    @ege9069 6 years ago +1

    I just ended up here before my final exam in Statistics 1. I just wanted to thank you with all my heart. Thanks and greetings from Istanbul!

    • @jbstatistics
      @jbstatistics 6 years ago

      You are very welcome! I hope your exam went well!

    • @ege9069
      @ege9069 6 years ago

      It went really well, sir. Thanks a lot.

  • @timcrowley4191
    @timcrowley4191 5 years ago +2

    Thank you for the video - a fantastic explanation of pooled variance t-tests - this has really helped me!

  • @jbstatistics
    @jbstatistics 11 years ago +6

    There is no real advantage to doing a pooled-variance t test as a regression. You get exactly the same information using both methods. I made that video to illustrate 2 things:
    1) As a brief intro to including categorical variables in regression, which can be very useful in a multiple regression setting.
    2) To illustrate the relationship between the two methods, which may help students understand the model, the assumptions, and the proper interpretation of results.

  • @jbstatistics
    @jbstatistics 11 years ago +1

    There is no need to do it as a regression. My "pooled-variance t test as a regression" video is part of my regression playlist, and it's a topic best discussed in regression. In regression, it's part of the bigger picture of including categorical explanatory variables in regression analysis. We may wish to include both quantitative and categorical variables as explanatory variables in a regression, and we include categorical variables by declaring appropriate indicator variables.

  • @stefanfarier7384
    @stefanfarier7384 1 year ago

    This is so useful. Thank you!

  • @jbstatistics
    @jbstatistics 11 years ago

    Although one can do a pooled-variance t test as a regression (and I have a video outlining that), here I've simply expressed this as an ordinary test of a difference in means between two groups. The number of groups has nothing to do with the sample size.
    If we did this as a regression, and coded our groups as X=0 and X=1, then values of X between 0 and 1 would be completely meaningless. The explanatory variable is categorical, and it does not make sense to discuss values between 0 and 1.
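
The equivalence described in these replies can be checked numerically. The following is a minimal sketch, not the author's code: the two groups are invented data, and all variable names are hypothetical. It computes the pooled-variance t statistic directly, then fits a least-squares line to the same observations with the groups coded as X=0 and X=1.

```python
import math
from statistics import mean, variance

# Hypothetical data, invented for illustration (not from the video)
group0 = [12.1, 9.8, 11.4, 10.6, 13.0]
group1 = [14.2, 12.9, 15.1, 13.7, 16.0, 14.4]

n0, n1 = len(group0), len(group1)
df = n0 + n1 - 2

# Pooled-variance t statistic for the difference in means
sp2 = ((n0 - 1) * variance(group0) + (n1 - 1) * variance(group1)) / df
diff = mean(group1) - mean(group0)
t_pooled = diff / math.sqrt(sp2 * (1 / n0 + 1 / n1))

# Simple linear regression of y on the 0/1 indicator x
x = [0.0] * n0 + [1.0] * n1
y = group0 + group1
xbar, ybar = mean(x), mean(y)
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = ybar - slope * xbar

# Residual mean square and the t statistic for the slope
sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
se_slope = math.sqrt(sse / df / sxx)
t_slope = slope / se_slope

print(f"difference in means = {diff:.4f}, slope = {slope:.4f}")
print(f"pooled t = {t_pooled:.4f}, slope t = {t_slope:.4f}")
```

The slope equals the difference in sample means and the two t statistics agree, which is the sense in which both methods give the same information. The fitted line is only evaluated at X=0 and X=1; values in between are never used.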

  • @FLEGA
    @FLEGA 3 years ago

    The t value you got was 1.980 @6:56 - from the table, t(DF = inf) = 1.960, t(DF = 99) = 1.984.
    How do I estimate t(DF=117) from the data given in the table? I know the TI-84 has invT(p, v), where p = area to the left and v = degrees of freedom.

  • @fin-pundit9631
    @fin-pundit9631 4 years ago

    Hi, thanks for this video.
    A quick question:
    1) Suppose we are getting the critical t value manually from a t table instead of using Excel or other software.
    DF = 117, but the highest DF in the table is 1000, the second highest is 120, and the third highest is 60.
    Which one should I choose, or how should I choose, to get the critical t value?
    2) I understood how to get the confidence interval.
    Can I say, as an analyst, "95% of sample means of a particular population lie between the lower and upper limits",
    with the analyst taking a risk on the remaining 5% by saying that "5% of sample means are not from the same population"?
    Please clarify.

    • @jbstatistics
      @jbstatistics 4 years ago

      1) It's best to use software. If you're using a table, then either take the conservative approach (go down to 60 DF), linearly interpolate, or say "meh, 120 is close enough to 117 for me". There are pros and cons to each one of those, and an instructor might recommend any of them. Relying on software is best.
      2) No, we can't say that. Keep in mind that interpretations of confidence intervals always relate to the value of *parameters* and never sample statistics.
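
To compare the table-based options in point 1 with a software value, here is a rough sketch using only the Python standard library. The helper functions `t_pdf`, `t_cdf`, and `t_quantile` are hypothetical names written for this illustration (numerical integration plus bisection), not a library API; the table entries 2.000 (60 DF) and 1.980 (120 DF) are the usual three-decimal values for a two-sided 95% interval.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_quantile(p, df):
    """Invert t_cdf by bisection on a wide bracket."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# "Software" value for 117 degrees of freedom
exact = t_quantile(0.975, 117)

# Linear interpolation between the 60 DF and 120 DF table rows
t60, t120 = 2.000, 1.980
interp = t60 + (117 - 60) / (120 - 60) * (t120 - t60)

print(f"exact: {exact:.4f}  interpolated: {interp:.4f}  "
      f"conservative (60 DF): {t60:.3f}  nearest (120 DF): {t120:.3f}")
```

To three decimals the computed value matches the 1.980 quoted from the video, and the three table-based shortcuts differ from it by at most about 0.02.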

  • @panagiotisgoulas8539
    @panagiotisgoulas8539 11 years ago

    Thanks for the feedback. One more question regarding this. Practically, what does the regression line accomplish in this situation? I mean, you found the same results as your hypothesis test, but why would I want to make a categorical regression line in the first place? Wasn't the whole point of making one so I can estimate the values on the line given some hypothetical x values?

  • @balajikalva188
    @balajikalva188 4 years ago

    One of the assumptions here is "Normally Distributed Populations". I don't get what exactly that means. If we have an unknown population distribution (which may not be normal), take a large enough sample, and plot the sampling distribution, then by the CLT it will come close to a normal distribution as the sample size increases. Now the sampling distribution is normal but the population distribution is not, so my doubt is whether I can actually apply everything said in this video to find the confidence interval. Someone please help.

  • @panagiotisgoulas8539
    @panagiotisgoulas8539 11 years ago

    OK, the whole comment was meant for this video: The Pooled Variance t Test as a Regression. Sorry for the trouble; I've downloaded the videos since it's easier to pause and go back, hence the mix-up. Now, is it possible to give me a better understanding of why we would use the regression line, since, as you mentioned yourself, the points between 0 and 1 don't express anything? Besides what you mentioned in that video, is there any other practical meaning of plotting the regression? Sorry for the queries.

  • @thomaswoodall4505
    @thomaswoodall4505 1 year ago

    How did you find this p-value in R? Is it not 1-pt(4.267,117), then doubled?

    • @jbstatistics
      @jbstatistics 1 year ago

      Yes, that'll do it. (It'll also be part of the t.test output, if that's being used.)
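
For readers without R at hand, the same two-sided calculation, 2 * (1 - pt(4.267, 117)), can be approximated with the Python standard library. This is only a sketch: `t_pdf` and `t_cdf` are hypothetical helpers written for this illustration, and the CDF is obtained by numerical integration rather than by a library routine.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

# Test statistic and degrees of freedom quoted in the question above
t_stat, df = 4.267, 117

# Upper-tail area, doubled for a two-sided alternative
p_two_sided = 2 * (1 - t_cdf(t_stat, df))
print(f"two-sided p-value: {p_two_sided:.2e}")
```

The result is far below 0.001, so the doubling step matters little for the conclusion here, but it is what makes the p-value two-sided.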

  • @panagiotisgoulas8539
    @panagiotisgoulas8539 11 years ago

    OK, I think overall I understand what you did. I have the following questions. Basically you quantified X and then made your regression line. But is this regression line trustworthy? I mean, it's basically made from only 2 x values, so my sample is really small. Or, because these x values come from a bigger initial sample, do we not mind? Second question: what do the other x values between 0 and 1 express in this example when you made your regression line?

  • @justsomegirlwithoutamustac5837

    Since the sample size is large here, why don't we use a Z statistic instead of a T statistic?

    • @jbstatistics
      @jbstatistics 1 year ago +1

      When sampling from normally distributed populations, the difference between z tests and t tests depends on whether the population standard deviations are known or not. Since the population standard deviations are pretty much never known, in real world situations tests like these are done as t tests. Sure, the t distribution gets very close to the standard normal distribution as the degrees of freedom increase, but that doesn't mean we should just jump to the approximation when it works reasonably well. If it's a t test, it's a t test, regardless of sample size.
      Your instructor (and other sources, including some texts) may say something different, so if you are in a stats course listen to your instructor to find out what you should do in your particular course.
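
How close the t distribution gets to the standard normal can be seen by comparing 97.5th percentiles at a few degrees of freedom. This sketch uses `statistics.NormalDist` from the Python standard library for the z value; `t_pdf`, `t_cdf`, and `t_quantile` are hypothetical helpers written for this illustration, and the degrees of freedom other than 117 are arbitrary choices.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_quantile(p, df):
    """Invert t_cdf by bisection on a wide bracket."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Standard normal critical value for a two-sided 95% procedure
z = NormalDist().inv_cdf(0.975)

# t critical values at a few (arbitrarily chosen) degrees of freedom
crit = {df: t_quantile(0.975, df) for df in (5, 30, 117, 1000)}

print(f"z: {z:.4f}")
for df, value in crit.items():
    print(f"t, {df:>4} DF: {value:.4f}  (excess over z: {value - z:.4f})")
```

The t critical value is always a little larger than z and the gap shrinks as the degrees of freedom grow, which is the approximation the reply refers to. At 117 DF the gap is about 0.02.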

    • @justsomegirlwithoutamustac5837
      @justsomegirlwithoutamustac5837 1 year ago

      @@jbstatistics Thank you so much for your reply!

  • @bornhere13
    @bornhere13 4 years ago

    As mentioned, this is equivalent to regressing the outcome on a dummy variable. But in a regression context, we also assume equal variances. Correct?
    Also, is there an advantage to conducting a t-test in this fashion? I would think it is because you can invoke the Welch procedure. What least-squares regression methods are available when group variances are unequal?
    Thank you for any insight. I hope my questions make sense!

  • @mamou8759
    @mamou8759 5 years ago

    Instead of the t value, I think you are looking at the t table. Why is the t value 1.980 when the t table for 95% gives 1.984 for DF 100 and 1.96 for infinite DF? I was going to use 1.96, but can someone tell me why it is 1.980?

    • @jbstatistics
      @jbstatistics 5 years ago +1

      I used software to get the actual value that we need, the one corresponding to 117 degrees of freedom. If you drop down to 100 DF in a table, you'll end up with a slightly larger value.

    • @mamou8759
      @mamou8759 5 years ago

      @@jbstatistics Ohhh OK, thank you very much for the fast response

  • @oguzhanserce4480
    @oguzhanserce4480 6 years ago

    Is this a paired t test?

    • @jbstatistics
      @jbstatistics 6 years ago

      No. I work through an example of a paired t test here: ruclips.net/video/upc4zN_-YFM/видео.html

    • @oguzhanserce4480
      @oguzhanserce4480 6 years ago

      I am confused by these terms. There is a t test. The t test has subtopics, paired and unpaired, for two samples. Does the paired t test have subtopics, pooled and non-pooled?

  • @IamRayz
    @IamRayz 4 years ago

    How the fuck did you calculate s1 and s2? That's all I am looking for..
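
In a pooled-variance setting, s1 and s2 conventionally denote the sample standard deviations of the two groups, each computed with an n - 1 divisor. Assuming that is how they were obtained in the video (the raw data are not shown in this thread), the sketch below computes them and the pooled standard deviation for two invented samples; the data and variable names are hypothetical.

```python
import math
from statistics import stdev

# Hypothetical samples, invented for illustration (not from the video)
sample1 = [4.1, 5.3, 3.8, 4.9, 5.6, 4.4]
sample2 = [6.2, 5.7, 7.1, 6.6, 5.9]

n1, n2 = len(sample1), len(sample2)

# Sample standard deviations (n - 1 in the denominator)
s1 = stdev(sample1)
s2 = stdev(sample2)

# Pooled variance: a weighted average of the two sample variances,
# with weights equal to each group's degrees of freedom
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
sp = math.sqrt(sp2)

print(f"s1 = {s1:.4f}, s2 = {s2:.4f}, pooled sd = {sp:.4f}")
```

Because the pooled variance is a weighted average, the pooled standard deviation always lies between s1 and s2.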