Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
I have to say that I really appreciate the fact that this video is bringing me back to 2009 RUclips while teaching me stats. Thanks!
bam! :)
I cannot express how grateful I am for such wonderful videos!
Glad you like them!
Thanks for understanding non-native speakers and for your clear explanation
Thanks!
I am really grateful for having found this site. I just want a simple suggestion as to what type of statistics (Anova, chi-square, etc.) I should use to determine my goal below. I am truly confused yet optimistic for someone generous out there, and I would greatly appreciate any additional comments or suggestions to clarify or simplify my statement or claim here. Thank you in advance.
"Among the randomly selected senior students in the eight (4 public and 4 private) US south-west states (California, Arizona, New Mexico, and Texas), their responses are unanimous (strongly agree, agree, neutral, disagree, strongly disagree) as regards participating in home gardening rather than school gardening."
Look for 'Chi-square test of independence' and 'Friedman's test'
Thanks so much for posting this! We are going into two way ANOVA, I hope this helps
Good luck!!
wow! this is really good! we need more for means and effects parametrization!
Thanks!
Great video. Thank you for sharing!
Thank you!
Here the p-value for the mean of the data is equal to the number of parameters for the equation of the mean and the p-value for the fit of the data is equal to the number of parameters for the fit of the data.
How does this relate to the typical notion of p-values (probability we should not reject null hypothesis)?
Also once F has been computed, is F itself a p-value or a value that we need to compute a p-value?
You might want to start with the first video in this series (this is "part 2") because that will answer your questions about how the p-values are computed and what they mean: ruclips.net/video/nk2CQITm_eo/видео.html
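On the last question: F is a test statistic, not a p-value. The p-value is how often an F at least that large would turn up if the null hypothesis were true. The "p" terms in the F equation are parameter counts, which is a different thing from a p-value. Here is a minimal sketch in Python under several assumptions: the data are made up (not from the video), the names p_mean and p_fit are my own labels for the parameter counts, and an exact permutation test is swapped in for the usual F-distribution lookup so that only the standard library is needed. The permutation p-value will generally not equal the one from an F-distribution.

```python
from itertools import combinations

# Hypothetical measurements (not from the video): first three are one group,
# last three are the other.
y = [1.8, 2.2, 2.6, 3.0, 3.6, 4.2]
observed_group_b = (3, 4, 5)  # indices of the second group

def f_statistic(values, group_b):
    """F ratio comparing a two-mean fit to a single overall mean."""
    n = len(values)
    a = [v for i, v in enumerate(values) if i not in group_b]
    b = [values[i] for i in group_b]
    grand_mean = sum(values) / n
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ss_mean = sum((v - grand_mean) ** 2 for v in values)
    ss_fit = sum((v - mean_a) ** 2 for v in a) + sum((v - mean_b) ** 2 for v in b)
    p_mean, p_fit = 1, 2  # parameter counts, NOT p-values
    return ((ss_mean - ss_fit) / (p_fit - p_mean)) / (ss_fit / (n - p_fit))

f_obs = f_statistic(y, observed_group_b)

# Null distribution: F for every possible way of labelling 3 of the 6 values
# as "group b". The p-value is the fraction of these at least as large as f_obs.
all_f = [f_statistic(y, g) for g in combinations(range(len(y)), 3)]
p_value = sum(1 for f in all_f if f >= f_obs - 1e-9) / len(all_f)

print(f"F = {f_obs:.4f}, permutation p-value = {p_value:.3f}")
```

So F is the number you compute first, and the p-value is a second step that asks how unusual that F is.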
Hii, could you do videos on the different GLMs (Poisson, binomial, log, Tweedie) and also the link function and some R examples plsss? I loveee your videos🙌🏼
I'll keep those topics in mind.
When I hear your voice, I want to make chemometric papers with you
:)
Thanks
Hooray! Thank you so much for supporting StatQuest! TRIPLE BAM! :)
Woah I was like that’s weird a video … and then it was like - that’s a lot of videos BAMMMMM
yeah lol, idk if it's a re-upload though
It's a re-upload. For some reason RUclips put the originals behind a paywall...so I re-uploaded them so that they would still be free.
@@statquest oh ok, interesting 🤔, I thought you manually did that, lol. Thanks 👍 for the information and clarification. I'm curious whether you can access your own videos, or do you have to pay RUclips to watch them as well?
@@computerconcepts3352 I can watch them, so at first I had no idea what was going on. The content worked for me, but no one else. It was very confusing and stressful because I got a lot of negative comments. Ugh. A bad day in StatLand. :(
@@statquest oofty doof oof oof, I guess uploading videos to other platforms could help reduce damages against something like this but then I really hate how RUclips does random things like this. I remember one of my videos got deleted for the wrong reason and I lost a bunch of views. Fixing RUclips is actually one of my eventual goals and me watching your RUclips videos is my first step towards that goal 👍
Hi, thanks for the awesome video ... I have a question, I still cannot get my head around the concept of this matrix... you have the formula y = 1*2.2 + 0*3.6 + residual of the value ...
I thought that only the residuals go into the calculation ... how do we get from this bunch of y = equations, one below the other, to the final calculation? How does it fit in there? I am missing a bridge there. Also, sometimes there is a residual at the end of the y equation and other times there is not. Is this notation used later when calculating with the residuals?
If I understand correctly, SS(fit) = the sum of all these y equations together .... and I don't get why we need this matrix there if we only do the sum of residuals anyway ... or, if I follow the equation literally, do I have the sum of the squared residuals (nobody wrote that it is squared, but I suppose so) plus the mean (which is on or off according to the matrix)??
Thanks a lot for your response
What time point, minutes and seconds, are you asking about?
It is spread between 5:00-7:00. ... where you start to show the y = equations and you end up with a simple equation at 7:00 and then you move on. I am confused about how this fits together
@@pavoldzama4641 First, when we use the equations to calculate the exact y-axis value for each of the known data points, we include the residuals because the mean value + the residual = the exact y-axis coordinate for the original data value. However, when we use the equations to make predictions - say like someone asks us to predict gene expression for a new control mouse - and we don't actually know what that value is - then we leave the residual off the equation, since we don't know what the difference between the mean and the "true" value is. Thus, when the equation is used with known values for the y-axis, we include the residual. When the equation is used to make predictions (and we don't know the y-axis value) we leave off the residual.
Now, the reason we keep track of the equations in a matrix is that we can change the coefficients in one place and see how they affect the residuals and thus, how they minimize SS(fit).
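The reply above can be sketched in a few lines of Python. This is only an illustration under assumptions: the six expression values are made up (chosen so the group means match the 2.2 and 3.6 quoted in the question), and the helper name ss_residuals is hypothetical. It shows that fitted value + residual reproduces each known data point, that a prediction leaves the residual off, and that moving a coefficient away from the group mean makes the sum of squared residuals larger.

```python
# Hypothetical gene-expression values (not from the video), chosen so that
# the group means are 2.2 (control) and 3.6 (mutant).
control = [1.8, 2.2, 2.6]
mutant = [3.0, 3.6, 4.2]
y = control + mutant

# Design matrix: column 1 switches the control mean on, column 2 the mutant mean.
X = [[1, 0]] * len(control) + [[0, 1]] * len(mutant)

def ss_residuals(coefs):
    """Sum of squared residuals for a given pair of coefficients."""
    fitted = [row[0] * coefs[0] + row[1] * coefs[1] for row in X]
    return sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))

# For this design matrix the least-squares coefficients are the group means.
means = [sum(control) / len(control), sum(mutant) / len(mutant)]
fitted = [row[0] * means[0] + row[1] * means[1] for row in X]
residuals = [obs - fit for obs, fit in zip(y, fitted)]

# Known data: fitted value + residual gives back each original observation.
reconstructed = [fit + res for fit, res in zip(fitted, residuals)]

# Prediction for a new control mouse: same equation, residual left off.
prediction_new_control = 1 * means[0] + 0 * means[1]

# Changing a coefficient in one place changes every residual it touches.
ss_fit = ss_residuals(means)
ss_shifted = ss_residuals([means[0] + 0.5, means[1]])

print(f"SS(fit) at the group means: {ss_fit:.2f}")
print(f"SS with the control coefficient shifted by 0.5: {ss_shifted:.2f}")
print(f"Predicted value for a new control mouse: {prediction_new_control:.1f}")
```

The matrix itself never enters the sum; it only records which coefficient applies to which data point, so the residuals can be recomputed whenever the coefficients change.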
@@statquest thanks for explaining, I think maybe we misunderstood each other, I just could not connect the dots .... I think I get it now after watching again, you use this formula for the fit and this should not include the residual as we calculate the residuals based on this... then I got pretty confused by those matrices but your follow up video saved the day! Great work, keep it up
Hello Josh, it's me again... I had a question concerning ANOVA. I still don't understand why we have to use an ANOVA when we have to compare a reference condition to several drug treatments A-B-C done in the same experiment. For instance, I am only interested in the effect of drug A or drug B compared to a control condition. So I only need a t-test. Why should I have to do an ANOVA?
An ANOVA is just a generalized t-test. Technically, you can do an ANOVA with just 2 conditions, Drug A vs a control, and you'll get the exact same results as a t-test (which is why people call it a t-test).
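A quick way to check the "exact same results" claim is to compute both statistics by hand. The sketch below is a minimal illustration, assuming made-up measurements (not from the video) and the equal-variance (Student's) form of the t-test; the helper names are hypothetical. With two groups, the ANOVA F statistic equals the square of the t statistic.

```python
from math import sqrt

# Hypothetical measurements (not from the video).
control = [1.8, 2.2, 2.6]
drug_a = [3.0, 3.6, 4.2]

def mean(values):
    return sum(values) / len(values)

def ss_within(values):
    """Sum of squared differences between each value and its group mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

n1, n2 = len(control), len(drug_a)
df_within = n1 + n2 - 2
pooled_var = (ss_within(control) + ss_within(drug_a)) / df_within

# Equal-variance (Student's) two-sample t statistic.
t_stat = (mean(drug_a) - mean(control)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-way ANOVA F statistic for the same two groups.
grand_mean = mean(control + drug_a)
ss_between = (n1 * (mean(control) - grand_mean) ** 2
              + n2 * (mean(drug_a) - grand_mean) ** 2)
df_between = 1  # number of groups - 1
f_stat = (ss_between / df_between) / pooled_var

print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

Because F with 1 and n - 2 degrees of freedom is the distribution of a squared t with n - 2 degrees of freedom, the two-sided p-values match as well. The equivalence is with the equal-variance t-test, not Welch's version.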
First, thank you for your awesome videos! Is there any advantage of General Linear Model over a simple ANOVA, for example?
ANOVA is just a specific type of General Linear Model.
In the end, doesn't the second design matrix assume that the first column is the baseline? Since it's always on (a column of 1s), the results are interpreted with respect to the difference between the two groups rather than their magnitudes.
What time point, minutes and seconds, are you referring to?
@@statquest at the end you present an alternative design matrix that has all 1s in the first column
@@nizogos The answer to your question is "yes". This is explained in the follow up video on design matrices: ruclips.net/video/CqLGvwi-5Pc/видео.html
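To make the "yes" concrete, here is a minimal Python sketch that fits the same data with both design matrices. The data are made up (not from the video) and fit_two_columns is a hypothetical helper that solves the least-squares normal equations for a two-column design matrix. With a column of 1s, the first coefficient comes out as the baseline (control) mean and the second as the difference between the groups.

```python
# Hypothetical gene-expression values (not from the video).
control = [1.8, 2.2, 2.6]
mutant = [3.0, 3.6, 4.2]
y = control + mutant

# Design 1: one on/off column per group, so coefficients are the two means.
X_means = [[1, 0]] * len(control) + [[0, 1]] * len(mutant)
# Design 2: first column always on (baseline), second column on for mutants.
X_offset = [[1, 0]] * len(control) + [[1, 1]] * len(mutant)

def fit_two_columns(X, values):
    """Least-squares coefficients for a two-column design matrix,
    found by solving the 2x2 normal equations (X'X) c = X'y."""
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    e = sum(r[0] * v for r, v in zip(X, values))
    f = sum(r[1] * v for r, v in zip(X, values))
    det = a * d - b * b
    return ((d * e - b * f) / det, (a * f - b * e) / det)

mean_control, mean_mutant = fit_two_columns(X_means, y)
baseline, difference = fit_two_columns(X_offset, y)

# Both designs give the same fitted values, so SS(fit) is identical.
fitted_means = [r[0] * mean_control + r[1] * mean_mutant for r in X_means]
fitted_offset = [r[0] * baseline + r[1] * difference for r in X_offset]

print(f"Design 1: control mean = {mean_control:.1f}, mutant mean = {mean_mutant:.1f}")
print(f"Design 2: baseline = {baseline:.1f}, difference = {difference:.1f}")
```

The fit is the same either way; only the meaning of the coefficients changes, which is why the second design reads as "baseline plus difference".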
Pardon me if this is a dumb question, but I am having a hard time wrapping my head around the idea! Why would one prefer to do a linear regression for a t-test? How does it make this better?
It's actually the exact same test - there are just two ways to do it. The advantage of doing it this way (using regression) is that we have more flexibility - we can easily generalize it into an ANOVA test or even something more fancy.
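As a rough illustration of "the exact same test", the sketch below regresses the measurements on a 0/1 group indicator and compares the t statistic for the slope with the classic equal-variance two-sample t statistic. It assumes made-up data (not from the video) and uses only the textbook simple-regression formulas.

```python
from math import sqrt

# Hypothetical measurements (not from the video).
control = [1.8, 2.2, 2.6]
mutant = [3.0, 3.6, 4.2]

# Regression view: x is 0 for control and 1 for mutant.
x = [0] * len(control) + [1] * len(mutant)
y = control + mutant
n = len(y)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = s_xy / s_xx                  # difference between the group means
intercept = y_bar - slope * x_bar    # mean of the control group

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
mse = sum(r ** 2 for r in residuals) / (n - 2)
t_slope = slope / sqrt(mse / s_xx)   # t statistic for "slope = 0"

# Classic view: equal-variance two-sample t statistic.
mean_c = sum(control) / len(control)
mean_m = sum(mutant) / len(mutant)
ss_c = sum((v - mean_c) ** 2 for v in control)
ss_m = sum((v - mean_m) ** 2 for v in mutant)
pooled_var = (ss_c + ss_m) / (n - 2)
t_classic = (mean_m - mean_c) / sqrt(pooled_var * (1 / len(control) + 1 / len(mutant)))

print(f"regression t = {t_slope:.4f}, two-sample t = {t_classic:.4f}")
```

The flexibility comes from the indicator column: adding more 0/1 columns for more groups turns the same machinery into an ANOVA.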