NOTE: When I first made this video, I was thinking about how R-squared relates to Linear Regression, which will not fit a line worse than the mean of the y-axis values. This is because if the values along the x-axis are truly useless in terms of predicting y-axis values, then the slope of the line used to make predictions will be 0, and the intercept will equal the mean. However, it is possible to simply draw a line that fits the data worse than the mean and get a negative R^2.
Support StatQuest by buying my books The StatQuest Illustrated Guide to Machine Learning, The StatQuest Illustrated Guide to Neural Networks and AI, or a Study Guide or Merch!!! statquest.org/statquest-store/
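As a quick numerical check of the note above, here is a minimal sketch (NumPy, with made-up data): when x is pure noise, the least-squares slope comes out near 0, the intercept lands near mean(y), and the fitted line's R^2 is near 0 rather than negative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)              # x carries no information about y
y = rng.normal(loc=5.0, size=1000)

# least-squares fit returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)

pred = slope * x + intercept
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

print(slope)                 # should be near 0
print(intercept - y.mean())  # should be near 0
print(r2)                    # small but not negative for the best-fit line
```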
With enough variables in the data set, it would be easy to create a set of r-squared values so that the cumulative percent "explained by" the different variables goes over 100%. That's why I was never a fan of that terminology. Students think it implies causation when it doesn't. Otherwise, great video.
@@mattkilgore7323 Maybe I should have made it more clear, but if you have a large model with a lot of variables, you don't add together a bunch of individual R-squared values to find the total R-squared. You calculate a single R-squared value for the entire model. In other words, R-squared refers to the model, not the individual variables.
StatQuest with Josh Starmer If you only consider unbiased lines (where the mean of the predicted y's equals the mean of the real y's), then there's no negative R^2.
@@mattkilgore7323 Hi Matt, can you explain the point you're trying to make in a bit more detail?
The phrase "explained by" can be deceptive, as students often think it means "caused by." But this is not what it means in the context of r-squared. Does that help?
So glad this channel exists. It's rare that RUclips videos on stats are this well done
Thanks!
@@statquest You are a lifesaver. I am surprised you don't have more subscribers. I would recommend your channel to my colleagues, thank you so much :)
@@shashankkhare1023 Thank you very much!!! Recommending my channel to your colleagues is the best compliment you can give me. :)
you can watch and learn from Dr. Ami Gates. Her videos are great..
Thank you so much for making this sooooooo clear, I've struggled to understand the meaning of R2 for a week and you just made it clear to me in 10 min.
Bam! :)
You have explained the concept so neatly and clearly (and, most importantly, in an easy manner) that one can get a deeper understanding of it, something that a lot of textbooks / videos / articles fail to do. Keep making such videos!
I started following you 4 months ago, now I'm starting over from the very first video, I'll watch them all and understand everything.
Thank you very much for this content.
Thanks!
I can't believe the simple relationship between R^2 and R was never made clear to me! Amazing as always!
Awesome!!!! Thank you very much.
I also appreciated his comments on the subject, and him sharing his opinions and intuitions.
Just a quick question?
Crazy when you consider they could have just explained it in terms of basic maths. What's X^2? Well... it's just x*x. It's frustrating when instructors don't teach the basics, assuming everyone gets it off rip.
I read a lot on R-squared from different books and articles, but this was a really different and very intuitive approach. Visualization is the best way to understand statistics, and I think most books are lacking there.
Thanks! :)
Your channel is blessing in disguise. Visual aids and the explanations are so smooth and easy to understand. Thank you very much.
Thank you very much! :)
Your videos are so easy to understand, and they also explain the intuition behind the concepts. I really love the way you start the video, unlike other boring lectures.
Thank you so much! :)
Just recently found your channel. These are by FAR the most straight forward explanations I found so far. You sir are a godsend.
Thanks!
I had stats exam coming up and didn't know this particularly well, Thanks for making it much more simpler!
Good luck on the exam! :)
"sniff/weight relationship" debunked by StatQuest. Give this man a Nobel. :)
Thanks! :)
This channel has become my go-to resource for anything stat related.
Bam! :)
@@statquest Love your Bam and your singing
No one could make me understand R Squared in such easy way. Watched many videos. All made it complicated. Thanks.
Hooray!!! I'm glad to hear the video was helpful! :)
The introductions are the cutest thing I have ever seen - the videos are also super duper helpful!
Thank you! :)
Josh you are the best!!! Your every video has been helpful to god knows how many times in my studies. Much much love
Thank you very much! :)
I would rather name this video VERY CLEARLY EXPLAINED. Thank you.
Thank you!
I did not see anyone explain the statistics better than you
God bless you ...
Thank you!
So all this time I spent sniffing rocks to grow bigger was for nothing???
Ha! You made me laugh. :)
haha
only if you are mouse
I love you hahahaha
You should have made a powder out of rocks, that would speed up your growing. Especially if your powder of white color🤣🤣🤣
I have been looking at a variety of stats videos and these are clearly the best. I am so impressed with StatQuest that I renamed my four dogs, "StatQuest," "StatQuest," "StatQuest," and "John Stamos" because, of course...
BAM! :)
I have found a great channel for stats and trix.... BAM! it covers all the areas I want to learn.. Double BAM!! It's indeed clearly explained... Triple BAM!!!
Hooray!
Very beautifully explained. Many thanks to the folks of Genetics Department at the University of North Carolina at Chapel Hill.
love the way you explain things in casual manner
Thank you!
Just impeccable. I don't think any other better illustration exists other than this. Thank you
Thanks!
Could not have even imagined such intuitive explanation of this topic before watching this video. Thanks Josh!
Thank you!
cool! you have cleared all the fogs around r2 in my head once for all. appreciate your explanation!
Glad to help!
This helped me clearly understand R^2. Trying to grasp this from reading a textbook was impossible for me.
bam! :)
Added this to my useful tutorials and math playlists.
Thanks StatQuests.
BAM! :)
"time spent sniffing a rock"! had me cracking😂.... btw thanks Josh for putting such great content up... this channel is my primary source for building my statistics foundations....
Glad you like them!
I never comment, but today is not that day. Thank you so much for this!!!! I am in graduate school and am still struggling to understand these concepts, you're a life saver
Thanks!
Thank you for this easy to understand video :-)
I have two suggestions!
- Time 0:50 -- instead of `strongly related` it would be better to say `strongly linearly related`! We know that `R` can't explain nonlinear relationships (e.g., Y = X^2)!
- Time 10:00 -- instead of `0.7^2 = 0.5` it is better to say `0.7^2 \approx (is approximately equal to) 0.5` ;-)
Interestingly, and little known, but R^2 can be calculated for equations like y = a + b*x^2. That equation makes a curve, which is not linear, but the equation is _linear in its parameters_ (the parameters are 'a' and 'b', not 'x^2'), and that is what makes a "linear model" linear. A linear model doesn't have to result in a straight line, but it must be linear in its parameters. That means you can calculate R^2 for y = a + b*x^2 or even y = a + b*sin(x). Not many people know this, though, since they don't understand what the "linear" in "linear models" actually refers to.
Yes, and in y = a + b*x^2 or y = a + b*sin(x) it is better to say `y` has a linear relationship with `x^2` or `sin(x)`, not `x`!
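To illustrate the linear-in-parameters point from this exchange, here is a sketch (made-up data): fitting y = a + b*x^2 is still ordinary least squares on the transformed feature x^2, so R^2 is computed exactly the same way.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 200)
y = 2.0 + 0.5 * x**2 + rng.normal(scale=0.3, size=x.size)

# the "linear model" is linear in the parameters a and b;
# the feature column happens to be x^2, which is fine
X = np.column_stack([np.ones_like(x), x**2])
a, b = np.linalg.lstsq(X, y, rcond=None)[0]

pred = a + b * x**2
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

print(a, b, r2)  # a near 2, b near 0.5, r2 high
```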
Awesome video! I needed to understand R squared for an important math essay so thank you
Glad it helped!
really boom...I was confused from past 3 days to understand regression value ...now I understand. Thanks
bam! :)
You know what i decided to start watching your videos from the beginnimg .. baaam .. thanks
Awesome! Thank you!
Now I understand the R squared much better! Thank goodness for this video!
Glad it helped!
watching this again, thank you very much. I jumped into more advanced stuff because of your videos. 🙏🙏
Awesome!
aha! we meet again, and I thank you again!!! wow, I wish you were teaching my class!!
Ha! I'm glad my videos are so helpful. :)
So proud of me because I'm watching these videos. Very very goood job thanks 😊 👍 👏
Nice!
a simple concept explained simply. thank you for the straight forward explanation
Thank you very much! :)
Wonderful explanation again. I easily understood the concept. I'm grateful.
Thanks! :)
Was struggling to understand this concept but this video explained everything!
Hooray! :)
You are a rarity ❤️ really love how you explain statistics! Please tell us more 🙏🏻♥️♥️♥️
Thanks! :)
Adding to my previous comment, the R2 value can be negative when the variance around the line is greater than the variance around the mean.
For example, var(mean) = 30 and var(line) = 40.
Then R2 = (30 - 40) / 30 ≈ -0.33.
Such models exist; perhaps those are the worst models.
This is technically correct, but practically speaking, R-squared is always positive because it is used to compare the least squares residuals for the best fitting model to the least squares residuals for the mean, and the best fitting model can't have larger residuals than the mean, otherwise the best fitting model would be the mean. Does that make sense?
Completely agree with you in terms of practicality. It doesn't make sense at all; at the end of the day you want a model that performs better than the base model. My point was just that it can be negative. Nevertheless, I really like your videos. That comment of mine was just to clarify my understanding and to reach out to you.
I was thinking more about the negative R-squared and how it could be used in practice. I mean, like you said, even if your model is terrible, worse than the mean, it still might be nice to quantify how terrible it is - and that's where the negative R-squared could come in handy. It still has the same meaning, except now you're quantifying how much worse your model is than the mean. Interestingly, it still works out even if var(terrible model) is so bad that the R-squared is less than -1. For example, if var(mean) = 50 and var(terrible model) = 100, then R-squared = (50 - 100) / 50 = -1, so "terrible model" is 100% worse than the mean. If var(terrible model) = 150, then R-squared = (50 - 150) / 50 = -2, and now terrible model is 200% worse.
Right, that's my point. From my own experience, I used to train multiple models on a sample dataset and compute their respective R-squared values to choose the best among them. There I encountered some models returning negative R-squared values. Those models are practically useless, and, if you agree, that happens when your training data is so huge and the algorithm you are using is so insignificant, like using a multivariate regression for a heavily skewed target variable. That was the motivation behind my comment. I appreciate your taking the time to reply to my comments. I am glad it grabbed your attention, Mr. Josh.
@@statquest I asked a question about this too and I assumed you meant the best fitting line (even though it was not explicitly stated in the video), or at least one that performed better than the mean line.
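The thread above is easy to verify numerically. A minimal sketch (made-up data and a deliberately bad line): any line that fits worse than the mean pushes R^2 = (var(mean) - var(line)) / var(mean) below zero, and it can even go below -1.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 100)
y = 2 * x + rng.normal(scale=1.0, size=x.size)

def r_squared(y, pred):
    ss_mean = np.sum((y - y.mean()) ** 2)  # "var(mean)" up to a constant factor
    ss_line = np.sum((y - pred) ** 2)      # "var(line)" up to a constant factor
    return (ss_mean - ss_line) / ss_mean

good = np.polyval(np.polyfit(x, y, 1), x)  # the least-squares line
bad = -2 * x + 30                          # a deliberately terrible line

print(r_squared(y, good))  # should be close to 1
print(r_squared(y, bad))   # negative: worse than just using the mean
```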
Thanks, every question/doubt that I had instantly got answered about 10 seconds later.
BAM! :)
Ah! this is the best video explaining R squared! Thank a lot!
Thank you! :)
Working on my MPA stats final and this video has been so helpful
bam! :)
These videos are pretty cool. I can always come back and refresh concepts.
Glad you like them!
Great teachers make everything interesting! Thanks Josh
Thank you! :)
my dude I understood and I am happy
8-year-old video is this good
liked, subbed and thank you!
My dude! Thank you very much! :)
Thank you very much, your video makes this very easy to understand; it makes me want to go through the Statistics course again.
You can do it!
this is the best channel ever that can exist about statistics :D wonderful explanation and illustrations and the music! :) am glad I found this at the right time !
Thank you so much 😀
Are u glad u found it or are u asking us if u are glad??😆😆
After watching your videos, I aced my stats module!
TRIPLE BAM!!! Congratulations!!! :)
Really good explanation as to why r squared is significant in describing variation in data. Thank you!
Glad you liked it!
god bless, i have been searching high and low for this kind of video. Thank you!!!!
Hooray! :)
I've stared binge watching the entire channel :D
BAM! :)
@@statquest Do you know someone doing TDA (topological data analysis) / AYASDI Software?
Man, it's good stuff...
@@unlearningcommunism4742 I'll look into it.
@@statquest Obscure stuff, my postdoc actually, but it makes everything - better. I've personally tested a lot of things with it from images, to soil samples, languages, FTIR spectra... And it gives the edge every time.
Amazing explanation!! Made it very simple for me to understand!! :)
I went through so much content for this..thank you
Hooray! I'm glad the video was helpful.
Really enjoying your videos. Moreover everything is crystal clear and I am able to understand them. Double BAM
Hooray!!! That's great news. BAM! :)
Came here from the Pearson's correlation video. Thank you so much for this
I just wish that you could show in the video:
• how (Var(mean)-Var(line)) / Var(mean) is equal to [Covar(x,y) / (Var(x)^(1/2) * Var(y)^(1/2))]^2
• whether (Var(mean)-Var(line)) / Var(mean) using mean and differences from the x-axis also yields the same value
Again, thank you for the video
I'll keep that in mind.
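On the first bullet in the comment above, here is a numerical check (made-up data): for a simple least-squares line, (Var(mean) - Var(line)) / Var(mean) matches the squared Pearson correlation, [Cov(x,y) / (sd(x) * sd(y))]^2.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=500)
y = 1.5 * x + rng.normal(scale=2.0, size=x.size)

# R^2 from the "variation around the line" definition
pred = np.polyval(np.polyfit(x, y, 1), x)
r2_var = 1 - np.var(y - pred) / np.var(y)

# Pearson correlation built from covariance and standard deviations
# (all population-style, ddof=0, so the conventions match)
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
r = cov_xy / (x.std() * y.std())

print(r2_var, r**2)  # the two values agree
```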
Thank you, this was a life-giver! Josh Starmer, you just might have become a part of something which will be big
Wow, thanks!
This video is absolutely amazing! R^2 and R finally understood!
bam!
You know only a level of mastery can achieve this level of ease.
Thanks!
The best explaination I can find
Thanks!
The follow up on this video gives more examples on R^2 that really helped my understanding of the concept: ruclips.net/video/nk2CQITm_eo/видео.html, see around 8:03 of that video
Nice! I'm so glad you like these videos. :)
wow this is the best explanation ive ever seen, thank u!!!!!
Thank you very much! :)
Thank you for this precious material
I'm glad you like it!
(8:30) R^2 is square of R only when you are fitting a linear regression line. Apparently, the square relationship does not hold for regressions with quadratic term(s).
That is correct, because normal correlation is only defined for straight lines.
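A quick sketch of the caveat above (made-up data): when y is driven by x^2, the model's R^2 is high while the squared Pearson correlation between x and y stays near zero, so squaring the plain correlation no longer recovers R^2.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(-3, 3, 300)
y = x**2 + rng.normal(scale=0.3, size=x.size)

# R^2 of a quadratic regression, from the fitted curve
pred = np.polyval(np.polyfit(x, y, 2), x)
r2_model = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

# squared Pearson correlation between x and y (a straight-line measure)
r_xy = np.corrcoef(x, y)[0, 1]

print(r2_model, r_xy**2)  # high vs. near zero
```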
best video explaining R and R squared ever!
Thank you!
this is bizarrely useful for my exam tomorrow
Good luck! :)
hats off and thanks a lot you will make me cry. thanks once again.
Thanks! :)
Why cry mate???
Thank You Statquest your video and my knowledge of R^2 have a R^2 of 99.99
BAM! :)
Undoubtedly the best and most to-the-point explanation. Thanks a lot
Thank you! :)
totally agree!!
I'm truly grateful for your videos!
Thank you!
6:05 explains how R^2 accounts for the variation explained by the relationship
You goddamn beautiful man, I'm eating your videos like candy nowadays. I'm finishing an electrical and comms engineering degree and working with some computer science, and I usually get hammered with statistical questions when I finish presenting my models. Thanks to your uploads I've held my own against some nasty expert old-timers. Thank you for this.
Awesome! TRIPLE BAM! :)
Zedstatistics coupled with statQuest is just absolutely magnifique
Awesome! :)
Give me a hundred creators like this! Thank you for your videos! I really appreciate what you have done, and look forward to seeing more of them~
Thank you very much! :)
[6:24] "time spent sniffing a rock" xD i love this content
bam! :)
Beautiful explanation :)
Thank you! :)
Stat quest
Stat quest
Stat quest
Statquest
Triple bam!!! :)
Thanks for sharing, Sir. It helps me a lot
Glad to hear that!
Dude.. this friggin rocks.. THANK YOU!!!!!
Thanks!
Legen"wait-for-it and yeah this guy make statistics feel so easy"dary
bam! :)
woah thanks for this one too Josh! finally get R2
Triple bam! :)
Hi Josh,
Are there any videos that explain the concept of Degrees of Freedom? I find it difficult to understand. Please provide a link if there are.
Not yet. It's something I would love to do as soon as possible.
This is sooo good... wish i found this video earlier.
Glad you liked it!
Good explanation and your videos really are funny... good job
Hooray! I'm glad you like the videos and my silly jokes. ;)
Very good explanation, thanks!
Thank you! :)
Mr StatQuest I love you
Wow is this your login name?
Thank you so much, I've learned so much from you in the past week! Very grateful
Thank you very much! :)
Beautifully explained. Thank you so much.
Thank you!
B - E - A - utiful explanation. Thanks
Thank you!!! :)
One of best videos, thanks
Wow, thanks!
Gosh, this was SO helpful.
Thanks!
Great videos! What if you get a model that has an ok correlation (with a very significant P-value) but a low R^2? Can this still be meaningful? IE. There is a statistically significant relationship between the two variables, however there are probably other correlated variables as well so this model is not good for making accurate predictions, but it does let us know these two variables are correlated which can still be useful?
I'm not sure I understand your question because you can't have an OK correlation but a low R^2. The two are linked (the square root of R^2 is the correlation).
@@statquest I think this is what I was trying to get at maybe?
blog.minitab.com/en/adventures-in-statistics-2/how-to-interpret-a-regression-model-with-low-r-squared-and-low-p-values. Perhaps I have correlation and regression mixed up; I thought they were kind of the same thing. I'm trying to see if we can have a model that is a poor predictor because there is a ton of variation in the data, but where there is still a significant relationship between the dependent and independent variables. And can this still be useful?
@@scottcooke5641 Yes, you can definitely have a small p-value and a small r^2 value - if you have enough data, that's what happens. So having a small p-value is not enough to say something is interesting or biologically relevant. You need to have a small p-value and a relatively large r^2 for the result to be interesting. I talk about this in one of my other videos, but I can't remember which one. That said, if you want to learn about regression, check this out: ruclips.net/video/nk2CQITm_eo/видео.html
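The big-sample point above can be simulated (made-up data; the p-value uses a normal approximation to the t distribution, which is reasonable at this sample size): a tiny but real effect yields a vanishingly small p-value alongside a tiny R^2.

```python
import math
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
x = rng.normal(size=n)
y = 0.05 * x + rng.normal(size=n)  # real but tiny effect

r = np.corrcoef(x, y)[0, 1]
r2 = r**2

# t-statistic for testing r = 0; with n this large, the t distribution
# is essentially normal, so a normal-based two-sided p-value is used
t = r * math.sqrt(n - 2) / math.sqrt(1 - r2)
p = math.erfc(abs(t) / math.sqrt(2))

print(p, r2)  # p is tiny, and so is R^2
```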
Hi Josh Starmer, thanks a lot for the explanation. By eye we can see that the variation around the mean is higher than the variation around the blue line. At 6:15 it is mentioned that the size-weight relationship accounts for 81% of the total variation in the data. However, I feel it's the other way around: the variation around the mean contributes the higher percentage of the total variation in the data, and that is the 81% that gets reduced by considering the variation around the line. This would mean the size-weight relationship contributes 19% of the variation in the data. I have used the word "contributes" instead of "accounts for". Please correct me if I am wrong.
I think there are two potential problems with your alternative phrasing. 1) To say "height contributes 19% of the variation in weight", or even " the height-weight relationship contributes 19% of the variation in the data" is to suggest that "height" causes 19% of the variation in "weight". This may be true, but it might not. Since correlation doesn't mean causation, you could run into trouble here.
2) The other thing is about using non-standard terminology that is the reverse of what most people use - this can lead to confusion and unexpected consequences. So you could run into trouble here as well.
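To make the percentages in this exchange concrete, here is a tiny numeric sketch with hypothetical values: R^2 = 0.81 means the variation left over around the line is only 19% of the variation around the mean.

```python
var_mean = 100.0  # variation around the mean (hypothetical value)
var_line = 19.0   # variation left over around the fitted line (hypothetical)

r2 = (var_mean - var_line) / var_mean
print(r2)  # 0.81 -> the relationship accounts for 81% of the variation
```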
Thanks for the simple explanation. Much appreciated.
Thanks! :)
phew...!!! finally this concept is clear in my head :) thank you sooo much
Glad it helped!
Super useful as always. Please continue with the videos (for example, prediction interval vs. confidence interval or maybe p-values vs. randomization tests or logistic regression...)! I liked your explanation because it never occurred to me that R^2 was basically the same as calculating percent change (diff/original)x100.
Thank you, you're helping us with these videos.
Thank you very much! :)
Very clear explanation!
Glad you liked it!