R-squared, Clearly Explained!!!
- Published: 17 Nov 2022
- R-squared is one of the most useful metrics for understanding how two quantitative things, like weight and height, are related.
If you'd like to support StatQuest, please consider...
Patreon: / statquest
...or...
RUclips Membership: / @statquest
...buy my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
statquest.org/statquest-store/
...or just donate to StatQuest!
www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/ joshuastarmer
#StatQuest
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
👍
Hi, Josh! I just wanted to say thank you for these videos! The way you explain concepts has been honestly life changing for me (in terms of my academic career). Concepts that I've struggled with for years are finally becoming clear. I just wanted to take a moment to express my appreciation, and let you know how impactful these videos are!
@@DrOats22 Thank you very much! :)
This is such a breath of fresh air as opposed to the unnecessarily difficult 'explanations' we have to work with in statistical analysis courses. Your videos are awesome.
Wow, thank you!
Your videos are the single greatest resource for my education on machine learning and AI. If I lost access to your videos, I would be devastated.
Glad you like them!
Yes!! Thanks for this. You are saving grad students around the world!
Happy to help!
And former grad students who haven't touched linear regression in 25 years! :) What a great concise refresher. BAM!
Beautifully explained! Loved the “Correlations close to 0 are lame “😂
:)
One of the best explanations of R-squared, thanks for sharing! No time wasted in this video!
Thank you!
It's INSANE how clear this is, thank you!
Thank you! :)
This is just what I was expecting from an explanation of what R-squared is. Thank you very much for making it clear and simple
Glad it was helpful!
clicked for the title, stayed for the content. thanks for this
bam!
Excellent vid & totally helped me again with my regression homework! One of the toughest challenges I have is writing and speaking Regression! One of your last slides around 10:29 helped me learn how to connect a positive / negative variable relationship with R2...love you guys, seriously!
Glad it was helpful!
Thank you so much!!! You explain these concepts so easily!! Saving lives one video at a time 😁💕
Thank you!!! :)
Your videos are the most helpful and easiest to follow!
Glad you like them!
Incredible explanations. I'm so glad I found this channel/book!
Thank you!
Josh, I'm literally teaching my students this today! Going to refer them to this video.
BAM! Avery, I'm glad this is helpful. This is actually the first StatQuest I ever made, back in the day. I had to re-upload it yesterday due to some oddness on behalf of RUclips, but it's still a classic and the video that got the whole thing started.
When I saw "is the mean weight the best way to predict mouse weight", I thought, "it is stupid". And then when I saw the formula for R-squared, I found that "I was stupid". Awesome videos, and they really help.
bam!
Thank you so much for explaining everything in easier way !
Thanks!
mind blown. amazingly well explained thank you!
Thank you!
Thank you for this video! I have a much better understanding now
Glad it was helpful!
thank you so much, subscribing right now!
Thank you!
This was wonderful. Thank you so much!
Glad you enjoyed it!
All stats courses, at any level of education, should be taught like this. Otherwise, for the majority of people, stats is ambiguous and difficult to understand. But I feel like lecturers say this is too time consuming, that they have a lot of topics to cover, etc. Luckily we have a nice RUclips channel and online documents to supplement the courses. Thanks for the great video!
Thank you very much! I appreciate it.
This is excellent. Why can't professors explain as well and clearly as you? I had a linear regression class yesterday and I had never even heard about variation before, only standard deviation. I didn't know the reason it was squared either. Thanks a lot
Thanks!
people have no idea how much of a gold this video is
Thank you! :)
Just awesome plain explanation 🎉
Thank you!
Great clear explanation! Thanks!
Glad it was helpful!
Thank you UNC-Chapel Hill for saving my life on my AP Stats test. I hope my EA is accepted.
BAM! Congratulations and good luck!
This is a good video. Funny, yet informative.
Glad you enjoyed it!
That's so intuitive! You really save my Midterm
Thanks!
Holy mother of god THANK YOU for this video, I was looking online at a bunch of websites (some paywalled) and none of them explained them as well as this video. Thank you for providing examples and explaining the how rather than the what.
😁😁
Glad I could help!
Very clear and helpful, thank you
Thanks!
Such a beautiful explanation. Thank You! :-)
You're very welcome!
Very clearly explained. Thank you
Thank you!
Thank you. Very useful.
Glad it was helpful!
very clear and concise
Thanks!
Excellent explanation. Consider this comment as 1million likes.❤❤
Thank you very much! :)
Banger intro, man
Thanks!
Thank you so much and thank you UNC Chapel Hill for enabling you to make these
bam! :)
You keep this up and I’ll have to forward my tuition to your address.
BAM! :)
Thanks for reposting this precious R-squared explanation. Yesterday I couldn't play this module because of payment bla bla bla bla. Super thanks!
Sorry you had trouble and I hope it never, ever happens again. It was very, very frustrating from my end since I've tried so hard to make my videos free for the world.
StatQuest is the best thing to come out of UNC since MJ
TRIPLE BAM! :)
just beautiful!!
Thank you!
Ty for this video ! Especially at 10:30
bam! :)
this makes sm sense tysm
bam! :)
yay more new videos ☺️
:)
I can't believe these videos are brand new. I'm sorry for everyone who had to take Statistics without watching these first
BAM! :)
Awesome!!!
Thanks!!
🤣🤣 The Intro . I'm enjoying stats thanks to you
:)
Thank you so much.
Thanks!
Starmer = Hero
Thank you! :)
Hi Sir
I am madly addicted to your WAY OF EXPLAINING
I personally owe you a lot
I love math, the way you quest it
Recently I was researching DEA (data envelopment analysis), as you surely know.
I now know what it means and how to calculate it. I can even code it in Pyomo and use it blindly ...
but
WHAT IS THE MAIN IDEA BEHIND DEA?
Clearly Explained...
searched the web
there is no remarkable article or video etc
I was thinking if you could make such genius video
I'm glad you like my videos and I'll keep that topic in mind.
thanks bro
Any time!
Sometimes a single video is better than a whole pdf
:)
Thanks for the nice explanation. I wonder what the difference is between the R2 formulation you explained and this one: R2 = 1 - SSE / SST, where SSE is the sum of squared errors and SST is the total sum of squares (the variation in the data around its mean).
There is no difference. One formula can be derived directly from the other.
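
To spell out the algebra behind that reply, here is a short sketch. It assumes the video's formula is R² = (Var(mean) − Var(line)) / Var(mean), as other comments on this page describe it, and that SST corresponds to the sum of squares around the mean and SSE to the sum of squares around the fitted line. The SS(·) notation is introduced here for illustration only.

```latex
R^2
  = \frac{\mathrm{SS}(\text{mean}) - \mathrm{SS}(\text{line})}{\mathrm{SS}(\text{mean})}
  = 1 - \frac{\mathrm{SS}(\text{line})}{\mathrm{SS}(\text{mean})}
  = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}
```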
You are the boss
Thanks!
Ty
:)
Hi, thanks for your videos! Is there any chance there's a StatQuest for adjusted R-squared?
I mention it in my video on linear regression: ruclips.net/video/nk2CQITm_eo/видео.html
Stat Quest ✊
bam! :)
Cool !!
Thanks!
DOUBLE BAM!!!
YES!
Nice
Thanks!
Thanks ! Ques: is R squared the % of y variance explained by X or explained by the model( regression equation) ?
It depends on the model. If the model only contains a single variable, X, then R-squared tells us the % of variance explained by the model, or X. Both are true. However, we can also calculate R-squared for models with many variables. For details, see: ruclips.net/video/nk2CQITm_eo/видео.html and ruclips.net/video/zITIFTsivN8/видео.html
Hi Josh, can you also explain the F test?
Sure, see: ruclips.net/video/nk2CQITm_eo/видео.html and ruclips.net/video/NF5_btOaCig/видео.html
Hi, I see a lot of your Analytics videos are repeated. Are these refreshed with new info or simply repeated?
Do I need to watch both or just the newest one?
They are the same. For some strange reason, about a year ago some of my videos got stuck behind a paywall. So I re-uploaded all of the videos behind the paywall so that they would, once again, be available to everyone for free. It now seems that whatever freak event happened back then has become undone, so now I have 2 copies of a handful of videos.
I have a question: in some cases I get an out-of-sample R squared which is negative, for example with multiple linear regression or even simple one-variable linear regression. Does that tell me the model is less capable of predicting the response compared to a simple mean? And in sample, is there no difference between the R squared of a simple linear regression and the square of Pearson's correlation between the two variables?
I'm not sure I understand what you mean by "out of sample" and "in sample", but if you are calculating R^2 using data the model was not originally fit to, then it is possible to get negative values.
@@statquest ah I see!
I meant that sometimes I would fit a model on a training set, and among the metrics to evaluate its performance on a dev/test set I would use the R squared, occasionally obtaining negative values. But I see now that it's a pretty different scope compared to the one proposed in your video, since I'm not trying to measure how related two variables are, but rather trying to evaluate a model! Thank you for your reply btw!!
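
A minimal sketch of the situation described in this thread: a line is fit to a training set, and R² is then computed on data the line was not fit to. The data are made up purely for illustration (the test set deliberately reverses the training trend), and the helper names `fit_line` and `r_squared` are hypothetical, not from the video.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(ys, predictions):
    """R^2 = 1 - SS(line) / SS(mean), evaluated on the given ys."""
    mean_y = sum(ys) / len(ys)
    ss_mean = sum((y - mean_y) ** 2 for y in ys)
    ss_line = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    return 1 - ss_line / ss_mean


# Made-up data: the test set has the opposite trend to the training set.
train_x, train_y = [1, 2, 3, 4], [1, 2, 3, 4]
test_x, test_y = [1, 2, 3, 4], [4, 3, 2, 1]

slope, intercept = fit_line(train_x, train_y)
r2_train = r_squared(train_y, [slope * x + intercept for x in train_x])
r2_test = r_squared(test_y, [slope * x + intercept for x in test_x])

print(r2_train)  # 1.0: a perfect fit on the data the line was fit to
print(r2_test)   # -3.0: worse than predicting the test-set mean
```

A negative value here just means the fitted line leaves more squared error on the new data than a horizontal line at that data's mean would.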
Time spent sniffing a rock 🤣🤣🤣
:)
Is variance different from variation? At 2:15 we find the sum of the squared differences but we don't divide it by the number of observations - 1. Is there a reason for this?
In this case we don't need to divide by n-1 because the denominators will cancel out, leaving us with just the numerators. So we save ourselves a step and omit it.
@@statquest Thank you! It's so obvious now that you pointed it out lol
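
The cancellation can be written out as follows. This is a sketch that assumes the same n observations are used for both the mean and the fitted line; SS(·) denotes a sum of squared differences and is notation introduced here, not taken from the video.

```latex
R^2
  = \frac{\dfrac{\mathrm{SS}(\text{mean})}{n-1} - \dfrac{\mathrm{SS}(\text{line})}{n-1}}
         {\dfrac{\mathrm{SS}(\text{mean})}{n-1}}
  = \frac{\mathrm{SS}(\text{mean}) - \mathrm{SS}(\text{line})}{\mathrm{SS}(\text{mean})}
```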
BAM!
:)
r^2 = R^2 holds only for simple linear regression as far as I know; please correct me if I am wrong.
Yep. That's what this video was originally intended to explain - how R^2 relates to linear regression. That's why we compare the fitted straight line to a horizontal line at the mean.
@@statquest Thanks
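
A quick numerical check of that claim for simple linear regression, using made-up data. The variable names are illustrative; Pearson's r and the least-squares line are computed by hand so the sketch needs only the standard library.

```python
import math

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Pearson's correlation coefficient, squared.
r = sxy / math.sqrt(sxx * syy)
r_squared_from_correlation = r ** 2

# R^2 from the least-squares line: 1 - SS(line) / SS(mean).
slope = sxy / sxx
intercept = mean_y - slope * mean_x
ss_line = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
r_squared_from_fit = 1 - ss_line / syy

# Both values are approximately 0.64 for this data.
print(r_squared_from_correlation, r_squared_from_fit)
```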
I love Statquest videos however, this video had me confused. I tried to study R-Squared from other sources and they told me a different formula which was,
R squared = 1-(SSR/SST). Are there different kinds of R squared used in different situations?
It's the same formula, just written differently. However, you can do the algebra and show that they are equal to each other. See: en.wikipedia.org/wiki/Coefficient_of_determination
@@statquest Thanks. Thats helpful. I will try that.
At 10:00, "explains 25% of the original variation" means 25% less variation compared to that around the mean line, right?
Is the coefficient of correlation the square root of the coefficient of determination? 🙂
Yep, 25% less variation around the regression line than around the mean.
How did he get the var(mean) of 32 and the var(line) 32? are they just points?
Var(mean) and var(line) are numbers calculated as sums of squared residuals. For example, for var(mean), you find the difference between the mean and every point, square those differences, and then sum them up. In the video, this comes out to 32. Similarly, for var(line), you find the difference between each point and the line, square those differences, and sum them.
You can also see: ruclips.net/video/SzZ6GpcfoQY/видео.html
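
The calculation described in this thread can be sketched as below. The data are made up (they are not the mouse data from the video, so the result is not 32), and the line's slope and intercept are simply assumed to be given.

```python
import math

# Made-up points; not the data from the video.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

# An assumed fitted line (this happens to be the least-squares line here).
slope, intercept = 0.8, 0.6

# var(mean): squared differences between each point and the mean, summed.
mean_y = sum(ys) / len(ys)
var_mean = sum((y - mean_y) ** 2 for y in ys)

# var(line): squared differences between each point and the line, summed.
var_line = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# R^2 as the fraction of the variation around the mean that the line removes.
r_squared = (var_mean - var_line) / var_mean

print(var_mean)   # 10.0
print(var_line)   # approximately 3.6
print(r_squared)  # approximately 0.64
```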
Please explain adjusted r square also
I describe adjusted R-squared in my video on linear regression, here: ruclips.net/video/nk2CQITm_eo/видео.html
Why is 4 months ago potato quality? Thank you so much for this.
What time point in the video, minutes and seconds, are you asking about?
@@statquest apologies, it was my attempt at humour. I'm sure it's part of your earlier series that you've re-uploaded recently. The video is fantastic in content.
If I only know the angle between the two lines, Will I be able to find the R2 value? (Like Tan theta?)
No.
Can you make a video explaining ETA squared?
I'll keep that in mind.
So there's a 6% correlation between sniffing rocks and a mouse's weight? Lol
:)
The square of the correlation coefficient (i.e., between predicted and true values) is equal to "R squared" only in linear regression, and not in other kinds of regression like a decision tree regressor or a support vector regressor. Is THIS not mentioned in the video?
That is correct. When I made this video, way back in early 2015, I only had linear regression in mind.
Does the blue line always fit better than the mean? Why or why not?
For linear regression, the blue line always fits at least as well as the mean. This is because the optimal slope value will be set to 0 if setting it to anything else makes things worse. For details, see: ruclips.net/video/nk2CQITm_eo/видео.html
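
A small sketch of that worst case, with made-up data and a hypothetical helper `compare`. In the first data set the x values carry no linear information about y, so the least-squares slope comes out as 0 and the line coincides with the mean; in the second there is a trend, and the line does better than the mean.

```python
def compare(xs, ys):
    """Return (slope, SS around the mean, SS around the least-squares line)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_mean = sum((y - mean_y) ** 2 for y in ys)
    ss_line = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, ss_mean, ss_line


# Made-up data with no linear trend: the fitted line is flat, at the mean.
flat_slope, flat_ss_mean, flat_ss_line = compare([1, 2, 3, 4], [1, 3, 3, 1])
print(flat_slope, flat_ss_mean, flat_ss_line)  # 0.0 4.0 4.0

# Made-up data with a trend: the fitted line beats the mean.
trend_slope, trend_ss_mean, trend_ss_line = compare([1, 2, 3, 4], [1, 2, 4, 5])
print(trend_ss_line < trend_ss_mean)  # True
```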
Nice video, but Is var(x) supposed to be the variation or the variance?
Variation and variance are often used interchangeably and, in this case, it's OK.
💚
:)
This is great. Can I get a BAM!!! ??
bam! :)
Noice 👍 Doice 👍 Ice 👍, ....wait, is this a re-upload?
Yes. Without telling me, RUclips put the original behind a paywall, so I re-uploaded it so it would still be free.
@@statquest oofty doof oof oof, Noice 👍 Thanks 👍
This is a re-upload from 8-years ago.
Yep. For some reason the original ended up behind a paywall, so I had to re-upload it.
is this a repost Josh?
Yes. Something weird happened to the original and now it is behind a paywall. I contacted RUclips and they said there was nothing I could do about it, so I had to re-upload. Sorry for the trouble.
@@statquest On another note... what would you think of Statquest en Español? (pum!, the most Spanish onomatopoeia for bam!) I could help with the translation
@@rubenestebangarciagomez7040 I think it would be great and it's a dream of mine that I want to come true. I've even been trying to learn spanish on my own (but I'm a slow learner). For StatQuest, I've been using AI to create overdubs for my new videos and I think it is OK. If it's good enough, the cool thing is that it can be used for a ton of different languages.
@@statquest I'll try to contact you later. Even will try to sing and play ukulele intros...
mate can u update the resolution please.
Unfortunately updating old videos is a lot harder than you would expect. :(
Time spent sniffing a rock 😂😂😂
bam! :)
First! Bam
:)
why does this video only have the resolution of 360p?
It's super old, but people still watch it a lot.
How do I get access to watch some of the videos labeled "Pay to watch" such as ruclips.net/video/nk2CQITm_eo/видео.html? Do I have to become a certain level member or just pay for the video itself?
I've contacted RUclips and am trying to do everything I can to fix this problem. In the meantime, I've re-uploaded that video so that you can still watch it for free: statquest.org/video-index/ NOTE: Whenever you see a note saying you have to pay to watch a video, just scroll down to the first pinned comment and you will see a link to a free version.
I hate to be a smart ass but I think you are wrong, R^2 COULD BE NEGATIVE. A simple example: if you have a very bad regressor that is way too far away from all the training points, then its variance could be very, very large, so the variance of the mean minus the variance of the model could be negative. The video here is very misleading.
You are correct. However, when I made this video I was thinking of R-squared only in the context of linear regression, and in that context, R^2 can't be negative. In that context, the worst your model can do is the mean of the y-axis variable.
i'm in love with you
:)
Why on earth is this 360p
It's pretty old.
i thought this would be about topology lol
:)
Why do we Sq R? I need more explanation please :(
I'm not really sure if I understand your question. What time point, minutes and seconds, are you asking about?
@@statquest wow thanks for the quick reply lol. So when you said we square R so the negative doesn’t cancel out the positive, could you give some examples of that?
@@tdawg6795 We square the difference between the lines and the actual values. So if the y-axis value for point A is 5 and the y-axis coordinate on the line is 3, then the difference 5-3 = 2. However, if the y-axis value for point B is 1, and the y-axis coordinate on the line is 3, then the difference, 1-3=-2. Now, if we just added those differences together, 2 + -2, we would get 0. And that would make it seem like both points, A and B, were on the line, rather than above and below it. So, instead we add up the squares: 2^2 + (-2)^2 = 8, and that makes it seem less like the points were on the line.
@@statquest thank you, I believe I’m imagining the visuals correctly in my head. Please make a video as a sort of deep dive for viewers like me who question everything? :)
@@tdawg6795 Try this one: ruclips.net/video/SzZ6GpcfoQY/видео.html
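
The arithmetic from the reply above, written as a tiny sketch. The numbers are the ones given in that reply; the variable names are illustrative.

```python
line_y = 3      # y-axis coordinate on the line
point_a_y = 5   # point A sits above the line
point_b_y = 1   # point B sits below the line

differences = [point_a_y - line_y, point_b_y - line_y]  # [2, -2]

# Adding the raw differences lets the positive and negative cancel.
sum_of_differences = sum(differences)

# Squaring first keeps both distances in the total.
sum_of_squares = sum(d ** 2 for d in differences)

print(sum_of_differences)  # 0
print(sum_of_squares)      # 8
```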
Wait, why does this video have such low quality, why does your voice sound like a robot the whole time, and is this an automatically generated reply?
This is really me, Josh Starmer, replying to your comment. This is the first "StatQuest" video I ever made. I used the built-in microphone on my laptop and I recorded it in one of those "study rooms" in the library. So the sound is off and the quality isn't very high.
@@statquest wow, so thats why
@@ltd5480 yep
why are you so angry
Huh? Do I sound angry? If so, I apologize. I'm not angry. I actually have a lot of fun creating these videos.
This made no sense to me
What time point, minutes and seconds, was confusing?