Sum of squared residuals, also called the sum of squared errors, is SSE, and the sum of squares due to regression is SSR. Just make sure about this, since new students can get confused.
Y = individual data points, Yreg = predicted regression points, Ymean = average of the individual data points
SSE = sum((Y - Yreg)^2)
SSR = sum((Yreg - Ymean)^2)
so,
SST = SSE + SSR
= sum((Y - Ymean)^2)
(the split SST = SSE + SSR holds for a least-squares fit that includes an intercept)
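A quick way to convince yourself of the three sums of squares is to compute them on toy data. The sketch below is only an illustration: the data values and the names `xs`, `ys`, `y_reg` are made up, and it assumes a one-feature least-squares line with an intercept.

```python
# Hypothetical toy data; the values are illustrative only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Closed-form least-squares slope and intercept for a single predictor.
slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
intercept = y_mean - slope * x_mean
y_reg = [intercept + slope * x for x in xs]

sse = sum((y - yr) ** 2 for y, yr in zip(ys, y_reg))  # residual (error) sum of squares
ssr = sum((yr - y_mean) ** 2 for yr in y_reg)         # regression sum of squares
sst = sum((y - y_mean) ** 2 for y in ys)              # total sum of squares

r_squared = 1 - sse / sst
print(sse, ssr, sst, r_squared)
```

For this kind of fit, `sse + ssr` matches `sst` up to floating-point error, so `1 - sse/sst` and `ssr/sst` give the same R^2.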
Thank you, now understood well
@Ahmed Kellen didn't they ask money
Hey can you help me the 'N' here, is it the total number of features or the total number of data points.
@@ShashwatAgarwal007 big N is total number of population and small n is total number of samples which we take from population
Initially
N=1000
R^2=0.85
p=5 (initially)
adjusted R_Squared = 1 - ((1-0.85)(1000-1)/(1000-5-1)) = 0.8492
1. suppose a new non-correlated variable is added:
N=1000
R^2=0.86 (suppose new R^2)
p=6 (new)
adjusted R_Squared = 1 - ((1-0.86)(1000-1)/(1000-6-1)) = 0.8592
2. suppose a new correlated variable is added:
N=1000
R^2=0.92 (suppose new R^2)
p=6 (new)
adjusted R_Squared = 1 - ((1-0.92)(1000-1)/(1000-6-1)) = 0.9195
As we can notice, adjusted R_squared sits a little below R^2 each time, and the gap widens as p grows. With these assumed numbers it rises in both cases, only much less for the non-correlated predictor, because a 0.01 gain in R^2 still outweighs the penalty for one extra predictor at N=1000. It would actually fall only if the gain in R^2 were below (1-0.85)/(1000-5-1), about 0.00015, which is the kind of gain a pure-noise predictor tends to give. Hope it helps!
But how do we find out from the change in adj R^2 that the new feature is correlated?
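The adjusted R^2 arithmetic in this example is easy to re-check in code. The helper below is a minimal sketch; the function name `adjusted_r2` is just an illustrative choice, and N is taken as the number of rows and p as the number of predictors, as clarified elsewhere in the thread.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n samples (rows) and p predictors (columns)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# The three scenarios from the comment above (the R^2 values are assumed, not measured).
print(round(adjusted_r2(0.85, 1000, 5), 4))  # 0.8492
print(round(adjusted_r2(0.86, 1000, 6), 4))  # 0.8592
print(round(adjusted_r2(0.92, 1000, 6), 4))  # 0.9195
```

Plugging in your own R^2, N and p is a quick sanity check before trusting hand-computed values.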
best teacher of ML on the youtube
SSR means Sum of the Squares of the Residuals
SST - Sum of the Squares of the Total....
very informative and useful content, lucid explaination
I am glad I came across this tutorial. Very well explained !
It's very excellent and detailed explanation for a beginner!!!
Explained in detailed manner keep doing
Wow.. thanks so much Krish. This was the best explanation i found
Hi Krish, nicely explained. But I have a query. R-square will always increase, whether it is calculated against a significant or an insignificant feature. So it is not the case that R-sq will be lower for non-correlated features and higher for correlated ones; it increases blindly. So how can you say that R-adj will decrease when the added attributes are non-correlated, as R-sq will still increase, making R-adj = 1 - smaller_number? I hope my question is a bit clear. Thanks n respect sir!! (v).
I too have this doubt
Very interesting Krish. As always you stimulate us to think and learn.
I didn't get one thing that even in Adjusted R2, whether there's correlation or not is not taken into consideration. So, by just considering number of variables, how correlation issue gets addressed?
Very intuitive explanation..!!! You have been such an inspirational instructor ..!!!!
very helpful video, thank you sir
Can you please explain how the SSres will decrease as we try to add a new independent variable?
Thanks a lot Krish 🙂its really helpful
Explained very well sir, thank you sir
Thank you so much sir for your great support by making such videos.
All time never ever found these kind explanation.
I will not follow any howle heros except Sadhguru and You.
Very interesting and excellent but requested to give examples to evaluate situations
R-square means SSR/SST only, right? Why the "1 -" before that? Just asking because some Excel videos show only SSR/SST.
Nayek sir
is p the total number of independent features, or only those independent features which we have added later?
Also, can we say that N is the total number of columns in the data set?
If so, should we also count those columns which have irrelevant data, like ticket serial number or passenger name in the titanic dataset?
Sir, but if p increases then N will also increase, because they both count independent variables. So the denominator will always be zero.
N is the number of samples, not the number of predictors. For a dataframe of shape (m,n), the number of samples is m and the number of predictors is n.
beautiful explanation sirji
Sir it would be great it you can compliment this with an example
Excellent explanation.. thank u very much
Good morning sir. Please do upload a video with explanation of what exactly is p-value. Getting confused with it. I hope atleast your explanation would give more clarity.
www.wikihow.com/Calculate-P-Value
N - total sample size, indicates no of rows in the model?
Krish, R-square will increase in both cases, whether the variable is correlated with the dependent variable or not. The penalty term then pulls Adj R-Square down in both cases; only the magnitude will differ.
Hi Krish, At the end of your each sentence while explanation please make the same rhythm of the speech. What happen here is at the end of your sentence you make your voice very low so this creates confusion while listening.
please tell why SS res decrease as we increase the feature
please explain ?
Which variable in the adjusted R^2 equation is related to correlation? It is not R^2, and all the other variables have nothing to do with correlation. Is it the ratio (n-1)/(n-p-1)?
Even I have same question. There should be something more in the formula of R2 adjusted which will take correlation into account.
You said, using the 1st formula, that even if an independent feature is not related, the r^2 value increases; that was the drawback. But at 14:18 in the video you say that if the feature is not related we would get a smaller r^2 value from the 1st formula. I got confused here. Please solve my confusion. I will be glad. Please🙌🙌🙏
No. Even if the feature is not correlated to the output variable, the value of r square will increase; that's why we use the adjusted r square. If the feature is not correlated, the adjusted value can decrease....
Maybe he said that by mistake
he meant that for the same features, if they are correlated with the target variable, you will get a higher R2 value and a smaller value if they are uncorrelated.
Brother, what won't you do! How can anyone teach so simply👍
Hello sir, I am making a project on income and health expenses, my r-squared value comes out less than 1%. What should i interpret from this? Should i change my linear model or try other? What should i do?
you should add another feature which is correlated to the target variable. Low R-squared means that your independent feature and target variable are not correlated. You can confirm this by computing the correlation between them
Awesome video and explaination
In adjusted r2, there is r2.
But whether the feature is correlated or not, the r2 value will increase, so how are we able to say anything about adjusted r2?
Good explanation, but it would be better to add an example. That way it will become more clear :)
Please see if this could help you
ruclips.net/video/3SoK930HWL0/видео.html
i am not 100% sure if this is correct when you say it needs to be squared (Actual - Predicted) because of negative value but i suspect its for the outliers
Great explanation Thank you
Thank you sir u made the things veery easy
Well Explained
Nicely explained... Can you help me with difference between Sum of Residual and Cost function? Looks like both have same formula.
Actually both are the same. The sum of residuals is the sum of squares of the differences between predicted and actual data points, and the cost function is the same thing, usually up to a constant factor such as 1/n or 1/(2n).
@@ayushmishra-sw4po Thanks Ayush!!!
Sir SSR means sum of squares of residuals.
very well explained
Didn't understand anything
hi krish,
if we add features with high error then the SSres increases , but if we add features with low error then SSres decreases
Can you suggest good book for Machine Learning ?
What does this mean that R square will always increase when feature is added. This means when features are increased predictions are better. Is it so?
No bro, That will depend whether the features getting added are correlated or not. If the features getting added are not correlated with the target variable then the adjusted R square will decrease, however if they are correlated then naturally adjusted R square will also increase.
Adding more features will automatically increase the r square, because each added feature decreases the value of SSres (or at least never increases it), even if the feature is not related to the output variable. A model with many added features can perform better in sample than when tested out of sample. So in such cases adjusted r square helps.
Good day sir, I just wanted to ask: if an independent variable is not significant, or does not have explanatory power in the model, but removing it lowers the adjusted r-square, what does this imply? So far the only reason I know is that its t-statistic is greater than one in absolute value. With this information, what can we infer?
Could you please explain with any example from scratch with multi output in regression?? I want to predict 2 output (distance travelled and velocity) from the dataset.
Sir, as you said, we square the terms in SSres and SStot in order to avoid negative values in the residuals. But sir, if we take the modulus (absolute value) of both instead of squaring them, what will be the change in the R values?? On squaring, the R value gets larger and reaches towards 1 more easily, which suggests our model has fitted well. Please answer sir.
Let's say I have 10 features and some R square value is calculated. Later it is found that 4 of the features are uncorrelated with the target. Now the 1-R2 value is not going to change, and neither is the adjusted R2 value. Can u correct me if I'm analyzing it wrong? I'm assuming it follows the simple linear regression model, not the lasso.
Well done
What are these 33 dislikes for ? Is your language different :-D, Awesome explanation Krish, hats off
maybe in search of hindi content
Sir, at the end of the video you said that r^2 will never decrease when independent features are added, even if the feature is not correlated. Then how can you say that adjusted R^2 will decrease when R^2 is less (at 14:16)? That can never happen if R^2 is always increasing, so how can it be less? It has actually confused me. Plz help if anyone knows.
Yup I also have the same problem
1) If added features are correlated with the target, R2 grows much faster than the penalty from the denominator term containing the number of features (p). Hence Adj. R2 also increases.
2) If added features are not correlated or only weakly correlated with the target, then R2 grows slower than that penalty. Hence Adj. R2 increases only a little, or falls when the gain in R2 is too small to offset the extra feature. That is what is called being penalized: it is not allowed to grow at the same rate as in the correlated-features case.
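Whether adjusted R2 rises or falls after adding one predictor comes down to a break-even gain in R2 that follows directly from the formula: adjusted R2 goes up only if R2 grows by more than (1 - R2_old)/(N - p_old - 1). The sketch below just checks that algebra; the function names and the values of N, p and R2 are assumptions for illustration.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n samples and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def break_even_gain(r2_old, n, p_old):
    """Smallest R^2 gain from ONE extra predictor that keeps adjusted R^2 from falling."""
    return (1 - r2_old) / (n - p_old - 1)

n, p, r2 = 1000, 5, 0.85  # assumed values for illustration
gain = break_even_gain(r2, n, p)
before = adjusted_r2(r2, n, p)

print(gain)
print(adjusted_r2(r2 + 2.0 * gain, n, p + 1) > before)  # gain above break-even: rises
print(adjusted_r2(r2 + 0.5 * gain, n, p + 1) < before)  # gain below break-even: falls
```

With many rows the break-even gain is tiny, which is why a weak but real predictor still raises adjusted R2 on large datasets.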
what are possible interpretations and justifications for low r square values in management science?
Wonderful Explanation !!
This is the problem with our education system...everything is just formula based...you started off with the formula without even giving any intuition about what actually R2 and adjusted R2 mean...what does a 50% R2 tell you...formula and maths always come last...you should first make your students visualize what these terms mean without using any maths at all...once they are good with it...then you bring the formula
Hey, I didn't get the term Penalizing. In the video just before explaining Adjusted R square, it was said that "it is not Penalizing the new added features". Can someone please elaborate.
Great explanation Sir!
Fantastic course!. I hope you doing well sir .
Thank you Krish that's the good explanation.
If I have 10 features and if I need to know which feature is affecting output y and which is not affecting y. Do I need to find correlation between y and each feature separately. If yes , then how? If not , then what to do? Krish please reply. Thanks
You can do EDA: make a pairplot, check the correlations and put them on a heatmap, and later you can apply a machine learning algo
@@deepakgehani thanks a lot. I will apply this and revert back to you in case I face any other issue. Thanks again
You need to perform a chi-square test if both input and output variables are categorical, ANOVA for a categorical/continuous pair, and finally Pearson correlation if both are continuous ...!!!
Loop over all the variables and check the correlation of each one with y.
You have many ways to find out: firstly, you can find the correlation between them using a heatmap or the corr method; secondly, you can find the VIF value of the features; lastly, you can check your standard errors by using the OLS method.
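The "loop over the variables and check correlation" idea from the replies above can be sketched in plain Python. This is only an illustration for continuous features: the feature names and numbers are invented, and a feature-by-feature correlation is a first screen, not proof that a feature matters once the others are in the model.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    a_mean, b_mean = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b))
    spread = sqrt(sum((x - a_mean) ** 2 for x in a) * sum((y - b_mean) ** 2 for y in b))
    return cov / spread

# Hypothetical continuous features and target (made-up values).
features = {
    "age":    [22, 35, 47, 51, 63, 28],
    "income": [30, 52, 61, 75, 90, 41],
    "shoe":   [42, 38, 44, 39, 41, 43],
}
y = [1.1, 2.0, 2.6, 3.1, 3.9, 1.5]

# Correlation of every feature with the target, strongest first.
correlations = {name: pearson(values, y) for name, values in features.items()}
for name, r in sorted(correlations.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {r:+.3f}")
```

If the data already lives in a pandas DataFrame, the corr method mentioned above does the same job in one call.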
Thanks .. Explained beautifully
HOW U TOOK AVERAGE LINE IN GRAPH (ON WHAT BASIS?)
It's simply the arithmetic mean of target variable's "actual" values.
i just wanna know this total sample size is total number of columns or total number of rows
sample size is total number of rows. predictors are total number of columns
What do we do next if we get to know that r-square is small ? Yeah it says the model isn't a good fit but is there any way we can improve the model after getting to know the r squared is less or we use some other method to solve this model
Hyperparameter tuning
very useful video
How we can say adj r square is significant or not
Sir, what is the meaning of penalize in terms of machine learning?
Here penalize means we are adding an extra predictor which is of no use, so it will decrease the value of adjusted R sq
@@ayushmishra-sw4po thank you so much
Thank you sir🙏
Can R square be considered as training accuracy?
yes, it is a performance metric. in practice, adjusted r-score is used more often
Since R Square is the squared value of r, then how it will get a negative value.
R square always 0 to 1. It will never ever be a negative number
There is no such value of R, only R Square is the terminology used for this formula. Check out the formula for R Square.
R is the Correlation Coefficient
R squared can be a negative value if the model is worse than average best fit line.
what is the meaning of penalize
Why is the r2 value not decreasing when features are increasing? Is there any theory behind it?
yes. least squares can always set the new feature's coefficient to 0 and get the old fit back, so SSres can only stay the same or go down; r2 therefore either remains the same or increases.
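The claim that R^2 cannot drop when a feature is added can be tried out with a small simulation. The sketch below is a hand-rolled least-squares fit in plain Python (normal equations solved by Gauss-Jordan elimination); the helper names `solve` and `r_squared`, the seed, and the simulated data are all assumptions made for this example.

```python
import random

def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(columns, y):
    """Fit OLS with an intercept via the normal equations and return R^2."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y)
    return 1 - ss_res / ss_tot

random.seed(0)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
noise_feature = [random.gauss(0, 1) for _ in range(n)]  # unrelated to y by construction
y = [3.0 + 2.0 * a + random.gauss(0, 1) for a in x1]

r2_one = r_squared([x1], y)
r2_two = r_squared([x1, noise_feature], y)
print(r2_one, r2_two)  # the second value is never smaller than the first
```

The increase from the noise feature is typically tiny, which is exactly the kind of gain adjusted R^2 is meant to discount.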
Hi krish can u please suggest how to explain the algorithm in interview
are they ask algorithm in interview
@@bhavyaparikh6933 yes
If these two are different, then why does everyone say that r-square and adjusted r-square are the same, and why do we always look at the adjusted r-square in the output?
R-Squared and Adj R-Squared are NOT the same.
For Simple Linear Regression, the R-Squared and Adj. R-Squared values will almost be similar. You can just check the R-Squared value to evaluate your model's goodness of fit.
For multiple Linear Regression, you will find that no matter what, the R-Squared value will keep increasing as you add new features (even if the new feature is not correlated to the dependent variable). This leads you to believe that the new feature (independent variable) you've added is contributing to building a better model, which is not the case. The adjusted R-Squared function provides a penalty mechanism that reduces the overall value if the new feature is not contributing to the model. This metric is usually considered to evaluate the goodness of fit (in the case of Multiple Linear Regression), especially when you're using a Feature Selection method like Step-Wise Regression.
Still not clear for me, can anyone help me out.
In case of an un-correlated or correlated variable, if p increases then N will also increase and R2 will obviously increase, so how is it penalizing?
N is constant here because it's the number of samples, whereas p is the number of predictors.
Awesome
thank you so much...It helped
In which condition, SSR will be greter than SST?
As we increase the number of independent feature the value of SSR will increase
If the model prediction is worse than the average prediction we have assumed in SST
superb
very well explained, thank you sir.
Little Confusing for the use of Adjusted Rsquare !.. So when we add more independent variables to model, the Rsquare will always make sure to increase, then Adjusted Rsquare checks if independent variables is not correlated to the target variable and minimize Rsquare value.
Does that mean while feature selection, we should take those independent features that are correlated to target/output variable and drop other..?
Aren't we supposed to take those independent variables in model that are not correlated with each other and they are independent, so why penalizing them which are not correlated !! For independent variables that are correlated, we could drop them !
Amazing!!!!!
not a satisfactory explanation as to how R adjusted takes care of non correlated value, just hacking the formula doesnt make it very clear. The intuition and the reason for adding sample size is not explained properly.
Overall not a good explanation
When will you stop saying "particular"?
Correct yourself R-squared = SumSquareRegression/SumSquareTotal and this entity cannot be negative.
SST = SSR + SSE.
So SST > SSE, and there is no chance of R-squared being negative. This is what happens when you teach without a good understanding of the concepts behind them. You have more than 150K subscribers, so do not mislead them
From mathematical stand point R-square is the ratio of variation explained due to the model to variation in the data
R^2 compares the fit of the chosen model with that of a horizontal straight line (the null hypothesis). If the chosen model fits worse than a horizontal line, then R^2 is negative. Note that R^2 is not always the square of anything, so it can have a negative value without violating any rules of math. R^2 is negative only when the chosen model does not follow the trend of the data, so fits worse than a horizontal line.
Example: fit data to a linear regression model constrained so that the Y intercept must equal 1500
i.stack.imgur.com/CHpzE.png
The model makes no sense at all given these data. It is clearly the wrong model, perhaps chosen by accident.
The fit of the model (a straight line constrained to go through the point (0,1500)) is worse than the fit of a horizontal line. Thus the sum-of-squares from the model (SSreg) is larger than the sum-of-squares from the horizontal line (SStot). R^2 is computed as 1 - SSreg/SStot. When SSreg is greater than SStot, that equation computes a negative value for R^2.
With linear regression with no constraints, R^2 must be positive (or zero) and equals the square of the correlation coefficient, r. A negative R^2 is only possible with linear regression when either the intercept or the slope are constrained so that the "best-fit" line (given the constraint) fits worse than a horizontal line. With nonlinear regression, the R^2 can be negative whenever the best-fit model (given the chosen equation, and its constraints, if any) fits the data worse than a horizontal line.
Bottom line: a negative R^2 is not a mathematical impossibility or the sign of a computer bug. It simply means that the chosen model (with its constraints) fits the data really poorly.
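The constrained-intercept example is easy to reproduce numerically. The sketch below uses made-up data and computes R^2 as 1 minus the model's sum of squared residuals over the total sum of squares, once for a horizontal line at the mean and once for a line whose intercept is forced to 1500; the variable names are illustrative only.

```python
# Hypothetical data that trends upward from 10 to 20.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [10.0, 12.0, 15.0, 17.0, 20.0]

y_mean = sum(ys) / len(ys)
ss_tot = sum((y - y_mean) ** 2 for y in ys)

def r_squared(preds):
    """R^2 = 1 - (sum of squared residuals of the model) / (total sum of squares)."""
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return 1 - ss_res / ss_tot

# Deliberately wrong model: intercept forced to 1500, slope then fitted by least squares.
forced_intercept = 1500.0
slope = sum(x * (y - forced_intercept) for x, y in zip(xs, ys)) / sum(x * x for x in xs)
bad_preds = [forced_intercept + slope * x for x in xs]

print(r_squared([y_mean] * len(ys)))  # horizontal line at the mean: 0.0
print(r_squared(bad_preds))           # far below zero
```

An ordinary least-squares line with a free intercept on the same data can never do worse than the horizontal line, which is why its R^2 stays between 0 and 1.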
This person has put in a great degree of time and effort, which is an indication of his passion. The reason he has 150K subscribers is that the followers are able to make sense of what he is saying. And dude, logically what will he gain by misleading them. Is he preaching some religion???? I checked your YouTube channel...surprised that you are commenting without having uploaded a single video?? I recommend that first of all we learn to appreciate the person and even if there is a mistake in something he is saying (to err is human!), let's show some humility in pointing it out.
@@jagannathgirisaballa Hi I understand that you no idea about ML or stats. I dont need videos to be uploaded to comment on others videos. Anyway I have Phd in ML/Computer Vision. I dont want get into fight with you . Chill and follow his Videos.
Buddy chill...whatever I explain is based on the practical experience...so that means I have proof of everything I do. Any how u r highly qualified, I think u should share your knowledge with everyone...I would also love to see some implementations from your end..and Yes I do not mislead anyone..You can check my linkedin profile, and these videos have helped people to clear interviews. Anyhow it has not helped you, I am sorry about it. So in conclusion misleading is a very wrong term to use over here. Being a highly qualified guy like you, it doesn't suit you at all. Cheer stay safe and healthy. I would also suggest u to go through this link
stats.stackexchange.com/questions/12900/when-is-r-squared-negative
@@machinelearningchefs3525 bro, I will be the first person to accept that I have no idea of ML or stats. And that's my excuse of being here and watching the video. So, bro with a PhD, whats your excuse of being here and watching the video? Checking out the opposition? :-) anyways, peace brother. I am here for learning and would love to learn from anyone..apologies if my comment hurt your feelings. not intentional.
Thank you Krish. Nice explanation.
Very well explained
Thankyou so much sir
Thanks...very well explained.