Deriving the least squares estimators of the slope and intercept (simple linear regression)
- Published: Sep 29, 2024
- I derive the least squares estimators of the slope and intercept in simple linear regression (using summation notation, and no matrices). I assume that the viewer has already been introduced to the linear regression model, but I do provide a brief review in the first few minutes. I assume that you have a basic knowledge of differential calculus, including the power rule and the chain rule.
If you are already familiar with the problem, and you are just looking for help with the mathematics of the derivation, the derivation starts at 3:26.
At the end of the video, I illustrate that sum (X_i - X bar)(Y_i - Y bar) = sum X_i(Y_i - Y bar) = sum Y_i(X_i - X bar), and that sum (X_i - X bar)^2 = sum X_i(X_i - X bar).
There are, of course, a number of ways of expressing the formula for the slope estimator, and I make no attempt to list them all in this video.
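The identities mentioned in the description can be checked numerically. The following is a minimal sketch, not part of the video: the data values in `x` and `y` are made up purely for illustration, and the variable names are hypothetical.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 7.0, 10.0]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Three equivalent forms of sum (X_i - X bar)(Y_i - Y bar).
sxy_centered = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxy_x_form = sum(xi * (yi - y_bar) for xi, yi in zip(x, y))
sxy_y_form = sum(yi * (xi - x_bar) for xi, yi in zip(x, y))

# Two equivalent forms of sum (X_i - X bar)^2.
sxx_centered = sum((xi - x_bar) ** 2 for xi in x)
sxx_x_form = sum(xi * (xi - x_bar) for xi in x)

print(sxy_centered, sxy_x_form, sxy_y_form)
print(sxx_centered, sxx_x_form)
```

The forms agree because the extra term in each case is a constant (X bar or Y bar) times a sum of deviations from the mean, and that sum is zero.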
Finally, someone who made it simple to understand! Thank you!
Right! I went through like a million videos trying to understand this one segment and this was the first to do it.
True.
please, how is it possible to consider Beta variable (when taking derivatives) and then consider Beta constant (to take it out of the sum) ???
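One way to look at this question, sketched under assumed data: the observed X and Y values are fixed numbers, and the sum of squared residuals is a function of the candidate intercept and slope. The candidates vary from one evaluation to the next (which is why we can differentiate with respect to them), but within any single sum they carry no subscript i, so they factor out like any other constant. The data and the helper `sse` below are hypothetical illustrations, not taken from the video.

```python
# Hypothetical data; in practice x and y are the observed sample.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 7.0, 10.0]

def sse(b0, b1):
    """Sum of squared residuals for a candidate intercept b0 and slope b1."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

# The data stay fixed; the SSE changes only through the candidate (b0, b1).
candidates = [(0.0, 1.0), (1.0, 1.0), (0.5, 1.5)]
values = {c: sse(*c) for c in candidates}
for c, v in values.items():
    print(c, v)

# Inside one sum, b1 does not change as i changes, so it factors out.
b1 = 1.5
lhs = sum(b1 * xi for xi in x)
rhs = b1 * sum(x)
print(lhs, rhs)
```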
Best video. I was looking for something like this.
This guy literally skipped vital parts of the proof xD
Question: 6:24 Why and how beta zero hat is multiplied with n? Does n mean sample size? What's the reasoning behind n adjoining with beta zero hat?
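On the question above: n is the sample size, and the n appears because a constant added once per observation is added n times in total. A small sketch with assumed data and arbitrary candidate values `b0` and `b1` (all hypothetical) shows the sum of (Y_i - b0 - b1 X_i) splitting into sum Y_i - n b0 - b1 sum X_i.

```python
# Hypothetical data and arbitrary candidate values, for illustration only.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 7.0, 10.0]
n = len(x)
b0, b1 = 0.5, 1.5

# Summing the constant b0 once per observation gives n * b0.
sum_of_constant = sum(b0 for _ in range(n))

# The full sum, computed term by term ...
term_by_term = sum(yi - b0 - b1 * xi for xi, yi in zip(x, y))
# ... equals the split-up version, where b0 contributes n * b0.
split_up = sum(y) - n * b0 - b1 * sum(x)

print(sum_of_constant, n * b0)
print(term_by_term, split_up)
```

Setting the split-up expression to zero and dividing by n is what turns sum Y_i / n and sum X_i / n into Y bar and X bar.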
Thank you so much.
saved my life
My god, you explained this so easily. It took me hours trying to understand this before watching this video but still couldn’t understand it properly. After watching this video, it's crystal-clear now. ❤️
me, too. I spent a whole morning figuring this. He is a savior
This is an underrated video.
@Isaiah Matias is it?
@@divitisaitezaa2599🎉😂 🎉🎉😮we🎉 are y😂 🎉😢🎉
I have gone through tons of materials on this topic and they either skip the derivation process or go direct into some esoteric matrix arithmetics. This video explains everything I need to know. Thanks.
I can't thank you enough for this brilliant explanation!
holy hell I wish you were my econometrics professor. mine is useless
I don't usually comment on teaching videos. But this really deserves thanks for how clearly and simply you explained everything. The lecture I had at the university left much to be desired
Why can we set the partial derivative to Zero?
Absolutely beautiful derivation!
Crystal clear!
Thanks very much.
Can I apply this in Mumbai University exam?
Thank you so much! This explanation is literally perfect, helped me so much!
Thanks for the kind words! I'm glad to be of help!
My Physical Chemistry teacher spent ~1.5 hrs showing this derivation and I got completely lost. Watching your video, it's so clear now. Thank you for your phenomenal work.
why was your chemistry teacher doing this lol
unbelievably perfect video, one of the best videos I have watched in the statistics field, so rare to find high-quality in this field idk why
Did we treat beta-naught-hat and beta hat as variables for the partial derivatives in this problem? Usually they are constants in a straight line, right? Why did we take them as variables? If anyone knows the answer, please reply.
Hi, do you have a video on deriving coefficients in multiple regression?
That is a fun derivation using linear algebra and calculus. First step is the same here which is taking the first derivative and setting it equal to zero. The book "The Elements of Statistical Learning" has a good proof. I'd say one needs a calc 1 and linear algebra background first though.
phenomenal video. Thank you for taking the time to explain each step of the derivations such as the sum rule for derivation. Thank you for helping me learn.
You are very welcome!
The best part of this video is finally figuring out where that "n" came from in the equation for beta-naught-hat. Thank you so very much for making this available.
I'm glad to be of help!
Wait, at 6:45, how do you divide the summations by n and get (y) itself? y-sub-i isn't a constant, so how does the division even work?
OOHHH NOOOO, ITS THE MEAN. NOW I GOT IT. JUST GONNA LEAVE THIS HERE JUST TO SHOW HOW STUPID I CAN BE
I'm new to this formula and the big data field, what mathematical knowledge should I learn prior to watch this video? Thank you
@@noopyx3414 Oh man, you're lucky. I just logged in to YouTube. Prior to this formula, I'd really suggest you check out Brandon Fultz's Statistics 101: Linear Regression series. There he explains what this formula, and other stuff regarding the topic, is all about.
@@kaanaltug455 Thank you very much!
finally, I've understood this bloody thing. Thank u sooooo much m8.
I'm glad you found it helpful!
Although I didn't understand the language, I could follow it just by looking at the math 😑
As long as you understood 🙂
I never thought that I could understand simple linear regression using this approach. Thank you
Glad you're back!
Thanks! Glad to be back! Just recording and editing as I type this!
This is incredible, thank you so much! :)
You are very welcome!
like I said earlier, this is very good but I am fuzzy on the nxB0 part. Can someone explain that a little more.
At 10:52 timeline, how can we switch the role of X sub i and Y sub i? Could you help explain how this happens?
In the first step, we choose to expand (Xi - Xbar) but we could have chosen to expand (Yi - Ybar) and it would follow a similar route.
The result represents a minimum because the function we are minimizing is convex and opens upward, so any critical point must be a minimum.
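The convexity claim can be sketched numerically. Differentiating the sum of squared residuals twice gives second partial derivatives 2n (with respect to the intercept), 2 sum X_i^2 (with respect to the slope), and 2 sum X_i (the cross term), none of which depend on the parameter values. The example below, with assumed data and hypothetical names, checks that the determinant of this matrix of second derivatives is positive and that moving away from the least squares solution increases the sum of squared residuals.

```python
# Hypothetical data, for illustration only.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 7.0, 10.0]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

def sse(b0, b1):
    """Sum of squared residuals for candidate intercept b0 and slope b1."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

# Least squares estimates from the formulas derived in the video.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1_hat = sxy / sxx
b0_hat = y_bar - b1_hat * x_bar

# Second partial derivatives of the SSE (constant in b0 and b1).
d2_b0 = 2 * n
d2_b1 = 2 * sum(xi ** 2 for xi in x)
d2_cross = 2 * sum(x)
hessian_det = d2_b0 * d2_b1 - d2_cross ** 2  # equals 4 * n * sxx

# Compare the SSE at the solution with the SSE at nearby points.
best = sse(b0_hat, b1_hat)
steps = (-0.1, 0.0, 0.1)
nearby = [sse(b0_hat + d0, b1_hat + d1)
          for d0 in steps for d1 in steps if (d0, d1) != (0.0, 0.0)]

print(hessian_det, 4 * n * sxx)
print(best, min(nearby))
```

The determinant is positive whenever the X values are not all equal, which is the same condition needed for the slope estimator to exist.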
very well done. i understand, and that's miraculous
Thanks so much, this was so easy to follow and comprehend!
Excellent....haha... Excellent....
Evil laugh.
I finally understood this.
your video is great but you have told that you have discussed elsewhere why we should use square of the deviation not the absolute value but didn't mention where you have discussed it or didn't give the link of the video. this is bad.
Really, really good explanation!! Thank you!!
Amazing and super helpful video! Extremely simple and easy to follow! But please, quick question: Why did you switch the Xi and Xbar at 7:51? This drastically changes the ending solution.
When he removes the inner parentheses, the Xi term becomes negative and the Xbar term becomes positive. So when you multiply through by the negative Beta, the signs of both terms reverse.
2:24, where did you discuss why it makes sense to minimize the sum of squared residuals ?
It makes the criterion more sensitive to bigger errors, and it's differentiable at all points. The absolute value function is not differentiable at the point where it pivots up.
@@aakarshan01 but why not to power 1.5? why not to power 4? why is it exactly power 2?
@@SuperYtc1 You can, but there is no need to. Differentiability is already achieved with the square. Why calculate a higher power that could lead to problems? The 4th power of a small decimal number is more likely than its square to fall below the smallest number a float can represent. But in theory, you can.
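The two points raised in this thread can be sketched in a few lines. The step size `h` and the tiny value `r` are arbitrary choices for illustration. The first part compares one-sided slopes at zero for the absolute value and the square; the second shows a fourth power underflowing to zero in double precision where the square does not, which only matters for extremely small residuals.

```python
# One-sided difference quotients at 0 for |r| and r^2.
h = 1e-6
abs_right = (abs(0.0 + h) - abs(0.0)) / h
abs_left = (abs(0.0 - h) - abs(0.0)) / (-h)
sq_right = ((0.0 + h) ** 2 - 0.0 ** 2) / h
sq_left = ((0.0 - h) ** 2 - 0.0 ** 2) / (-h)

# |r| has slopes +1 and -1 on either side of 0 (a kink),
# while r^2 has slopes near 0 on both sides (smooth).
print(abs_right, abs_left)
print(sq_right, sq_left)

# Underflow: a very small residual squared is still representable,
# but its fourth power is smaller than any positive double.
r = 1e-100
square = r * r
fourth = r * r * r * r
print(square, fourth)
```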
Amazing. thank you!
finally got my doubt resolved.😊
Thank you very much sir !
You are very welcome!
Congratulations on being the best!
one video on youtube that actually explains something properly
thanks a lot for simplifying the derivation
In which video does he discuss why the we use squared residuals?
the simplification is the most confusing
but i got it
Amazing video! Slight bumps where my own knowledge was patchy but you provided enough steps for me to work those gaps out.
Beautiful video, good explanation
Nice trick! Adding an intelligent zero huh?
Thanks for this video!
Do you have the version for GLM?
This was really helpful thanks!
you have no idea how you saved my life, I was struggling so hard to find out why xi(xi-xbar)=(xi-xbar)^2 and etc. you are the first one I found explained that.
thank you for actually explaining it, most of videos are just like "hi, if you want to solve this, plug in this awesome formula and thats it, thank you for watching :)"
It seems like the 7:48 second part is wrong, the brackets are broken and the punctuation is changed incorrectly
No, what's in the video is correct.
thank you so much, this video has cleared all my confusions cuz the book im reading just says 'by doing some simple calculus'
Thanks a lot, it really helped
Thank you so much for such a clear explanation! It helps me a lot in preparing for my upcoming final exam.
Finally you cleared my doubts sir😭😭❤️🩹
And my professor couldn't
Easiest subscribe of my life.
Thanks a lot sir I really got this what I need indeed. 🙏🙏🙏🙏🙏🙏🙏🙏There is no words for appreciation of your efforts
Thank you so much, I am really enjoying and understanding what you're teaching
at 10:43, can you please tell me why we can easily swap the roles of x and y? Is it based on any properties or formulas?
The initial term is sum (X_i - X bar)(Y_i - Y bar). While in the video I split up the (Y_i - Y bar) term, leaving (X_i - X bar) intact, I could have just as easily split up the (X_i - X bar) term instead, and using the same steps as I did in the video, end up with sum (Y_i - Y bar)X_i.
Why do we take the sum of squared residuals, and not just the residuals, and take their partial derivatives with respect to alpha and beta?
Ouch😂😂😂 God is faithful 🙏🙏
Thank u soooo much! For explaining this. You made my day
great video, my summary just gave the formula with the text: 'just remember this' hate that
THANKS GOD FINALLY SOMEONE TRIED TO DERIVE THE FORMULA,
INSANE THAT NEARLY ALL OTHER RESOURCES OMIT THIS SHIT
least square mthod vs
10:11 I don't understand. If the term sum of (Yi - Y_bar) is 0, then why is the first term not also 0? It is the same, just multiplied by Xi instead of X_bar. Also, I did not understand the multiply through thing. Please make this clearer in a new video.
If I made a new video to illustrate each point somebody doesn't understand, I'd have 8000 videos on just the binomial distribution. Almost all of them would be painful for most to watch, explaining how to raise a number to a power, what the binomial coefficient is, what factorials are, then a detailed discussion on why a calculator gives an error if you try to get 72!, etc.
How many hours do you think went into this? Just the 12 minutes plus upload time?
This video is very clear, but it requires some background knowledge on summation basics that you do not seem to have. I could spend half an hour discussing in painstaking detail why we can take X bar out front, but I assume that people are comfortable with the notion that multiplicative constants can come out front of the summation, and, as I say in the video "X bar is a constant with respect to the summation." Why? It's not changing as i changes (I don't see a subscript i on it). Multiplicative values that change as i changes, such as X_i, cannot come out front of course. That is a fundamentally different situation; one thing is changing as i changes, the other is not. Those are basics of working with summations.
Since X_i cannot come out front of the summation, there is no reason to believe that sum (X_i - X bar)X_i equals 0, even though sum (X_i - X bar) does. This is straightforward to show for oneself with a very simple example. Suppose n = 2 and X_1 = 4, X_2 = 6. Then X bar = 5, sum (X_i - X bar) = -1 + 1 = 0, and sum (X_i - X bar)X_i = (-1)*4 + 1*6 = 2. We don't typically have to show an example like that though, as there is no reason to believe it's 0 in the first place. And since it's equivalent to sum (X_i - X bar)^2, it'll only be 0 if all the X values are equal.
(a+b)^2 = (a+b)(a+b) = (a+b)a + (a+b)b. This isn't an esoteric notion. At some point, the learner has to fight for themselves to understand the things they don't grasp at first.
A derivation like this only makes sense if one is comfortable with the basics of working with summations. If I always explain every bit of background required, every video would be very long and just terrible.
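The small example from the reply above (n = 2, X_1 = 4, X_2 = 6) can be reproduced directly. This is only a sketch of that arithmetic; the variable names are hypothetical.

```python
# The example from the reply: n = 2, X_1 = 4, X_2 = 6.
x = [4.0, 6.0]
n = len(x)
x_bar = sum(x) / n

deviations = [xi - x_bar for xi in x]

# The deviations from the mean sum to zero.
sum_dev = sum(deviations)
# X bar is constant with respect to the summation, so it comes out front:
# sum (X_i - X bar) * X bar = X bar * sum (X_i - X bar) = 0.
sum_dev_times_xbar = sum(d * x_bar for d in deviations)
# X_i changes with i, so it cannot come out front, and the sum is not 0.
sum_dev_times_xi = sum(d * xi for d, xi in zip(deviations, x))
# It matches the sum of squared deviations instead.
sum_sq_dev = sum(d ** 2 for d in deviations)

print(x_bar, sum_dev, sum_dev_times_xbar, sum_dev_times_xi, sum_sq_dev)
```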
@@jbstatistics OK, maybe I can work it out with help from your last comment. I'll try. Thanks.
Wow many university lecturers can’t explain it this well!
Very helpful video to understand. Many thanks!
thank you 100^100 times
Where can I learn more about how summations work? You talked about so many rules of summations which I wasn't familiar with. Can you explain where I can find such rules?
They don't teach summations enough in school, in my opinion. The biggest thing to know is that summations are linear operators, meaning Sum(a*x + b*y) = a*Sum(x) + b*Sum(y), where a and b are constants and x and y are the quantities being summed.
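The linearity rule stated above is easy to try out. The sketch below uses made-up integer data and arbitrary constants `a` and `b` (all hypothetical), and also shows that the rule does not extend to a product of two quantities that both change with the index.

```python
# Hypothetical data and constants, for illustration only.
x = [1, 2, 4, 7]
y = [2, 3, 7, 10]
a, b = 3, -2

# Linearity: constants factor out and the sum splits across addition.
lhs = sum(a * xi + b * yi for xi, yi in zip(x, y))
rhs = a * sum(x) + b * sum(y)

# Not a general rule: a sum of products is not the product of the sums.
sum_of_products = sum(xi * yi for xi, yi in zip(x, y))
product_of_sums = sum(x) * sum(y)

print(lhs, rhs)
print(sum_of_products, product_of_sums)
```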
@@mattstats399 is there some book Or website where I can see examples of advanced summations?
@@Dupamine Not really. The best way is to do more problems that deal with summations. My background is statistics, so areas like mean, variance, and maximum likelihood are full of them.
@@mattstats399 I am studying some advanced statistics and I find complex summations, such as those in elementary symmetric functions, where you sum over different patterns of things and also multiply something. Or summations where j is not equal to k. And this is nothing. I've seen summations with so many things written beneath them. I don't know how to handle these summations.
I love jbstatistics.
LEGEND, HAVE TO SAY YOU ARE BETTER THAN A PROFESSOR
I *am* a professor!!!
thank you so much
Great tutorial! I could use some ideas of how to better teach the material.
god bless you brother
Thanks a lot!!!
Good video Thanks!
How do we do this if we have three or more unknown parameters?
Thank you very much. This video helped me a lot.
You sir are AMazing
you make it sooo easy
How can we find the intercept and slope value of B0 and B1
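For a given sample, the estimates come from plugging the data into the formulas derived in the video: the slope is sum (X_i - X bar)(Y_i - Y bar) divided by sum (X_i - X bar)^2, and the intercept is Y bar minus the slope times X bar. A minimal sketch follows; the function name `least_squares_fit` and the example data are hypothetical, and the data were chosen to lie exactly on the line Y = 1 + 2X so the result is easy to check by eye.

```python
def least_squares_fit(x, y):
    """Return (b0_hat, b1_hat) for simple linear regression of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx               # slope estimate
    b0_hat = y_bar - b1_hat * x_bar  # intercept estimate
    return b0_hat, b1_hat

# Example: points that lie exactly on Y = 1 + 2X.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
b0_hat, b1_hat = least_squares_fit(x, y)
print(b0_hat, b1_hat)
```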
You are awesome! I am not a native speaker and still struggling with the master program courses in the US, but your instruction is so helpful. I appreciate your great help
Thanks! I'm happy to be of help!
Thank you so much
10:29 why is the sum of the deviations is always 0 ?
When we split Sum(Yi - Ybar) into Sum(Yi) - Sum (Ybar), we get Sum (Yi) - n*Ybar. Then we know from descriptive statistics that Ybar = Sum(Yi)/n, therefore Sum(Yi) = n*Ybar, if we substitute Sum(Yi) in the Sum (Yi) - n*Ybar, we get n*Ybar - n*Ybar = 0
Thank you for the very clear explanation ❤
good explanation!
Thanks for the video. Just wondering why x and y can be considered constant when differentiating with respect to B0 or B1? Is it because of partial differentiation, or because X and Y are known numbers?
I think you are asking why B0 hat and B1 hat should be considered constants for the sample. They are clearly not going to change for that sample.
I don't understand the problem, Mrs. Nur, please answer it
Great job! Thank you sir!
Thank you very much! Very clear and interesting explanation!
Thank you for explaining in such details ❤️
thank u so much.
Awesome video sir! Thank you!
Excellent video
Thank you
You made it simpler than my lecturer does. Thank you!
Yeah iiiight thx G
Best explanation, thank you so much