Just here to tell you that this video is going to EXPLODE! Maybe not immediately, but eventually almost surely :). Keep at it, awesome visuals. Small note: there is a bit of an echo, which can be fixed by putting some padding/blankets on the walls/ceiling.
I hope you’re right! I’m trying not to get ahead of myself here, but, especially with your words, I’m hopeful.
Also, thanks for the echo advice. I just ordered some surrounding curtains which should help. I’ve already shot several videos without them, but eventually the echo will be gone.
Thanks again - it’s always nice to hear from you.
This video is so underrated. The way you explained it alongside the provided demonstration makes it easily a top 5 among all tutorials on linear regression.
Thank you - it's one of my less appreciated ones, but you're changing that.
I'm an economics student, and your content will help me a lot. Thank you!
AFAIK the LSE was developed by Gauss to estimate the parameters of orbits of comets, and, if the errors of observations have the normal distribution (which is sometimes named after Gauss for a good reason: IIRC, he researched the distribution to solve this very problem), the LSE is actually the maximum likelihood estimate as well, and that was the original reasoning behind this method, not the computational feasibility per se. A damn good explanation tho, thank you!
Yea, good point - I didn't research the truly original use of OLS, but I've heard this story. I was thinking of it in a familiar, modern context.
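For anyone who wants the Gaussian connection in this thread spelled out, here is a short sketch of the derivation. It assumes the standard setup (not stated in the comments) of independent errors with a common variance; the symbols x_i, β, σ, and n are notation chosen for illustration.

```latex
% Model: linear in beta, with i.i.d. Gaussian errors (an assumption of this sketch)
y_i = x_i^\top \beta + \varepsilon_i, \qquad \varepsilon_i \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2)

% Log-likelihood of beta given n observations
\log L(\beta) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n} \left(y_i - x_i^\top \beta\right)^2

% The first term does not depend on beta, so maximizing the likelihood
% is the same as minimizing the sum of squared errors
\hat{\beta}_{\text{MLE}} = \arg\max_\beta \log L(\beta) = \arg\min_\beta \sum_{i=1}^{n} \left(y_i - x_i^\top \beta\right)^2
```

The only β-dependent term is the sum of squared residuals, which is why the least squares estimate and the maximum likelihood estimate coincide under this noise assumption.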
I was not sure how to comment... as I can hardly find the words to express how good this explanation is. Thanks a lot!
I think you expressed it well - glad you enjoyed! More coming
Going back to review your earlier stuff. It's good to see it was quality also from early on.
Bro, the visualizations in the last half of the video were fantastic. Amazing work man. Keep it up!
Thank you my man. And love the user name ;)
I hope your channel blows up. This is a clear, concise discussion that keeps things on point without pulling too many punches. Great presentation.
Can I request random effects next? ;-)
Thank you! I sure hope so too :)
And hell yea I’m going to do random effects. May take a little while to get to, but those hierarchical models are def on the list.
Honestly, I feel a little privileged having found a good-content channel this early. I hope to see it grow too 😊
@define SIGINT I echo this. I have yet to find a good resource on mixed models...just "pretty good" ones.
Perfect explanation and visualization! Thank you a lot for making this.
Man, I love this channel.
As already said by some of you guys, visuals were really great this time! Never seen such least squared errors, nor these basis functions :-) Appreciate it a lot!
So even though linear regression is well known, it still can be fun to learn new things about it ;-)
Really appreciate it. My goal was to introduce the idea of a basis expansion for the Bernstein basis video, and that's a little less known.
That was some high quality explanation you managed to put in there! The math seemed a little fast but hey, we can always rewind and watch. Cheers!
Actually, it is not just for ease of solution that we minimize the squared error. It corresponds to the often reasonable assumption that the noise is Gaussian. And minimizing absolute differences corresponds to Laplace-distributed noise.
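A small numerical sketch of this point, using the simplest possible model (a single constant fitted to the data). Under that assumption, minimizing squared error should land on the mean (the Gaussian maximum likelihood estimate of location) and minimizing absolute error should land on the median (the Laplace one). The data values and the helper name `fit_constant` are made up for illustration.

```python
import numpy as np

def fit_constant(y, loss):
    """Brute-force the constant c that minimizes the total loss over a fine grid."""
    grid = np.linspace(y.min(), y.max(), 20001)
    residuals = y[None, :] - grid[:, None]   # shape: (grid points, observations)
    totals = loss(residuals).sum(axis=1)     # total loss for each candidate c
    return grid[np.argmin(totals)]

# Hypothetical data with one large outlier
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

c_sq = fit_constant(y, np.square)  # squared error: pulled toward the outlier
c_abs = fit_constant(y, np.abs)    # absolute error: stays with the bulk of the data

print("squared-error fit:", c_sq, "| mean:", y.mean())
print("absolute-error fit:", c_abs, "| median:", np.median(y))
```

The heavier tails of the Laplace distribution are what make the absolute-error fit less sensitive to the outlier.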
Hi! Thank you a lot for your content! It’s a pleasure to watch, even for a non-math guy. The issue during model fitting is indeed understanding the data you want to model. It seems easy in 2D or 3D, but real data are a completely different story… I hope to improve my regression skills!
ha yea, everything is easier in these toy model cases..
"All my videos are about math. None of them are cool". The only wrong sentence in this whole video 😁
In case you're wondering if anyone laughed at the 0:19 joke, I definitely laughed. Dunno if it qualified as a rare honest use of the term "LOL", but it was a clearly audible guffaw, so, a CAG I guess.
Haha good thing I didn’t cut it! I almost did
Same here; just sitting quietly and couldn't help laughing. Good thing I wasn't in a library; I might've been shush'd😁
This was an awesome video man 😳👌 keep it up
Will do!
your vids are cool, thanks for the effort and I love watching these
This is a great channel.
Thank you Yunus. And it's not dead :)
Great video! Thank you.
Linear regression assumes that the errors are Gaussian distributed; that's why we minimize the mean squared error.
Do you use the same package as 3B1B? This channel has that feel. Also, you should consider starting a Patreon. Your content is quite good, and I could see it garnering a considerable following. There is such a huge stats/ML community out there that is lacking a 3B1B level of content contribution...this is a huge opportunity. Thanks for publishing!
Thank you! I do not use Manim (but eventually I should explore it). Instead, I use a personal library that heavily uses the Python plotting library Altair.
And I hope you’re right. It would be nice to grow, but I’m not necessarily optimizing for that directly. I think there is a trade off between the size of audience and level of technical detail, and I’d like to keep a high level of that detail. But we’ll see!
And I actually do have a Patreon (patreon.com/MutualInformation), but I haven’t advertised it, just because the channel is so small and I haven’t figured out how I’d offer extra content to the patrons. But it’s certainly in the works.
Thanks again for your advice and appreciation. Really helps out a lot, especially when I’m just starting.
@Mutual_Information I'm getting a 404 error from the link and no luck searching directly.
Oh oops, a parenthesis got caught in there. Should be good now. Thanks!
Thanks for asking! The question about Manim was exactly my question too.
Ah nice to hear from you Letitia!
Amazing video!
So many new insights here. This explanation connected so many dots.
Is this leading to Gaussian Processes?
That’s crazy you mention that. This video wasn’t intended for that, but GPs are my next vid. It’s coming out in about 2 weeks.
Wow! I never realized linear regression is only about linearity in beta! With your expertise also in exponential families, I'd love to see you make a video about GLM! I still don't understand why GLM needs its errors to be distributed via an exponential family - maybe you'll make it clear!
Ah yes! GLM is right there on my list. I have some RL stuff to finish up but I want to get to that. That and causal inference - lots of cool shit in the pipeline.
@Mutual_Information excited to see these topics! Would love to hear your explanations of the connections between Bayesian networks, counterfactuals, and linear regression.
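To make the "linear in beta" point from this thread concrete, here is a minimal sketch, assuming a quadratic basis (1, x, x²) and noise-free data. The fitted curve is nonlinear in x, yet the coefficients come out of an ordinary least squares solve. The names `design_matrix` and `true_beta` are hypothetical and chosen for illustration.

```python
import numpy as np

def design_matrix(x):
    """Basis expansion: each column is one basis function evaluated at x."""
    return np.column_stack([np.ones_like(x), x, x**2])

x = np.linspace(-2.0, 2.0, 50)
true_beta = np.array([1.0, 2.0, -0.5])   # hypothetical coefficients
y = design_matrix(x) @ true_beta         # y is a curve in x, but linear in beta

# Ordinary least squares on the expanded features
beta_hat, *_ = np.linalg.lstsq(design_matrix(x), y, rcond=None)
print(beta_hat)
```

Swapping in a different set of basis functions only changes `design_matrix`; the solve itself stays the same.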
What's the relationship between using a basis function and doing what I've been taught is called a linearization? For example, for an exponential, you apply the natural log to your data then fit a linear model. Also, does using either method mess with the assumptions of OLS? I've read that doing a linearization is worse than iteratively fitting a non-linear model, because you can't assume a normal distribution of errors around your data after the transformation.
Hm, I'm not sure what the term "linearization" here means. It could mean what I showed, where you transform x into f(x) and then regress y on f(x). In that case, the major assumptions aren't violated, since most lin-reg assumptions have to do with the distribution of y given x, and they don't say anything about how x is distributed (though you need whatever you regress on to be linearly independent - I guess that's an assumption).
It sounds like what you might be referring to is a transformation of y to, say, g(y) and then regressing g(y) on x (or f(x)). That can certainly be a bad idea sometimes. Lin-reg wants p(y|x) to be approximately normal. If p(y|x) is approximately normal, then p(g(y)|x) may not be (or vice versa)! In general, in a case where you want to transform y, I would say either 1) use generalized linear models (they are designed exactly for this and, as you suggest, they use an iterative procedure), or 2) be sure that p(g(y)|x) looks normal.
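A rough sketch of why the transformation of y matters, using simulated data only. It assumes a hypothetical exponential curve y = a·exp(b·x) with two noise types: multiplicative log-normal noise, where log(y) has clean normal errors, and additive normal noise, where the errors on the log scale are no longer evenly spread. All parameter values and names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 200)
a, b = 2.0, 1.5                           # hypothetical true parameters
X = np.column_stack([np.ones_like(x), x])

def fit_log_linear(y):
    """OLS of log(y) on x; returns (coefficients, residuals on the log scale)."""
    coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    return coef, np.log(y) - X @ coef

# Case 1: multiplicative noise, so log(y) = log(a) + b*x + normal error
y_mult = a * np.exp(b * x) * np.exp(rng.normal(0.0, 0.1, size=x.size))
coef_mult, resid_mult = fit_log_linear(y_mult)

# Case 2: additive noise, so the error on the log scale depends on the size of y
y_add = a * np.exp(b * x) + rng.normal(0.0, 0.3, size=x.size)
coef_add, resid_add = fit_log_linear(y_add)

# Compare the residual spread for small x (small y) against large x (large y)
half = x.size // 2
spread_low = resid_add[:half].std()
spread_high = resid_add[half:].std()

print("multiplicative noise: intercept, slope =", coef_mult, "(targets:", np.log(a), b, ")")
print("additive noise: log-residual spread, low x vs high x =", spread_low, spread_high)
```

In the additive case the log-scale residuals fan out where y is small, which is the kind of violation option 2 above asks you to check for before trusting the transformed fit.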
haha. came for the math, stayed for the jokes.