Gradient Descent, Step-by-Step
- Published: 25 Jun 2024
- Gradient Descent is the workhorse behind most of Machine Learning. When you fit a machine learning method to a training dataset, you're probably using Gradient Descent. It can optimize parameters in a wide variety of settings. Since it's so fundamental to Machine Learning, I decided to make a "step-by-step" video that shows you exactly how it works.
NOTE: This video assumes you are already familiar with Least Squares and Linear Regression. If not, here's the link to the Quest: • The Main Ideas of Fitt...
For a complete index of all the StatQuest videos, check out:
statquest.org/video-index/
Sources:
There are a ton of websites that describe the math behind Gradient Descent. One of my favorites is the Wikipedia article: en.wikipedia.org/wiki/Gradien...
If you'd like to support StatQuest, please consider...
Buying The StatQuest Illustrated Guide to Machine Learning!!!
PDF - statquest.gumroad.com/l/wvtmc
Paperback - www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - www.amazon.com/dp/B09ZG79HXC
Patreon: / statquest
...or...
RUclips Membership: / @statquest
...a cool StatQuest t-shirt or sweatshirt:
shop.spreadshirt.com/statques...
...buying one or two of my songs (or go large and get a whole album!)
joshuastarmer.bandcamp.com/
...or just donating to StatQuest!
www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/ joshuastarmer
0:00 Awesome song and introduction
1:25 Main ideas behind Gradient Descent
5:38 Gradient Descent optimization of a single variable, part 1
9:08 An important note about why we use Gradient Descent
9:40 Gradient Descent optimization of a single variable, part 2
14:48 Review of concepts covered so far
15:48 Gradient Descent optimization of two (or more) variables
21:55 A note about Loss Functions
22:13 Gradient Descent algorithm
23:06 Stochastic Gradient Descent
#statquest #gradient #descent #ML
NOTE 0: If you want to learn more about The Chain Rule, see: ruclips.net/video/wl1myxrtQHQ/видео.html
NOTE 1: The StatQuest Gradient Descent Study Guide is available! statquest.org/studyguides/
NOTE 2: A lot of people ask why we are using Gradient Descent to estimate the parameters in this video when we could just use least squares. We use least squares to produce a "gold standard estimate". This is the best possible estimate. We then attempt to derive the same estimate using Gradient Descent. This shows 1) how gradient descent works and 2) that the estimate is pretty good compared to the "gold standard".
NOTE 3: A lot of people ask how I found the slope value, 0.64. In the example in this video, we can compare the estimates from Gradient Descent to those that come from another method called "least squares". For specific problems, we can plug the data into the least squares formula and the output is the optimal parameters. To learn more about the specific least squares formula see: en.wikipedia.org/wiki/Least_squares
If you're wondering why, when we have least squares, would we want to use gradient descent... the answer is that least squares only works in specific situations and gradient descent can work in many more.
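To make NOTE 2 and NOTE 3 concrete, here is a minimal sketch (not the video's code; the three data points are illustrative, chosen so the least squares slope lands near the 0.64 mentioned above):

```python
# Fit y = intercept + slope * x two ways and compare the answers.
import numpy as np

x = np.array([0.5, 2.3, 2.9])  # illustrative data
y = np.array([1.4, 1.9, 3.2])

# "Gold standard": the closed-form least squares solution.
slope_ls, intercept_ls = np.polyfit(x, y, 1)  # highest power first

# Gradient Descent on the sum of squared residuals.
intercept, slope = 0.0, 1.0  # initial guesses
learning_rate = 0.01
for _ in range(10_000):
    residuals = y - (intercept + slope * x)
    # Derivatives of sum(residuals**2) with respect to each parameter:
    d_intercept = -2 * residuals.sum()
    d_slope = -2 * (x * residuals).sum()
    # Step size = derivative * learning rate; new value = old value - step.
    intercept -= learning_rate * d_intercept
    slope -= learning_rate * d_slope

print(slope_ls, slope)          # the two slope estimates agree
print(intercept_ls, intercept)  # so do the intercepts
```

With these numbers the least squares slope comes out to about 0.64 and gradient descent converges to the same value; least squares gets there in one formula, but the loop generalizes to models where no closed form exists.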
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
StatQuest with Josh Starmer you are the best
@@emperorcyber509 Thanks! :)
it was driving me crazy, but thanks! it was an error
Thank you so much! I have a question. It should be SSE instead of SSR, right? SSR is (y predicted - y bar)^2, whereas SSE is (y pred - yi)^2.
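For anyone weighing the question above, a tiny numeric check of the two definitions (toy numbers invented for illustration; the video itself uses "SSR" as shorthand for the sum of squared residuals, which matches the formula the commenter calls SSE):

```python
# Observed values, predictions from some fitted line, and the observed mean.
y      = [1.4, 1.9, 3.2]   # observed y_i
y_pred = [1.3, 2.4, 2.8]   # predicted y values (toy numbers)
y_bar  = sum(y) / len(y)

# SSE as defined above: squared gaps between observations and predictions.
sse = sum((yi - yp) ** 2 for yi, yp in zip(y, y_pred))

# SSR as defined above: squared gaps between predictions and the mean.
ssr = sum((yp - y_bar) ** 2 for yp in y_pred)

print(sse, ssr)  # different quantities, so the label matters
```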
Thanks. Was looking for this. This should be pinned.
I cannot believe how many teachers get paid in schools and in universities to make students feel stupid just because they cannot explain a very important topic in a similar way like you! So much respect!
Thank you! :)
M AK that’s not the problem, man. The problem is we have to pay them thousands of dollars when, in reality, many people on RUclips deserve that money instead. My education wouldn’t be complete without RUclips. Thanks Josh for your videos :)
Everyone upvote this comment please. And 100% respect to Josh !
@@tilkesh I don't understand why it needs decades of hard work to explain topics in a decent way. I had teaching assistants who were totally fresh to a topic and yet they explained it way better than my professor.
and you end up studying for years without learning anything!!
" If you can't explain it simply, you don't understand it well enough" - Albert Einstein
Can't get much simpler than this explanation.
Thank you so much awesome work keep it going 👍👍👍👍👍👍👍👍👍👍👍👍
Thank you very much! :)
The actual quote is: "What is well conceived is clearly stated, and the words to say it come easily." by Nicolas Boileau. Is Einstein guilty of plagiarism? ;)
i think that's from Richard Feynman. anyways I agree
there are real institutes that charge $4,000-5,000 to teach you this, and then comes StatQuest, which saves you tons of money and provides a better explanation......
SUPER BAMMMMMMMM!!!!!!!!!!!!!!!!!!!!!!!!!!!
BAM! :)
@@statquest it's not just BAM, The BAM is really big. Bigger than the bam that killed the dinosaurs. Cause u r best bam bam bam bam
I love how this guy cares to explain every single detail, not assuming any prior knowledge whatsoever. I was genuinely shocked when he started calculating the derivatives in the video. Most resources will skip over such minute details and calculations, only focusing on the concepts. These videos are the most beginner-friendly resources on the web for ML, simply amazing.
And also the fact that he doesn't scare his viewers with conventional mathematical notation. He really just gets to the core of the topics.
Thank you so much! I'm really glad you like my teaching style. :)
@@statquest You are HIM.
@@statquest I really like your teaching style. I will adopt it in my class. Thank you very much.
This man could teach ML to a 5-year-old 😅
Thanks! :)
👶🏻 🍼True.
im 5 and i can confirm this.
@@krshah2008 No you are too young to confirm. 🐶
exactly 😂
Clearest explanation in this universe, as always! Thanks a lot!
Thanks so much! Glad you like it. :)
Give this man the highest medal in teaching. Thanks a lot for all the effort you made to explain this.
Wow, thanks!
Your videos are so well put together. Thanks for all the time you put into preparing them!
HOLY SMOKES!!! Thank you so much for supporting StatQuest! BAM! :)
Whenever I feel I am not able to understand a concept after reading any book or literature available on the internet, I open StatQuest. This channel should be awarded as the best tutorial series ever on Machine Learning concepts. You are unbelievable, Josh Starmer!!! Thank you
Thank you so much!!!! :)
So let me see, you taught this like teaching a kid by repeating the same thing over and over again and covered such detailed calculation in 20 minutes. It is just amazing how you make short videos and even revise prior concepts at the beginning and still manage to keep everything short. The songs, the BAMS! everything unique about this channel. Amazing learning experience. THANK YOU!
Thank you very much!!! :)
Oh man, you are great. How clearly you explained everything. Thanks
“it sounds fancy, but it’s really no big deal” Thank you for making me feel the same way about those fancy-named methodologies after watching the videos!! I do feel more confident now in learning new stuff! If I don’t get it, it’s not because of me, it’s simply because the book/paper/course note is not as good as StatQuest :)
Awesome! :)
To be honest, I have never seen a teacher like him; he teaches us so that after the sessions we don't have any doubts left to even ask. Kudos to him, and thank you StatQuest.
Thank you!
You did what Andrew Ng failed to do:
Explaining THE GRADIENT DESCENT algo to me.
Hooray! I'm glad the video was helpful! :)
Completely agree. In fact, I just look at Andrew's course for the topics that need to be learnt in order, and for the detailed explanation 'BAM' does a fantastic job. Thanks a lot Josh.
@@DixitGokhaleEngineer what is BAM please? I'm taking Andrew Ng's course and I want to learn from as many sources a possible!
@@karisc.anoruo2212 8:51
Exactly, that's why I was searching for something else to understand it, since Andrew Ng failed.
Of all the explanations I've watched, this is, by far, the best explanation of gradient descent. Thank you for existing !
Wow, thank you!
This is the best video on Gradient Descent on the internet. I wish I could have had a teacher just like you in my college.
Thank you!
StatQuest. You're Gold, my friend. Pure Gold. 1000 Thanks and bravo. The way you explained it and the graphics was brilliant. Super simple super easy, very educational! Bravo, Bravo!
Hooray!!! I'm so happy to hear that you like this video. :)
I've never come across a clearer explanation for gradient descent,this is so cooool! Love this channel! Thank you so much for making us fall in love with ML!
Awesome, thank you!
Brilliant! 2 years into ML and I still find this the best explanation of GD.
bam!
I wish I saw this video before I attended a 3 hour lecture on GD. Professors have no idea how to explain the basics. Thank you so much, now it all makes sense.
Glad it was helpful!
One of the absolute best educational videos I have ever watched. Based on your explanations I have been able to:
- walk myself through coding a gradient descent algorithm,
- understand the concept of tuning the learning rate (alpha),
- handle multivariable problems,
- and more.
I'm studying machine learning right now and this has helped me so much.
I can't say how great this was. Thanks a million!
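For anyone who wants to retrace the commenter's first bullet, a minimal generic loop might look like this (a sketch, not the video's code; the toy function being minimized is my own example):

```python
def gradient_descent(grad, params, learning_rate=0.01,
                     max_steps=10_000, min_step_size=1e-9):
    """Repeat: step = derivative * learning_rate; new = old - step."""
    params = list(params)
    for _ in range(max_steps):
        steps = [learning_rate * g for g in grad(params)]
        params = [p - s for p, s in zip(params, steps)]
        if all(abs(s) < min_step_size for s in steps):
            break  # steps are tiny, so we are (essentially) at the minimum
    return params

# Toy example: minimize f(a, b) = (a - 3)**2 + (b + 1)**2,
# whose gradient is (2*(a - 3), 2*(b + 1)).
a, b = gradient_descent(lambda p: [2 * (p[0] - 3), 2 * (p[1] + 1)], [0.0, 0.0])
print(a, b)  # close to 3 and -1
```

The same loop handles any number of variables because the parameters and their derivatives travel together as lists.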
Wow, thanks!
would you be able to share what you did with the algo
You are a saviour! The whole university wasn't able to make this so simple... you should be given the highest honorary award in Data Science.
Awesome! Thank you so much! :)
I'm a circuit designer. I'm an idiot when it comes to anything computer science related. You made an idiot understand a daunting concept in AI training. Thank you.
Thanks!
I usually don't comment under youtube videos, but I had to this time. I'm currently working on a robotics project and I have to use the gradient descent method to solve inverse kinematics. I didn't understand the method as explained during my lectures, and I ended up watching this video. Seriously, I could even explain it to my prof and submit a request that this video be shown to students when this topic is taught.
You're amazing and your explanation is so easy to follow and understand. Thank you so much for this masterpiece.
Thank you very much and good luck with your robotics project.
I'm struggling so much in Machine Learning right now. After watching just two of your videos I am feeling so much more confident. Thanks so much!
Hooray! Good luck with your studies. :)
Josh, You are a beautiful creature and gift! glad you exist!
your thought process, your explanation, your positivity in teaching this to us, your creativity through art, and your music and albums. Be glad you exist to us all. I am also an electrical engineer, working primarily in software engineering, with a passion for music, numerous instruments, and painting. I am glad I came across you as a person, besides the great education!!
Wow, thank you!
This man is an icon!
He should be celebrated much more!
You have no idea of how well he teaches stuff!
Way to go my man
Thank you!
Thanks for explaining ML in such easy to understand way, what a chad you are sir
:)
Love humor you've added to what can usually be such a dry topic. The "squiggle" for logistic regression made me laugh. Keep up the great content!
Thank you so much! :)
The best explanation I´ve seen so far. Slow, clear and super easy to understand!
Thank you! :)
So helpful!!! Thank you so much! The imagery and step-by-step walk-through were just what I needed! It makes so much more sense now!
Hooray! I'm glad the video was helpful. :)
This has to be, without a doubt, the most understandable and clearly explained video on Gradient Descent on the internet! Thank you so much!
Wow, thanks!
What's really golden about this video is that you don't assume the viewer knows, e.g., the Chain Rule (even though that helps) and you show all the steps clearly. A lot of math "tutorials" skip crucial steps that might not be obvious to someone new to an area/idea, which makes the material inaccessible. For comparison, even if your videos lack the great animations of, e.g., 3Blue1Brown, I think some of your explanations are much easier to understand.
Wow! Thank you very much! I really appreciate it.
I remember what gradient descent does but your explanation on slopes and derivatives gave me a big 'AHA!' moment. Thank you! Instant Subscription! Looking forward to more videos.
can't appreciate enough how clearly the concept is explained. thank you
Thank you very much! :)
the most intuitive and detailed in every point gradient descent tutorial i've ever watched
Thank you!
Normally I don't like when tutorial vids try to be funny or cute, but this video was great. It was just enough to make the lesson not feel monotonous or droning without getting away from the lesson at hand. It was also surprisingly easy to understand. Thanks for the vid.
Awesome! Thank you. :)
Boop beep beep boop boop (Translation: Excellent explanation, 5 out of 5)
Awesome! :)
I've been trying to understand the gradient descent algorithm for months, but all I could find were videos filled with jargon, and now I finally understood it within 25 minutes. I love how you simplified the whole thing! This is literally the best explanation I've ever seen! I loved every second!
Thank you very much!
Beautifully explained. For the very first time, I understand this
Hi Josh, I am a new entrant to the field of Machine Learning and was really struggling to get a hold of the topics; I understood the concept of Gradient Descent by watching this one video. Thanks a lot Josh, you are the best!
I'm glad I could help! :)
I'm currently proceeding to my final year of a Bachelor's degree. This is the clearest, most interactive and funny session I've seen. Love your work; it's been a great help for my studies.
Love and respect all the way from Sri Lanka !!
Thank you very much! Good luck finishing your degree! :)
I will pray for your health. So much hard work put into this channel. God bless you
Thank you very much! :)
Thank you so much for making this video!!! I'm struggling in my stats class but after having it broken down step by step like this, I understand it so much better! You are a LIFE SAVER
Glad it helped!
I will summarize what's written in the comments section: WE LOVE YOU.
Hooray!!! Thank you very much! :)
Great explanation! Really glad this video was the first one to pop up when I searched for gradient descent, thanks:)
Awesome! :)
Josh Starmer, you are really good at making videos that explain the difficult concepts in a simple, understandable and engaging fashion. THANK YOU
Thank you! :)
Ahh thank you so much! I recently started working on (or more like preparing to work on) my MD thesis and there's a lot I need to learn by myself. Channels like yours really are lifesavers - you explain everything so well and it's actually entertaining to watch. You have a new subscriber. :)
Thank you and good luck with your thesis. :)
Thank you for explaining this in a simple way I can understand. I'm almost in tears right now 🥲
Glad it was helpful!
You explain complex things to me better than my professors. Thanks for saving my semester. ❤️
Happy to help!
Dude explained things clearly. Huge thanks! Helped me review what I’ve learned. I feel much better.
Thank you so much!
I swear this is the most clear and FANTASTIC explanation I've ever found
Wow, thanks!
Best of the best explanation in this world!!!!!!!!! Thanks a lot!
OMG this is pure gold !!! Thank you so much for the time you put into compiling the video and sharing the insight
Thank you! :)
Love your work Josh; a lot of planning and sequencing have gone into the video production. You deserve an RUclips 'Oscar'. Your channel is making an impact on my learning journey👍🙏
I can't thank you enough for making these informative and immensely useful videos. I think this is the first time I am using RUclips to learn something useful, and your videos make it so enjoyable
Glad you like them!
Finally, after lots of searching on the internet, I again reached the same place where things are explained easily and with heart.
Really, you are as good-hearted as your explanations.
Thank you sir, we owe you.
You are most welcome!
I have never seen anyone explain gradient descent as clearly as you!
Thank you!
Teacher: "what we knew about Gradient Descent?"
Me:"THE CHAIN RULE!!!!!!!!!!"
YES! :)
Bup bup bip bip bup bip bup
Love how clearly you explained, the recap in middle sure helps too, doesn't let us get lost !! More power to you 🙌🏻
Thank you! :)
You explain things so slowly and step-by-step, as if it were being explained to chimps. Exactly what I needed. Thank you a lot.
Glad it was helpful!
Best stats teacher on the planet!! Thank you for your videos Josh!
Great explanation starting with the basics and explaining it step by step! Brilliant!!!
Thanks! :)
Your videos are the best. No one could have explained it better. Love the sense of humour in between to spice things up. Thank you so much!
Thank you! :)
I have no words to say how amazing you are. Not only are your videos super easy to grasp, but also, wth, even after 4 years you're still replying to comments. Wow! Amazing
Bam! :)
I love it! You are such a "tezo" (brilliant), as we say here in Colombia.
Muchas gracias!!! :)
Hi, I'm new to this. Could you tell me where the 0.64 slope at the beginning of the video came from? Thanks.
@@simetric6551 I think that it probably came from the previous calculation upon the example data.
THE CHAIN RUUUUUULE !!! :-D ...... I Love your videos, best courses I ever seen
BEST VIDEO ON GRADIENT DESCENT!!!! PERIOD.
Thank you!
Whenever I get hung up on some hard-to-understand topic, I remember Josh is waiting for me at StatQuest; so I relax and enjoy learning something new with him, without any doubt.
Thank you! :)
There couldn't be any better explanation for Gradient Descent !
Thank you! :)
This channel is awesome...It should have millions of subscribers....I have become a fan Josh...big fan...
Thank you for your video. I've been using this for years already and didn't really understand the mechanics behind it; all I had was a general idea of what it's used for. With your video, I can now understand it and even explain it to other people without saying "it just magically works".
Bam! :)
The best explanation of this topic I've come upon so far, helps me a lot while learning the gradient descent. Thank you so much!
Thank you!
This video is just perfect for gaining an intuition of gradient descent (which math-heavy lecture slides fail to deliver)
Thanks! :)
Loved the way you teach; you made mathematics my favorite subject
Hooray! :)
My first subscription on YouTube, even though I watch hundreds of videos every month. Such quality content :)
Thank you!
You are a blessing: you give a high-level explanation, and with a little background knowledge we can _descend_ into the depths of formality that university courses go into ourselves!
Thanks! :)
I'm following Andrew Ng's Machine Learning course. I think he should give a reference to your video for an explanation. It is superb, provided that we know some elementary calculus.
That would be awesome if he did that. A lot of people come to my videos via Andrew Ng's course, but they have to find it on their own, just like you do. Perhaps you could suggest it to that class. Regardless, I'm glad you like my videos. :)
@Beyond Oblivion me three
I envy modern young math students in colleges all over the world, since they can easily choose the best explanation of complex things on RUclips (your channel seems the best I've found!). It's not like studying in the beginning of the 2000's, when the teacher was drawing ugly formulas with chalk, making something as great as math boring to us. Who knows, maybe someone who will treat cancer or colonize Mars will pick up some missing brick of understanding from your videos. Many thanks for your effort!
Wow! Thank you very much! :)
This is insanely helpful Josh!!! I'm trying to understand neural networks and made the mistake of resorting to academic papers which use complicated language and unhelpful explanations. I read those for hours and only understood bits and pieces but a 25 minute video on this youtube channel helped me understand better than ever!
Awesome! I also have a bunch of videos on Neural Networks that walk you through each part step-by-step: ruclips.net/video/CqOfi41LfDw/видео.html
I am doing an academic research project, and I thank you for really showing me the significance and utility of Gradient Descent!
Glad it was helpful!
Thanks a lot sir
Clearly explained
Awesome video.
Thank you! :)
Hi Josh this was really helpful. Can you do deep learning and neural networks next?
Those are definitely on the to-do list, but right now I'm working on Gradient Boosting, so that's next.
@@statquest Any chance to get the Gradient Boosting video within a week? That would be especially helpful due to some personal urgency!
Explanation is crystal clear and spot on. Thanks Josh for investing your time and energy to create such wonderful learning material.
Glad it was helpful!
this series is by far the best detailed deep-learning course on youtube
Thank you!
You accidentally taught me how to take derivatives in a few seconds, something I never understood for years... mind blown.
That's awesome!!! :)
really? It took you years to apply the chain rule?
Great presentation as always! One note, at 18:53 I believe you forgot to take out the ^2 for the second and third slope-derivative terms of the residuals. But the idea is clear anyway.
I'd like to see this implemented for clusters and logistic regression as well, and what loss functions are best for them and why.
Maybe also a bit more detail about why the slope multiplied by the learning rate is the best way to find the step size. You mentioned it, in terms of the desire to approach a slope of zero (therefore it makes sense to have a negative multiple of the slope be the penalty, since a large slope would lead to subtracting a large amount and a small slope would lead to subtracting a small amount). As I said, you mentioned it, but perhaps a bit more visual demonstration would be good. Great though!
P.S. As an additional source of revenue, you might 'sell' a zipped version of all your videos for a donation, or perhaps put it into a book that can be downloaded for a donation.
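The step-size point in the comment above is easy to see numerically (toy function of my own, not from the video): because the step is learning rate times slope, steps are large on steep parts of the curve and shrink automatically as the slope flattens near the minimum.

```python
def descend(x, learning_rate, n_steps):
    """Minimize f(x) = x**2 (slope f'(x) = 2*x), recording each step size."""
    sizes = []
    for _ in range(n_steps):
        step = learning_rate * 2 * x  # step size = learning rate * slope
        x -= step
        sizes.append(step)
    return x, sizes

x, sizes = descend(x=10.0, learning_rate=0.1, n_steps=20)
print(sizes[0], sizes[-1], x)  # first step is big, later steps are small
```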
Ah - there are always a few typos. Darn. I thought I'd caught all of them. But, like you say, the point is clear, so I won't lose any sleep over it, I hope! If you're curious about how gradient descent works for t-SNE, just look at the original manuscript for it (simple google search). They show the gradient that they use.
@@statquest Okay, thanks for the pointer!
@@statquest Also, at 11:05 you say that the step size is "negative five point seven"... I presume you meant to say -0.57, as it is written on the slide.
Agree that the presentation is really, really great! And just one more note besides the typo at 18:53: there are two pairs of concepts that are slightly confusing in this video. (a) The 'weight' of the linear regression vs. the x variable of human weight: since the slope of a linear regression fit is often referred to as the 'weight' of the function y^ = w*x + b, using a variable of human weight could be a bit confusing; it might be better to call it 'human weight' to distinguish it from the function weight, or use some other variable instead of human weight. (b) The slope of the linear regression fit vs. the slope of the sum of squared residuals: when you talk about 'slope' in gradient descent, I sometimes thought you were talking about the slope of the linear regression, and then realized it's a different slope. Maybe it would be good to use 'gradient' for the second slope?
Thanks a lot. I'm from Bangladesh, so English is not my native language, but I have learnt many things from the video. All the visual representations were really amazing.
Thank you! I'm glad the video was helpful.
Please keep coming up with videos. The possibility of me getting my Master's in Data Science solely depends on you now lol
Which university?
The beep boop really caught me by surprise. Laughed out loud in front of my computer...
Bam! :)
I had no idea what gradient descent was for a year, and now I finally know what it is. Thank you so much.
Awesome! I'm glad the video was helpful.
I love that you always put the entire script in the slides. This way we can pause and really let the things you just said sink in. My teacher is probably a good teacher, but sometimes the things he says go into my left ear and leave through my right. Would be nice if I could pause and replay him as well lol.
bam! :)
Because of this video I gained new intuition on gradient descent that I never got during university. Thanks a MILLION Josh!
Hooray! I'm glad the video was helpful. :)
Andrew Ng needs more songs at the beginning of his lectures
That would be awesome! :)
@@statquest in all seriousness, this was an excellent video to really cement the idea. Staaatttquessst.
When you explained all the steps to find the derivative I laughed, but after realising you do all this just to make sure everyone understands, I felt so blessed to have people like you ❤
Thanks!
you explained so clearly my dog started calculating gradient descents!!!!thanks man
That's awesome! :)
Like before watching.
Is there any option for a DOUBLE like, to append an extra like after watching?
Thank you! I'll put a request in to RUclips for the Double Like button. :)
Double BAMMMM........ Please post more videos sir....
Thanks! I'm working on new videos as fast as I can. :)
NEVER FORGET, you are the best person I have ever met on YouTube. THX for everything, I hope you are feeling great all day!
Thank you very much! :)
I wonder how you have understood math concepts so deeply that it enables you to express and explain them incredibly clearly.
Many thanks, professor.
I spend a lot of time working on these videos and go through many drafts before I have something that works.
I'm a final-year student (Master's degree in Mathematics) and this is my project 🤮. Now it's a little bit clearer what gradient descent is.
Gardient is quite decent
How did you get to a masters in math and find that kind of stuff uninteresting?