How I think about Gradient Descent
- Published: 12 Apr 2024
- What is gradient descent optimizing exactly?
Source code to generate these animations: github.com/gallettilance/repr...
#gradientdescent #machinelearning #neuralnetworks #optimization #math #datascience #educational #machinelearningtutorialforbeginners #datasciencebasics #datasciencetutorial #machinelearningalgorithm #logisticregression #machinelearningbasics #maths #softmax #multinomial #classification #linearregression #probability #probabilitytheory #education #machinelearningtutorial - Science
I watched 3blue1brown and thought there was nothing else to learn about gradient descent. I was wrong. Thank you for the video!
thank you so much for the kind words!! It means so much
This is an AWESOME introduction to gradient descent! I also love that it's more of a high-level overview rather than delving into the nitty-gritty details of the calculus required to make it happen. It's surprisingly beneficial for those who are already used to the concepts. Looking forward to watching Part 2 soon!
So glad to hear that! That means a lot to me, especially in this early stage of starting this channel! And I completely agree. The nitty-gritty often gets in the way of truly understanding certain concepts, but since these details are often the only ones we're tested on in school, it's hard to realize that something is missing.
Wow, what an informative and clear summation with such cool animations. Well done!
loved the video, the format, the animations. hope to see more from you
thank you so much for the encouraging words!
Just discovered your channel. Amazing content! Thank you very much for your work, looking forward to see more of it!
It's definitely hard work to make these videos but comments like yours make it so worth it - thank you so much!
Awesome video. It's super intuitive looked at this way.
Thank you so much! That makes me so happy to hear!
Very nice video!
I think there's a slight issue: the derivative of x^5 - 40x^3 - 5x can be solved really easily. Its derivative is just 5x^4 - 120x^2 - 5, and you can set that to zero, substitute u for x^2 to get 5u^2 - 120u - 5 = 0, use the quadratic formula to solve for u, take its square roots to get x, and check which is lowest in the original f(x). But the specific equation isn't what's important, and the video is very nice otherwise!
(by "the derivative can be solved really easily" I mean "you can easily find the zeroes of the derivative")
You're absolutely right! I had to make a decision as to what is "easy" to solve, and decided that u-substitutions are not :D But great to point out; I'm sure many watching the video will learn something from your comment!
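The u-substitution route from the comment above is easy to check numerically. A minimal sketch (the function f(x) = x^5 - 40x^3 - 5x is from the video; everything else here is just the commenter's quadratic-formula recipe written out):

```python
import math

# f(x) = x**5 - 40*x**3 - 5*x, the example function from the video.
def f(x):
    return x**5 - 40*x**3 - 5*x

# Critical points: f'(x) = 5x^4 - 120x^2 - 5 = 0.
# Substitute u = x^2 to get 5u^2 - 120u - 5 = 0, i.e. u^2 - 24u - 1 = 0,
# so by the quadratic formula u = 12 ± sqrt(145).
roots_u = [12 + math.sqrt(145), 12 - math.sqrt(145)]

# Only non-negative u gives real solutions x = ±sqrt(u).
critical_points = []
for u in roots_u:
    if u >= 0:
        critical_points.extend([math.sqrt(u), -math.sqrt(u)])

# Compare f at the critical points to find the lowest one.
best = min(critical_points, key=f)
print(best, f(best))
```

Note that only u = 12 + sqrt(145) is non-negative, so the only real critical points are x = ±sqrt(12 + sqrt(145)) ≈ ±4.90, and the positive one gives the lower value of f. (Since f is a quintic, this is a local minimum, not a global one.)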
First year CS major here dipping my toes in ML and this explanation makes a lot of sense, would love more videos like this! Subbed.
That’s so great to hear, thank you for your encouraging words! Please feel free to suggest topics you would like me to cover
Awesome. We'll watch as many of these as you're going to make.
Cool video!!!!
Brooo, really loved your content haha. To better add to the analogy, let's say you have your eyes closed/you have no touch sensation. I think that might be a great idea, but correct me if I'm wrong. In any case, really loved the video, keep up the good work 🎉🎉
Love that idea! If I find a way to visualize it I'll include this in part 2 :)
@howithinkabout sounds awesome! I will be watching it ;)
Mitochondria are the powerhouse of the cell
wow!
How would the concept of momentum tie into your explanation? Because when using Adam one usually specifies the learning rate (which is the step size) and the momentum.
Great question! I'm planning to talk more about variants of GD (which includes Adam) in an upcoming video about how to avoid some of the pitfalls of GD. But the TL;DR is this: Adam tries to use historical information contained in the successive gradients to make better step adjustments.
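To make "historical information in the successive gradients" concrete, here is a minimal sketch of one Adam update step (the standard update rule; the function being minimized and all hyperparameter values are just illustrative defaults, not anything from the video):

```python
import math

def adam_step(x, grad, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter x.

    The running averages m (momentum / first moment) and v (second
    moment) carried in `state` are the historical information that
    plain gradient descent does not keep.
    """
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad        # momentum term
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2   # gradient magnitude
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return x - lr * m_hat / (math.sqrt(v_hat) + eps)

# Usage: minimize the toy objective f(x) = x^2 (gradient 2x) from x = 5.
state = {"m": 0.0, "v": 0.0, "t": 0}
x = 5.0
for _ in range(2000):
    x = adam_step(x, 2 * x, state, lr=0.1)
```

The momentum term `m` smooths out noisy or oscillating gradients, while `v` rescales the step so that directions with consistently large gradients take smaller effective steps; together they adapt the step size beyond the fixed learning rate.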