This animation is really great for a small channel like this
Great job! I like that you were even able to talk about some of the different types of gradient descent algorithms, a tall task for 3 minutes.
Same here. I studied applied mathematics, so I have to get up to speed on this rather quickly; I find these videos to be excellent.
It's appealing to see these visual explanations after learning the concept!
Just a correction: 2:30 is mini-batch stochastic gradient descent, since we are iterating over batches.
Very comprehensive and short, love it! Quick and concise!
Thanks so much!
Best explanation and visualization I've seen. You have incredible talent. Please keep making more.
I really liked the video and the visuals, but I think it would be better without the "generic music" in the background.
Thank you for taking the time to post your feedback, this is very useful for the growth of this channel!
Thank you for the clear explanation
Honestly didn't really help with my questions, but I didn't expect a 3 minute video to answer them. This was very well done, the visualization was great, and everything it touched on (while brief) was concise and accurate. Subbed.
The first video where I got a clear and precise understanding of the topic.
Wow. You have such a talent for explaining things so well compared to the rest of the YouTube sphere. I hope you will continue to bless us with your talents.
Hello. Thanks for this great video.
One small note: I believe the variant of Gradient Descent you explained at 2:20 is called Mini-Batch Gradient Descent, which uses a random subset of the training dataset.
Stochastic Gradient Descent is the one that uses just one training record in each iteration.
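For reference, the two update rules look roughly like this (a minimal NumPy sketch, not the video's code; `grad(w, X, y)` is a hypothetical function returning the gradient of the loss on the given samples):

```python
import numpy as np

def sgd_step(w, X, y, grad, lr=0.01):
    # Stochastic GD: one randomly chosen training record per update.
    i = np.random.randint(len(X))
    return w - lr * grad(w, X[i:i+1], y[i:i+1])

def minibatch_step(w, X, y, grad, lr=0.01, batch_size=32):
    # Mini-batch GD: a random subset of the training data per update.
    idx = np.random.choice(len(X), size=batch_size, replace=False)
    return w - lr * grad(w, X[idx], y[idx])
```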
Yeah, I have the same doubt.
Great job! Well done!
Thanks a lot!
Thanks for the amazing explanation and visualization
It mixes very well the theory and a practical example.
Wow, this is so well and intuitively explained!
My professor said “there is no excuse for gradient descent” when conjugate gradient is so easy to implement
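For anyone curious, conjugate gradient really is short to write down for the quadratic case (minimizing (1/2)xᵀAx − bᵀx, i.e. solving Ax = b with A symmetric positive definite). A textbook sketch, not anything from the video:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-8):
    # Solve A x = b for symmetric positive-definite A, i.e. minimize
    # the quadratic (1/2) x^T A x - b^T x, without ever forming A^{-1}.
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x            # residual = negative gradient
    p = r.copy()             # first search direction
    rs_old = r @ r
    for _ in range(len(b)):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)        # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p    # next A-conjugate direction
        rs_old = rs_new
    return x
```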
Keep up the good work! This video and the whole channel are amazing!
Bravo 👏🏻
I looooooooooove when the background music stops. It tells you to open your eyes and focus your ears: a really important revelation is coming...
Brilliant work 👍
Could you please make a video explaining how you made this video? That would be very VERY helpful. I've always wanted to use Blender to make animations like yours, but couldn't make heads or tails of it. Most Blender tutorials (and I've seen more than 100 videos) showcase heavy-duty animations which have nothing to do with mathematical explanations, i.e. how to make animations for maths-related videos. Yours is the first video in which I've seen such a thing. Please consider my request and kindly make a video tutorial about it (for the Blender part).
Hey Amit, I am definitely planning to make a video about my workflow, and in particular, how I make the animations. So stay tuned for that! :)
@VisuallyExplained Will definitely wait for it. Thanks for considering it, I appreciate it a lot.
Huge thanks, it would be fantastic (fellow teacher here:-)
This channel is a literal gift from god.
Bravo, my friend!
I have only seen one video and it is helping me a lot! Keep going!
Great video, but at around 1:30 my heart dropped; it felt like a scary movie since it was all dark, lol.
Thanks for this, the visualization helps a lot!
Amazing explanation and visualization
Excellent vid. Do you have a video about PPO in RL?
Excellent video!
Great video! I was wondering how this works if there are multiple minima when the data has high dimensionality?
It works just fine with multiple minimum points in a high dimension. As long as you configure your hyperparameters (learning rate, batch size, etc) correctly, you should have no problem converging to a "decent" minimum.
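A toy illustration of that point (not the video's example): plain gradient descent on a small non-convex function, where different starting points land in different, but all reasonable, local minima.

```python
import numpy as np

# Toy non-convex objective with several local minima.
f      = lambda x: np.sin(3 * x) + 0.1 * x**2
f_grad = lambda x: 3 * np.cos(3 * x) + 0.2 * x

def gradient_descent(x0, lr=0.05, steps=200):
    x = x0
    for _ in range(steps):
        x -= lr * f_grad(x)     # step against the gradient
    return x

# Different starting points converge to different (but "decent") minima.
for x0 in (-2.0, 0.5, 3.0):
    x_star = gradient_descent(x0)
    print(f"start {x0:+.1f} -> x = {x_star:+.3f}, f(x) = {f(x_star):+.3f}")
```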
Helpful for revising the topic :)
Really nice animation, explanation, and content. Thank you very much for sharing! :)
Thanks for the visualization, it really helped.
Proximal GD next please!
Fucking incredible explanation in just 3 minutes... wow!
It is basically using the principle of induction to create a cardinality symmetry.
How do you make these videos?
Manim??
Very smart. But I still need another video presenting differentiation, to help understand the slope and opposite-direction idea in 2D. This one is clear, though. I also like the small-step demonstration.
great video, thanks!
For vanilla GD, are you not supposed to divide by the number of samples in the data before performing the update? Or do you just take the sum of this 'accumulated gradient'?
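For what it's worth, both conventions show up in practice; averaging just rescales the step by 1/N, so summing with learning rate η is equivalent to averaging with learning rate N·η. A minimal sketch of the averaged full-batch update (hypothetical names, assuming a per-sample gradient function `grad`):

```python
import numpy as np

def full_batch_gd_step(w, X, y, grad, lr=0.1):
    # Average the per-sample gradients, then take one step.
    # Summing instead of averaging is equivalent to using lr * N.
    per_sample = np.stack([grad(w, X[i], y[i]) for i in range(len(X))])
    return w - lr * per_sample.mean(axis=0)
```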
Beautiful 👍
Can you please post a link or the titles of materials (books) on this topic that one can go through? I really need to learn this topic. Thank you.
Great idea! Boyd's book is a good starting point (page 463 of web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf). I will try to add more references to the video description in the future.
Thank you
Great video, would you consider some topics in numerical analysis, like Gaussian quadrature???
Which software did you use to make those animations?
I used Blender3D (with Python) for all 3D scenes. The rest is a combination of After Effects and the Python library manim.
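Not the author's actual code, but to give a flavor of the manim part, here is a minimal sketch (assuming a recent manim Community Edition where `Axes.plot` is available) that animates gradient descent steps on f(x) = x²:

```python
from manim import Scene, Axes, Dot, Create, BLUE, YELLOW

class GradientDescent1D(Scene):
    def construct(self):
        axes = Axes(x_range=[-3, 3], y_range=[0, 9])
        curve = axes.plot(lambda x: x**2, color=BLUE)
        self.play(Create(axes), Create(curve))

        x, lr = 2.5, 0.2
        dot = Dot(axes.c2p(x, x**2), color=YELLOW)
        self.add(dot)
        for _ in range(10):
            x -= lr * 2 * x  # gradient step on f(x) = x^2, f'(x) = 2x
            self.play(dot.animate.move_to(axes.c2p(x, x**2)), run_time=0.4)
```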
What is the eta (η)?
I was about to ask the same thing. He suddenly introduced it into the cost function as a parameter, then never talked about what it meant.
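Presumably the symbol in question is η (eta), the learning rate, i.e. the step size in the standard gradient descent update:

```latex
% Gradient descent update: eta controls how far each step moves
% in the direction of steepest descent (opposite the gradient).
x_{k+1} = x_k - \eta \, \nabla f(x_k)
```

A small η means many small, cautious steps; a large η moves faster but can overshoot the minimum.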
Interesting topic and comparison. Since you are using information from past iterations, it would be very illustrative to include a quasi-Newton method in your comparison, for example BFGS.
Thanks, and great suggestion!!!
Great Animation buddy.. Cool..
Thank you! Cheers!
Huge thank you!
My fav video about this
How does this actually apply in reverse, though? How do you apply this?
This is amazing
Is this the same as Newton's method, or the Newton-Raphson method?
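They are related but not the same: gradient descent uses only the first derivative (the gradient), while Newton's method also uses the second derivative (the Hessian) to rescale the step:

```latex
% Gradient descent vs. Newton's method for minimizing f:
x_{k+1} = x_k - \eta \, \nabla f(x_k)                              % gradient descent
x_{k+1} = x_k - \left[\nabla^2 f(x_k)\right]^{-1} \nabla f(x_k)    % Newton step
```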
Wow, 😊❤️ love it
Thank you! Cheers!!!
awesome
Amazing !!!
What is gradient descent trying to find?
It's trying to minimize the cost/error of a learning model.
Adam usually works quite well.
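For reference, Adam keeps running averages of the gradient and its square to scale each step; a minimal sketch of one Adam update (the standard formulas, not anything from the video):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # g: current gradient; m, v: running first/second moment estimates;
    # t: 1-based step count used for bias correction.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```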
Isn't it awesome in a simplified way! I was just implementing OLS for vanilla linear regression, to train a model with some weights and a bias, and this video popped up while I was doing some stuff with the matrix and dot product. I love mathematics!!! One thing: when we have the OLS algorithm directly, why do we need to implement OLS with gradient descent again? And then what's the use of having the OLS algorithm separately? Is it because of the volume of the data points?
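One way to see the trade-off (a toy sketch, not from the video): the closed-form OLS solution needs one linear solve over all the data at once, while gradient descent only needs repeated cheap passes, which scales better as the data volume grows and also works for models with no closed-form solution.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=200)

# Closed-form OLS (normal equations): one solve, but needs all of X at once.
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on the same least-squares loss: cheap repeated updates.
w_gd, lr = np.zeros(3), 0.1
for _ in range(1000):
    w_gd -= lr * X.T @ (X @ w_gd - y) / len(y)

print(w_ols)  # both should be close to w_true
print(w_gd)
```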
magnificent 🔥😧
simplex method please
great video
Great comment, thanks!
I don't get it, lol.
Good!
dope
veeeeery nice
As you might know I studied this topic in London…
I obviously aced it.😂
Wow
I love you!
And how the fuck do I get the η??
房山
Damn calculus.
I understand nothing.