Newton's method | Backtracking Armijo Search | Theory and Python Code | Optimization Algorithms #2

  • Published: Sep 27, 2024

Comments • 171

  • @eyupbayhan3394
    @eyupbayhan3394 1 year ago

    You explain things so slowly and step-by-step, as if it were being explained to chimps. Exactly what I needed. Thank you a lot.

  • @elifklc132
    @elifklc132 1 year ago

    I will give you a million dollars when I have, say, 10 million dollars, so you can continue doing good for the community (smiles). You're just amazing, Ahmad Bazzi; please stay alive and don't let these videos disappear from here.

  • @turkintroadam183
    @turkintroadam183 1 year ago

    God bless you, boss. I really struggled today, for close to 24 hours, searching through garbage online until I found your video.

  • @maryagyemangkwartemaa7115
    @maryagyemangkwartemaa7115 1 year ago

    Whenever I get hung up on some hard-to-understand topic, I remember that Ahmad Bazzi is waiting for me here; so I relax and enjoy learning something new with him, without any doubt.

  • @koko-tegwolocomedytv8273
    @koko-tegwolocomedytv8273 1 year ago

    Thanks so much for your feedback, really appreciated!

  • @bilmiyom6773
    @bilmiyom6773 1 year ago

    After all these years I finally understand the magic behind gradients! Thanks!

  • @hamzaandmaliksworld6005
    @hamzaandmaliksworld6005 1 year ago

    I swear this is the most clear and FANTASTIC explanation I've ever found

  • @fm2680
    @fm2680 1 year ago

    Amazing explanation! Greetings from Brazil!

  • @muhammed-pj5gi
    @muhammed-pj5gi 1 year ago

    Very well explained and easy to follow. Thank you very much Sir.

  • @flexxbabymusic5329
    @flexxbabymusic5329 1 year ago +1

    I was in pain because of the concept of gradient descent and its relationship with the line of best fit. I just can’t believe that someone can explain it so well. Keep it up!

  • @مسمكةالنور
    @مسمكةالنور 1 year ago

    One of the best lectures here on RUclips !

  • @berksen6172
    @berksen6172 1 year ago

    Best video on gradient descent I've ever found in the universe!! Thanks for saving my life.

  • @ffsilentgaming7922
    @ffsilentgaming7922 1 year ago

    God, I am so glad you labeled things like step size and direction; my teacher didn't, and all that did was waste like 30 minutes of my time.

  • @yigiteslem3932
    @yigiteslem3932 1 year ago

    Sir, your lectures are awesome! Thank you very much.

  • @dizifragmantv8189
    @dizifragmantv8189 1 year ago

    Superb explanation sir. Love from India.

  • @21-cisr89
    @21-cisr89 1 year ago

    What an amazing animation

  • @mahmutkrca5943
    @mahmutkrca5943 1 year ago

    The best one on the internet

  • @cintyakireina7f285
    @cintyakireina7f285 1 year ago

    You are the best teacher ever, Ahmad Bazzi... I'm doing Andrew Ng's ML course and didn't understand what the hell gradient descent was... so I came to RUclips and found your video... BAM... Thank you so much for doing these videos... Keep 'em coming... 💜

  • @calinity5881
    @calinity5881 1 year ago

    Simple and clear... yet it needs more detail!

  • @ardaozcan2696
    @ardaozcan2696 1 year ago

    I just want to thank you so much that I can't even express it! You have done such a good job explaining this, thank you very much. The problem is that I am in 7th grade and no one at my school knows what a conjugate gradient is. I am studying deep reinforcement learning (more precisely, TRPO), and you are my life saver. I couldn't understand it for two weeks, but I finally did, thanks to you!

  • @akif1586
    @akif1586 1 year ago

    Glad to hear it!

  • @memememe2488
    @memememe2488 1 year ago

    That was a great explanation of gradient descent, and in an accessible way!! Thank you for the great video!

  • @mustafayldz6931
    @mustafayldz6931 1 year ago

    Hello from the USA 🇺🇸

  • @abhinavvv6387
    @abhinavvv6387 1 year ago

    Very nice explanation!!

  • @1johnnyzMusiq
    @1johnnyzMusiq 1 year ago

    God level Explanation... 😍😍😍😍😍😍😍😍😍😍😍😍😍😍😍😍

  • @napimxd380
    @napimxd380 1 year ago

    Great video that explains gradient descent perfectly

  • @abonehile1705
    @abonehile1705 1 year ago

    Thanks for the question. We choose to model our measured data, for example g, by H theta_QP, so by our own modelling definition this is true. In reality the model may be inaccurate, but the equality holds if H = A^T A.
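
    In symbols, a rough sketch of why that works, assuming a least-squares objective over a hypothetical forward-model matrix A (not necessarily the exact setup in the video):

        f(\theta) = \tfrac{1}{2}\,\|A\theta - g\|_2^{2}, \qquad
        \nabla f(\theta) = A^{T}A\,\theta - A^{T}g = H\theta - A^{T}g, \qquad
        \nabla^{2} f(\theta) = A^{T}A = H.

    Under that assumption, H = A^T A is exactly the Hessian that Newton's method uses, and it is symmetric positive semidefinite by construction.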

  • @zalvotv4906
    @zalvotv4906 1 year ago

    Thank you for the comprehensive lecture, Professor Ahmad Bazzi :)

  • @grand4540
    @grand4540 1 year ago

    very good explanation.

  • @xakkeruzb2488
    @xakkeruzb2488 1 year ago

    Glad it was helpful!

  • @merhabakok7080
    @merhabakok7080 1 year ago

    awesome explanation ...

  • @ibocanayzgk5726
    @ibocanayzgk5726 1 year ago

    Great tutorial.

  • @m10n69
    @m10n69 1 year ago

    each and every notation is explained serially, step by step with its meaning and relation to the problem

  • @AnishRaj-uh9fl
    @AnishRaj-uh9fl 1 year ago

    If you're wondering why, when we have least squares, would we want to use gradient descent... the answer is that least squares only works in specific situations and gradient descent can work in many more.
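
    As a rough sketch of that point (with a made-up 1-D dataset, not the data from the video), the closed-form least-squares solution and a plain gradient-descent loop land on the same fit; gradient descent just keeps working when the loss has no closed-form minimizer:

        import numpy as np

        # Hypothetical toy data: y ≈ 2x + 1 plus a little noise.
        rng = np.random.default_rng(0)
        x = np.linspace(0.0, 1.0, 50)
        y = 2.0 * x + 1.0 + 0.05 * rng.standard_normal(50)
        A = np.column_stack([x, np.ones_like(x)])        # design matrix [x, 1]

        # Closed-form least squares (possible only because the loss is quadratic).
        theta_ls, *_ = np.linalg.lstsq(A, y, rcond=None)

        # Gradient descent on the same loss; the same loop also handles losses
        # that have no closed-form solution.
        theta, lr = np.zeros(2), 0.5
        for _ in range(2000):
            grad = A.T @ (A @ theta - y) / len(y)        # gradient of 0.5 * mean squared residual
            theta -= lr * grad

        print(theta_ls, theta)                           # both close to [2, 1]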

  • @zanicardi
    @zanicardi 1 year ago

    Great sir. ...👍👍💯

  • @moustafarahal3396
    @moustafarahal3396 1 year ago +1

    Interesting!!

    • @AhmadBazzi
      @AhmadBazzi 1 year ago +1

      Thanks Moustafa ! Glad you found it interesting !

  • @jacksonlima4224
    @jacksonlima4224 1 year ago

    Awesome lecture and tutorial going into the details of Armijo and Newton.

  • @اغانيالجيلالذهبي-ن8ح

    Excellent.

  • @melisacalar1907
    @melisacalar1907 1 year ago

    Great...👌👌👌

  • @eylulnihankalayc8009
    @eylulnihankalayc8009 1 year ago

    Thank you Ahmad !

  • @azrawtfgt244
    @azrawtfgt244 1 year ago

    Good explanation, but it would have been better if you had elaborated on the formula and why it is used to reach the next step. Why is the derivative multiplied by the learning rate, and why is the result then subtracted from the current point's value?

  • @dummyacc9840
    @dummyacc9840 1 year ago

    Greetings from Germany 🇩🇪

  • @keremkalcn2915
    @keremkalcn2915 1 year ago

    Very helpful for remembering easily, even at the last moment, without any confusion... thank you, sir! I just have one question: for the relaxation method, is the procedure the same?

  • @mhmmdhuseynov2910
    @mhmmdhuseynov2910 1 year ago

    Damn it felt like watching a statistics - Data science - Machine learning tutorial from a SpongeBob SquarePants episode! That was interesting and funny at the same time. Well done!

  • @enesguney8213
    @enesguney8213 1 year ago

    When the derivative is negative, we need to move to a larger parameter value, and the step-size is also negative; because we subtract that negative step-size, we do move to a larger value. When the derivative is positive, we need to move to a smaller parameter value, and the step-size is also positive; subtracting the positive step-size moves us to a smaller value.
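
    A tiny numeric sketch of that sign logic, using the made-up function f(theta) = theta**2 (so f'(theta) = 2*theta), not the example from the video:

        def f_prime(theta):
            return 2.0 * theta          # derivative of the toy function f(theta) = theta**2

        alpha = 0.1                     # learning rate

        theta = -3.0                    # derivative is negative here ...
        theta = theta - alpha * f_prime(theta)
        print(theta)                    # -2.4: subtracting a negative step moved us to a LARGER value

        theta = 3.0                     # derivative is positive here ...
        theta = theta - alpha * f_prime(theta)
        print(theta)                    # 2.4: subtracting a positive step moved us to a SMALLER value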

  • @savosavo7790
    @savosavo7790 1 year ago

    Thanks Ahmad !

  • @sweetnspice76
    @sweetnspice76 1 year ago

    In theory, the function could equal 0. In practice, however, we run gradient descent on computers, and due to rounding errors and the like, the function never equals 0 exactly.
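
    That is why, in practice, the stopping test compares the derivative against a small tolerance instead of against exactly zero. A minimal sketch with made-up numbers:

        theta, alpha, tol = 5.0, 0.1, 1e-8

        # Stop when the derivative is "close enough" to zero, not exactly zero.
        while abs(2.0 * theta) > tol:        # toy derivative f'(theta) = 2*theta
            theta -= alpha * (2.0 * theta)

        print(theta)                         # tiny, but in general not exactly 0.0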

  • @teetoffcial0
    @teetoffcial0 1 year ago

    Well done. Thanks!

  • @mohamedfahd3180
    @mohamedfahd3180 1 year ago

    Hooray! :)

  • @foxgun3d70
    @foxgun3d70 1 year ago

    You are a Hero

  • @iege0026
    @iege0026 1 year ago

    very clear

  • @varietyplays8781
    @varietyplays8781 1 year ago

    You are a superhero

  • @Zenin._Toji.
    @Zenin._Toji. 1 year ago

    Thanks in advance!

  • @lionsupra6952
    @lionsupra6952 1 year ago

    Wow, thanks!

  • @byelmin352
    @byelmin352 1 year ago

    Thank you!

  • @aqil8876
    @aqil8876 1 year ago

    Thank you very much! :)

  • @JAKEYENER
    @JAKEYENER 1 year ago

    Thank you so much 😀

  • @lury2128
    @lury2128 1 year ago

    May you get all the resources you need to keep making these videos.

  • @photoplayer9405
    @photoplayer9405 1 year ago

    4:52 WOW ! What on earth did I just see ?

  • @XLadiNX
    @XLadiNX 1 year ago

    it will give the intercept value immediately

  • @pkyadavtrainer7788
    @pkyadavtrainer7788 1 year ago

    Need more sir

  • @muhammedarslanpinar4629
    @muhammedarslanpinar4629 1 year ago

    Thanks!

  • @barsdesign250
    @barsdesign250 1 year ago

    @Ahmad Bazzi OK, thank you, I will watch it and let you know whether or not my confusion is resolved.

  • @gameinfo8611
    @gameinfo8611 1 year ago

    It is given: alpha lies between 0 and 0.5, and beta lies between 0 and 1.
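
    For context, a minimal backtracking Armijo sketch that respects those parameter ranges; the toy function, starting point, and default values below are made up, not the video's exact example:

        import numpy as np

        def backtracking_armijo(f, grad, x, direction, alpha=0.3, beta=0.7):
            """Shrink the step t by beta until the Armijo sufficient-decrease test holds.
            Assumes 0 < alpha < 0.5 and 0 < beta < 1."""
            t = 1.0
            while f(x + t * direction) > f(x) + alpha * t * grad(x) @ direction:
                t *= beta
            return t

        # Toy quadratic (made up): f(x) = x1^2 + 10*x2^2
        f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
        grad = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])

        x = np.array([1.0, 1.0])
        d = -grad(x)                          # steepest-descent direction
        t = backtracking_armijo(f, grad, x, d)
        print(t, f(x + t * d))                # accepted step and the decreased objective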

  • @tekbasina696
    @tekbasina696 1 year ago

    Ahmad Bazzi thanks a lot ❤️

  • @AadityaBarma
    @AadityaBarma 1 year ago

    Thanks! :)

  • @zamanszsaat7801
    @zamanszsaat7801 1 year ago

    This video really helped me clear it up and your example was very simple and useful! I am confused as to why you made beta = 0.707... what made you choose this? is it just a standard?

  • @kanofrmda3
    @kanofrmda3 1 year ago

    That's a known typo mentioned in a pinned comment.

  •  1 year ago

    Thanks for the great video! One question: at 1:03, where do you get or derive that while-condition from? Could you please point to some materials? Thanks!

  • @carryminati1593
    @carryminati1593 1 year ago

    Sir, may I ask: please do an explanation of quasi-Newton methods.

  • @benbiribenbiri47
    @benbiribenbiri47 1 year ago

    When using the sum of squared residuals as the loss function, why can't we just set the slope of the loss function to zero to get the best intercept, instead of plugging in many intercepts and checking?

  • @huseyinpolater5231
    @huseyinpolater5231 1 year ago

    I am assuming H = A^T A (where A is a forward model, such as a discrete Radon transform to a sinogram for an imaging example). But A, and so H, will depend on your model of the mean of the measured data. It can vary a lot according to the problem being solved of course.

  • @neuroderek1895
    @neuroderek1895 1 year ago

    Should he need to wear the subjective refraction results when doing the pct at far ?

  • @cyrus_m2s
    @cyrus_m2s 1 year ago

    BAM!!! Great explanation of gradient descent. I have a question too: do the ready-made packages in Python and R, like sklearn, use gradient descent to calculate the linear-regression slope and intercept?

  • @ismailberk9280
    @ismailberk9280 1 year ago

    Didn't explain how to calculate the direction we are moving in (the minus sign), why the derivatives, etc.

  • @dakshghare2296
    @dakshghare2296 1 year ago

    Hey Ahmad Bazzi, at 19:20 the last two elements of the derivative with respect to the slope shouldn't have a power of 2; they should be to the power of 1. Please respond if my understanding is correct. Thanks!
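
    For reference, the generic chain-rule computation for one squared residual, written with placeholder symbols m, b, x_i, y_i rather than the video's exact numbers:

        \frac{\partial}{\partial m}\bigl(y_i - (m x_i + b)\bigr)^{2}
        = 2\bigl(y_i - (m x_i + b)\bigr)\cdot(-x_i),

    so after differentiating, the residual factor appears to the first power, multiplied by -x_i.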

  • @gonzalo3484
    @gonzalo3484 1 year ago

    The math for gradient descent is pretty simple, all you really need to understand is The Chain Rule: ruclips.net/video/wl1myxrtQHQ/видео.html - and The Chain Rule comes up so much in Machine Learning that it really is worth learning. However, if you really don't want to learn it, you can get by by just understanding the concepts.

  • @chuannelcompilation866
    @chuannelcompilation866 1 year ago

    At 9:21, why didn't you use the maxima/minima concept and equate the derivative to 0?

  • @bs_garibanlar8753
    @bs_garibanlar8753 1 year ago

    The direction of the minus should be the opposite one, no?

  • @jaypratap2177
    @jaypratap2177 1 year ago

    I just have one question. You said that we don't know Theta_QP, but we know that H Theta_QP is the measured data vector g. Where do we know that from? It would be really amazing if you could answer this question.

  • @0WiLLBLooDD1
    @0WiLLBLooDD1 1 year ago

    @Ahmad Bazzi, do you have any technical report on gradient descent?

  • @gigglegamer6349
    @gigglegamer6349 1 year ago

    BAM !!!

  • @tiktoktankesitler4661
    @tiktoktankesitler4661 1 year ago

    YOU ARE GOD

  • @snehadeka6241
    @snehadeka6241 1 year ago

    How did you find the slope = 0.64?

  • @hamzaaslan6838
    @hamzaaslan6838 1 year ago

    Why is the slope value taken as 0.64 ?

  • @mrtcmert4483
    @mrtcmert4483 1 year ago

    Are you using the L2 norm? In the L2 norm, is it predicted minus observed?

  • @amcom4817
    @amcom4817 1 year ago

    BAM!

  • @shehrozgamer7038
    @shehrozgamer7038 1 year ago

    BAM! :)

  • @BrawlFaruk
    @BrawlFaruk 1 year ago

    Before calculating the intercept, how did you assume the slope to be 0.64 (when differentiating with respect to the intercept), and vice versa? Please clarify.

  • @PROGAMER-bn7gr
    @PROGAMER-bn7gr 1 year ago

    Hi Ahmad Bazzi,

  • @murattopal6851
    @murattopal6851 1 year ago

    OK

  • @coolding7630
    @coolding7630 1 year ago

    Ahmad Bazzi

  • @vatansirtbas5374
    @vatansirtbas5374 1 year ago

    fucking amazing.

  • @SHADOW_IS_LIVE
    @SHADOW_IS_LIVE 1 year ago

    I found a diamond

  • @berkantimur1556
    @berkantimur1556 1 year ago

    An explanation for a non-mathematician.

  • @yalnizadam3_
    @yalnizadam3_ 1 year ago

    No one explains why we are taking this bloody curve as an example... why the intercept and why the slope... why!!!

  • @kartalgaming8928
    @kartalgaming8928 1 year ago

    :)
