3.4: Linear Regression with Gradient Descent - Intelligence and Learning

  • Published: 10 Oct 2024

Comments • 172

  • @FANSasFRIENDS
    @FANSasFRIENDS 1 year ago +7

    This is the amount of enthusiasm I need from my professor.
    Keep up the good work, sir!

  • @josephgigot8827
    @josephgigot8827 7 years ago +7

    You are a really great teacher. Watching you, we feel like you are rediscovering what you already know along with us! I think it is the perfect way to teach people!

  • @luisa534
    @luisa534 2 years ago +4

    got stuck on gradient descent from the andrew ng coursera course, so as always, I'm back here for more digestible explanations. love your teaching style!

  • @LearnWithYK
    @LearnWithYK 3 years ago +1

    Excellent. Love the way you present - enthusiastic, excited, but totally at ease.

  • @Kevin-ex9vr
    @Kevin-ex9vr 1 year ago +1

    man, this series with both board and coding together is really the best from yt, congrats

  • @christophersheppard3249
    @christophersheppard3249 1 year ago

    You single-handedly made me go into CS. Thank you for your inspiration.

  • @sagauer
    @sagauer 7 years ago +2

    Hey, I am watching your channel for the first time and I am amazed at how well you explain things! I am a teacher myself and I find you very inspiring!

  • @anubratanath5342
    @anubratanath5342 3 years ago +1

    This is the most intuitive explanation of linear regression. Thank you sir!

  • @MrGlitch888
    @MrGlitch888 3 years ago +1

    Keep up the good work. Your teaching is the best, especially when it comes to complicated topics.

  • @franciscohanna2956
    @franciscohanna2956 7 years ago +2

    Great videos Daniel! Thank you! I started an AI course at college this semester (it's almost over now), and this helped me consolidate what I was studying. Keep it up!

  • @sues4370
    @sues4370 1 year ago +1

    This was a great visual representation of SGD, thank you!

  • @niklasheise
    @niklasheise 6 years ago +2

    It's incredible when you display the error and guess values. My next try is to make a learning rate which changes depending on the digits after the decimal point. This tutorial is awesome!!

  • @mkalicharan
    @mkalicharan 6 years ago +1

    How awesome is this explanation! Theory + programming is the way to go, Coding Train.

  • @Manojshankaraj
    @Manojshankaraj 6 years ago +3

    Really awesome video! Thank you for making machine learning and math so much fun!!

  • @mohammedsaeed7241
    @mohammedsaeed7241 3 years ago

    Dude, thank you so much for the intuition! Many people don't bother going through that.

  • @kingoros
    @kingoros 7 years ago +19

    Thank you for making these! Very informative!

  • @niharika7631
    @niharika7631 6 years ago +3

    Dan, I love how you get so excited to explain things... so much to say! 😅 Super cute, plus so informative. I'm glad I found this channel.

  • @NickKartha
    @NickKartha 6 years ago +52

    2:35 spoiler for Avengers: Infinity War

  • @sarangchouguley6292
    @sarangchouguley6292 5 years ago +2

    Thank you Dan. Really you made this topic so easy to understand. Keep up the good work.

  • @juliekell9454
    @juliekell9454 6 years ago

    Thank you for this. I was taking a Coursera course on machine learning and got stuck on week one (incredibly frustrating!!) because half the math instructions didn't make sense. I had no idea it was so simple! I just passed week one. Thank you.

  • @solomonrajkumar5537
    @solomonrajkumar5537 2 years ago

    You are a really incredibly awesome teacher, Sir!!!!... there are no words to say...

  • @gracelungu3646
    @gracelungu3646 7 years ago

    This channel is really an amazing place to learn advanced programming algorithms.
    Thank you for the videos, Mr. Shiffman.

  • @kwajomensah940
    @kwajomensah940 7 years ago

    Thank you so much!
    I've been wanting to go over statistics to start diving into ml and mv you've just made my day!

  • @josephkarianjahi1467
    @josephkarianjahi1467 2 years ago

    You are hilarious man! Best teacher on youtube for machine learning

  • @SharonKlinkenberg
    @SharonKlinkenberg 7 years ago +1

    Great videos Dan keep up the good work. The code really helps getting a handle on the theory.

  • @miteshsharma3106
    @miteshsharma3106 7 years ago +35

    the snap was cool ..... but we saw the truth in livestream lol😁

  • @arzoosingh5388
    @arzoosingh5388 5 years ago

    I must say I like the way you teach. You're a nice man, God bless.

  • @crehenge2386
    @crehenge2386 7 years ago

    thank you for showing me how to implement multivariable calculus in programming!

  • @francescozappala8822
    @francescozappala8822 7 years ago

    Hi, I love your videos...I think they are amazing! I'm Italian and don't understand many words😕 you are great!

    • @TheCodingTrain
      @TheCodingTrain 7 years ago +1

      Thank you! I need to get more language subtitles!

  • @wengeance8962
    @wengeance8962 7 years ago +3

    Dan is wearing a funky t-shirt! looks good!

  • @benjaminsmeding8966
    @benjaminsmeding8966 7 years ago

    Hi Dan, I really enjoy your videos. I'm a self-taught programmer and your videos give really good insight into different kinds of algorithms.
    Maybe nice to know: I'm actually a railtrack (P-Way) engineer and we use, for example, the least squares method quite a lot.
    Keep up the great work!
    P.S. If you're interested in some actual train datasets (from the Dutch Rail Network), leave a message.

  • @darek4488
    @darek4488 5 years ago

    You need separate learning rates for m and b. Then set the learning rate for b higher than the one for m so it would rotate faster, but move up and down slower.
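
A minimal sketch of the separate-learning-rate idea above, written in the style of the update rule quoted elsewhere in these comments. The names lr_m and lr_b are hypothetical, and m and b are assumed to be the sketch's global slope and intercept with data normalized to [0, 1]:

    // One gradient-descent step with separate (hypothetical) learning rates,
    // so the slope m and the intercept b can be tuned to react at different speeds.
    let lr_m = 0.05;
    let lr_b = 0.2;

    function gradientStep(x, y) {
      let guess = m * x + b;
      let error = y - guess;
      m = m + (error * x) * lr_m;   // slope update scaled by its own rate
      b = b + error * lr_b;         // intercept update scaled by its own rate
    }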

  • @fernandonakamuta1502
    @fernandonakamuta1502 6 years ago

    That is an awesome use of DOM man!

  • @aeroptical
    @aeroptical 7 years ago

    Sooo impressed by the white board being magically erased! I watched the live stream and thought it would be a total disaster; well, I'm beyond impressed - some fine editing there! :) Loving the ML series so far Dan.

  • @capmi1379
    @capmi1379 7 years ago

    Wow! Machine learning! You gave an understanding of how it works and how to write it line by line, without a package, unlike something like TensorFlow. XD Wow, thank you!

  • @teja2775
    @teja2775 5 years ago

    Awesome, cool... What a teaching style, I really love it. You made my day by helping me understand linear regression with a simple story. Really love you, man.

  • @zhimingkoh1029
    @zhimingkoh1029 3 years ago +1

    Hey Dan, thank you so much for making all these videos (: You're amazing!

  • @Algebrodadio
    @Algebrodadio 7 years ago +6

    Are you going over gradient descent because it's used by the back propagation algorithms for neural networks? Because I can't wait to watch you do stuff with NN's.

  • @ElBellacko1
    @ElBellacko1 2 years ago

    great explanation

  • @syedabuthahirkaz
    @syedabuthahirkaz 6 years ago

    Shiffman is always nice man. Love you Guru !

  • @Contradel
    @Contradel 7 years ago +1

    So my guess on an explanation on these lines:
    m = m + (error * x) * learning_rate;
    b = b + (error) * learning_rate;
    First line: think about the question "when I change m, how does that affect y?".
    This is what calculus is used for, more specifically differentiation. The answer to the question is written in math as dy/dm, if our line expression is defined as: y = m * x + b.
    dy/dm = D(m * x + b, m) = x. This is why the error should be multiplied by x.
    For the second line same thing! Change of y when changing b?
    dy/db = D(m * x + b, b) = 1. We could multiply error by 1, or leave it out as Shiffman did.
    What does the D function do? It differentiates the expression with regards to the second parameter passed. To calculate this you can either use a calculator, use a lookup table of rules or derive the answer yourself following the proof.

    • @troatie
      @troatie 7 years ago +1

      This isn't quite right, I don't think? Shouldn't you divide by x? Let's say your error was 1, so you want to change y by 1. If you change m by 1 you'll get a change of x out of that; if you change m by 1/x you'll get the 1 out that you want. Or maybe written out...
      e1 = y - m1 * x - b1
      e2 = y - (m1 + m_change) * x - b1
      if you want e2 to be 0, then you get
      0 = y - m1 * x - m_change * x - b1
      = y - m1 * x - b1 - m_change * x
      = e1 - m_change * x
      m_change = e1 / x

    • @Contradel
      @Contradel 7 years ago

      I'm not sure I'm following you. But if, for one of the datapoints, the error is 1, you want to adjust the parameters (m and b) a small amount (learning_rate), weighted by error, so that for all your datapoints you get closer to a best fit.
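
To summarize the thread above: the increment error * x is (up to a constant) the negative gradient of the squared error for one point, not an exact solve for the error; dividing by x, as the reply suggests, would zero the error for that single point while ignoring all the others. Below is a minimal sketch of that reasoning with the gradient spelled out in comments. It reuses the variable names from the snippet quoted above and is an illustration, not the video's exact code:

    let m = 0;
    let b = 0;
    const learning_rate = 0.05;

    // One stochastic gradient-descent step for a single (x, y) point.
    function gradientStep(x, y) {
      let guess = m * x + b;   // y_guess = m * x + b
      let error = y - guess;   // error = y - y_guess

      // Squared loss for this point: E = (y - (m * x + b))^2
      //   dE/dm = -2 * error * x
      //   dE/db = -2 * error
      // Stepping a small amount against the gradient gives changes proportional
      // to error * x and error; the factor of 2 is absorbed by the learning rate.
      m = m + (error * x) * learning_rate;
      b = b + error * learning_rate;
    }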

  • @rajcuthrapali800
    @rajcuthrapali800 7 years ago

    you are like my coding guru
    lol thanks so much mr dan for your help!

  • @varalakshmi3932
    @varalakshmi3932 4 years ago

    Great videos! You are good at making videos by just being yourself and explaining in the best way possible. :))

  • @lakeguy65616
    @lakeguy65616 6 years ago +1

    Velocity in this example doesn't mean speed, but instead means heading?

  • @amitbansode
    @amitbansode 6 years ago

    All videos by you are rocking

  • @jairajsahgal5062
    @jairajsahgal5062 4 years ago

    you are a good man. thank u

  • @vengalraochowdary4712
    @vengalraochowdary4712 4 years ago

    Really superb explanation of gradient descent. Is there any book that you refer to or would suggest for machine learning?

  • @junaid1464
    @junaid1464 7 years ago

    wonderful. nobody can teach better than you.

  • @matteoveraldi.musica
    @matteoveraldi.musica 6 years ago

    you're the boss. Very good explanation, loved it!

  • @thehappycoder3760
    @thehappycoder3760 2 years ago

    Very helpful

  • @gozumetaklanlar9274
    @gozumetaklanlar9274 7 years ago

    Hi Dan, great video. I have watched most of your videos and I would be glad if you could make a video about addEventListener and its advantages and disadvantages over onclick, onblur, onmouseover... thank you in advance.

  • @nnmrts
    @nnmrts 7 years ago +27

    Hey Dan! I really like your videos, but sometimes you seem so lonely in that studio. :D
    Wouldn't something like a co-op coding challenge be awesome?

    • @TheCodingTrain
      @TheCodingTrain 7 years ago +18

      Hah, love this idea!

    • @BinaryReader
      @BinaryReader 7 years ago +2

      great stuff Dan, this stuff is invaluable for anyone starting out in ML. top stuff.

    • @stefanoslalic2199
      @stefanoslalic2199 6 years ago

      can you host me?

  • @r.d.machinery3749
    @r.d.machinery3749 5 years ago

    For a more complete and in-depth discussion of linear regression with gradient descent, check out Stanford professor Andrew Ng's series of machine learning videos: ruclips.net/video/PPLop4L2eGk/видео.html

  • @nikhilnambiar7160
    @nikhilnambiar7160 5 years ago

    Make a video on lasso regression without a library, as you did for linear regression.

  • @PatrickPissurno
    @PatrickPissurno 6 years ago

    You're really amazing! Thank you so much. Really enjoyed the way you explain things.

  • @jadrima8640
    @jadrima8640 7 years ago

    Nice tutorial channel!

  • @renelalla7799
    @renelalla7799 6 years ago +1

    Thank you for your awesome and easy to understand explanations! :) But I have a question regarding the code from 18:08
    Why can we see the line moving instead of being just in its final position? So far, as I can see it in the code, the drawline() method is called after the gradientDescent() method. What am I missing here?
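
For anyone wondering the same thing about the animation: in a p5.js sketch, draw() runs over and over (roughly 60 times per second), and each call to gradientDescent() only nudges m and b by a small amount, so every frame draws the line at its current, slightly improved position. A rough sketch of that structure; the function names follow the comment above and the bodies are an assumption, not the video's exact code:

    function draw() {
      background(51);
      drawDataPoints();        // hypothetical helper that plots the clicked points

      if (data.length > 1) {
        gradientDescent();     // one small nudge to m and b per frame
        drawLine();            // draw the line where it currently is
      }
    }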

  • @massadian75
    @massadian75 7 years ago +1

    Very interesting !

  • @dukestt
    @dukestt 7 years ago +1

    It worked yay haha. I was waiting for it. I was watching at the time though.

  • @jasdeepsinghgrover2470
    @jasdeepsinghgrover2470 7 years ago

    GREAT Video .. Thanks a Lot

    • @jasdeepsinghgrover2470
      @jasdeepsinghgrover2470 7 years ago

      Had a small doubt: shouldn't the change in slope be error/x instead of error*x, as it is rise/run?

  • @Bena_Gold
    @Bena_Gold 6 years ago +1

    That "come back to me" ... hahahahaha

  • @adaptine
    @adaptine 7 years ago

    What you're describing here is effectively a Kalman filter?

  • @TheNikhilmishras
    @TheNikhilmishras 7 years ago

    Great videos! :D You are the best!
    Do you recommend going with "Intelligence and Learning" sessions after p5.js introduction for someone who wants to get into Machine learning?

  • @mohammadpatel2315
    @mohammadpatel2315 3 years ago

    The concepts in this are very similar to the perceptron model

  • @OneShot_cest_mieux
    @OneShot_cest_mieux 7 years ago

    Hello, there is a translation of your description and your title into French. I live in France and I can't disable this; how do I do that, please?

  • @sanjayshr1921
    @sanjayshr1921 7 years ago

    Awesome

  • @himannamdari7375
    @himannamdari7375 6 years ago

    I love this video. Great, thanks! Nice logo on your shirt.

  • @frisosmit8920
    @frisosmit8920 7 years ago

    Maybe it would be cool if you made an AI for a simple game like noughts and crosses with a minimax algorithm

  • @adammontgomery7980
    @adammontgomery7980 5 years ago

    Would you have two separate learning rates for m and b? Seems like weighting the slope change higher could be beneficial.

  • @8eck
    @8eck 4 years ago

    So the so-called steer is the delta of the weights? That is, the change of the weights in each iteration/epoch?

  • @williamobeng4703
    @williamobeng4703 6 years ago

    Well explained. It would be nice to see the code; can't find it on GitHub.

    • @TheCodingTrain
      @TheCodingTrain 6 years ago +1

      github.com/CodingTrain/website/tree/master/Courses/intelligence_learning/session3
      (Need to figure out a way for things to be more findable!)

  • @tigerspidey123
    @tigerspidey123 1 year ago

    Would it be possible to apply a PID control scheme to the learning rate, so it accelerates the learning process?

  • @hunarahmad
    @hunarahmad 6 years ago

    Your snap has inspired Thanos :D

  • @YauheniKisialiou
    @YauheniKisialiou 7 years ago

    Hey! Great Video! But...how is it possible that the line is self adjusting...According to the code..

  • @gonengazit
    @gonengazit 7 years ago

    hey, nice video. could you explain why you normalize the values between 0 and 1 and what it does? i tried not normalizing them and i got some really wacky results using gradient descent even though it worked fine with the Ordinary Least Squares method. do you know why that happens?

    • @gonengazit
      @gonengazit 7 years ago

      Julian atlasovich but it didn't work without normalization
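
A hedged explanation for the divergence described in this thread: with raw pixel coordinates (x in the hundreds), the error * x term becomes huge, so a learning rate tuned for [0, 1] data overshoots on every step and the line blows up; mapping the data into [0, 1] keeps the updates on a consistent scale. The closed-form ordinary least squares solution has no learning rate, which is why it is unaffected. A sketch of the kind of normalization meant here (map() and createVector() are p5.js; the data array and the 0..width range are assumptions about the sketch):

    // Map a raw pixel coordinate into [0, 1] before storing it,
    // and only convert back to pixels when drawing.
    let x = map(mouseX, 0, width, 0, 1);
    let y = map(mouseY, 0, height, 1, 0);  // flip so larger y means "up"
    data.push(createVector(x, y));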

  • @8eck
    @8eck 4 years ago

    So the steer on the graph, it would be vertical line between Yguess and Yactual as a difference?

  • @yogeshpandey9549
    @yogeshpandey9549 4 years ago

    Would you please elaborate on the implementation of the gradient descent algorithm using the vectorization method in Python?

    • @TheCodingTrain
      @TheCodingTrain 4 years ago

      Our Coding Train Discord is a great place to get help with coding questions ! discord.gg/hPuGy2g
      - The Coding Train Team

  • @tejasdevgekar1154
    @tejasdevgekar1154 7 years ago

    Really rookie right now... Gotta progress fast!

  • @lucafilippini1348
    @lucafilippini1348 7 years ago

    PID ? As always thx Dan...

  • @bosepukur
    @bosepukur 7 years ago

    cool video

  • @arijitdebnath4480
    @arijitdebnath4480 6 years ago +1

    Why do you multiply error * x by the learning_rate?

  • @vishwajeetsingh6766
    @vishwajeetsingh6766 6 years ago +1

    Can someone explain why this is correct ?
    m = m + (error * x) * learning rate;
    I mean how is it dimensionally correct ? Shouldn't error be divided by x so that m can be added to something that is of type m.

    • @michaelho9388
      @michaelho9388 5 years ago +1

      I agree with you, I feel confused at this part as well.

    • @nkemer
      @nkemer 5 years ago

      yep I do not understand either.

    • @nkemer
      @nkemer 5 years ago

      oh it is in the next video.

    • @PeterGObike
      @PeterGObike 4 years ago

      The derivative from first principles allows for that: 2 * error * x = (error * x) * 2 (or times any small number epsilon).
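
One way to settle the dimensional worry in this thread: the update is not meant to be dimensionally exact; constants (and any units) get absorbed into the learning rate. For a single point with squared loss, a short sketch of the derivation, with \eta standing for the learning rate:

    E = \big(y - (mx + b)\big)^2
    \frac{\partial E}{\partial m} = -2\,\big(y - (mx + b)\big)\,x = -2\,\mathrm{error}\cdot x
    \frac{\partial E}{\partial b} = -2\,\mathrm{error}

    \Delta m = -\eta\,\frac{\partial E}{\partial m} \propto \mathrm{error}\cdot x, \qquad
    \Delta b = -\eta\,\frac{\partial E}{\partial b} \propto \mathrm{error}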

  • @stefanoslalic2199
    @stefanoslalic2199 6 years ago +1

    But once again I don't understand how the NN knows what the desired output is.
    You are calculating the loss function based on a desired output that you explicitly write into the system?
    If you explicitly write out the numbers, you are more or less telling the neural network what to do;
    isn't the whole concept of a neural network to find the way by itself?

    • @TheCodingTrain
      @TheCodingTrain 6 years ago +1

      Apologies for not making this clear. The technique I'm applying is called "supervised learning" where you have a set of training data with known outputs! The neural network learns how to reproduce the correct results with the known outputs so that it can (hopefully) produce the correct results also with data that doesn't have the answers paired with it. I think I cover this more in my 10.x neural network series.

    • @stefanoslalic2199
      @stefanoslalic2199 6 years ago

      Got it!
      Thank you for your time and determination, see you in next episode (:

  • @anuraglahon8572
    @anuraglahon8572 6 years ago +1

    Where is the code? I am not finding it on GitHub.

  • @charbelsarkis3567
    @charbelsarkis3567 6 years ago

    I would love to see the snapping of the fingers live :pp

  • @blackdedo93
    @blackdedo93 7 years ago

    Hey, thanks for the awesome video. I don't understand: why not calculate the correct line directly?

    • @DaSodaPopCop
      @DaSodaPopCop 6 years ago

      The reason for this is because he is not simply writing a program that finds the correct line. He specifically is writing this program in such a way that implements and showcases the idea of back propagation. Calculating the line directly would be the most efficient way to write this program, but that's not the point of the video. There will be instances with much higher dimensional data where prognostication is much more efficient than doing what you suggest, such as in a Neural Network.

    • @DaSodaPopCop
      @DaSodaPopCop 6 years ago

      look at 19:14 for his explanation

    • @blackdedo93
      @blackdedo93 6 years ago

      Makes sense, thanks.
      But can you give examples or a reference on why I would need this learning process?

  • @aasimbaig01
    @aasimbaig01 6 years ago

    0:05 Hahahah my complete life in 1 question

  • @sreekrishnanr1812
    @sreekrishnanr1812 6 years ago

    I think you are awesome 😊😊

  • @ac11dc110
    @ac11dc110 5 years ago

    what is the best book for machine learning?

  • @iftikhar58
    @iftikhar58 2 years ago

    Is the cost function in this video mean squared error?

  • @iftikhar58
    @iftikhar58 2 years ago

    Brother, do you have any Slack channel or Discord?

  • @souravsarkar5724
    @souravsarkar5724 5 years ago

    Dear sir, if you can give any suggestion to help understand the formula "DELTA_m = error * x", I will be very grateful.

  • @jeremyheminger6882
    @jeremyheminger6882 5 years ago

    So, in a way, anyone who's made a game with some basic NPC characters has dealt with gradient descent. Example: trying to get the spaceship to turn and chase the player.

    • @jeremyheminger6882
      @jeremyheminger6882 5 years ago

      I had previously heard it described as a Minecraft player walking downhill to find a treasure.

  • @manojdude3
    @manojdude3 5 years ago

    2:41 Thanos of blackboard writings.

  • @Christian-mn8dh
    @Christian-mn8dh 5 years ago

    I tried to build a gradient descent algorithm from scratch. Why isn't mine working? Here's my code:

    for i in range(4):
        ypred = m * x + b
        error = (ypred - y) ** 2
        m = m - (0.001 * error)
        b = b - (0.001 * error)
        m = m.sum()
        b = b.sum()
    # My 'm' and 'b' values decrease infinitely

  • @k3ck3m3n
    @k3ck3m3n 7 years ago

    I don't get your point on why we should switch to gradient descent. If you think about multidimensional models, your linear regression looks like
    Y = X*b + e
    where Y is a vector, X is a design matrix, b is some vector, and e ('error') is some random vector with mean(e) = 0 and covariance matrix A.
    Now if A is invertible and X fulfills nice enough conditions, then there exists a least-squares estimator for b, and hence we would get our line, which fits the data the best.
    So the reason why we are doing gradient descent is that computing inverse matrices is pretty shitty? Or what's the point?
    P.S. RUclips comments should support LaTeX :D

    • @troatie
      @troatie 7 years ago

      I think this is a stepping stone to non-linear optimization. It makes the example simple to just apply it to linear regression.

    • @k3ck3m3n
      @k3ck3m3n 7 years ago

      Yes, I see what he is going to do with that. But his argument for why we would do it was a little bit sloppy.
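
For readers following this thread: the closed-form answer being referred to is the ordinary least squares (normal equation) solution, which gradient descent trades away mainly for scalability and generality. Using the notation in the comment above (Y = X*b + e), the estimator is

    \hat{b} = (X^\top X)^{-1} X^\top Y

Forming and inverting X^T X costs roughly O(n d^2 + d^3) for n samples and d features, while one pass of gradient descent is O(n d), and the same iterative update carries over to models (such as neural networks) that have no closed-form solution, which is where this series is headed.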

  • @adarshr3490
    @adarshr3490 5 years ago

    You did the snap even before Thanos did =D

  • @ImtithalSaeed
    @ImtithalSaeed 6 years ago

    Why did you say x = data[i]*x and y = data[i]*y at 12:20?

  • @SuperRueful
    @SuperRueful 6 years ago

    I'm sure this video could be condensed without losing any real info. I don't have the patience to see it through halfway.

  • @xzencombo3400
    @xzencombo3400 7 years ago

    How old are you, and how old were you when you started programming?