Regularization Part 2: Lasso (L1) Regression

  • Published: 22 Aug 2024

Comments • 647

  • @statquest
    @statquest  2 years ago +12

    If you want to see why Lasso can set parameters to 0 and Ridge cannot, check out: ruclips.net/video/Xm2C_gTAl8c/видео.html
    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

  • @hughsignoriello
    @hughsignoriello 2 years ago +16

    Love how you keep these videos introductory and don't go into the heavy math right away, which would only confuse things;
    Love the series!

  • @citypunter1413
    @citypunter1413 5 years ago +68

    One of the best explanations of Ridge and Lasso regression I have seen to date... Keep up the good work.... Kudos!!!

  • @marisa4942
    @marisa4942 2 years ago +38

    I am eternally grateful to you and these videos!! They really save me time preparing for exams!!

  • @admw3436
    @admw3436 5 years ago +15

    My teacher is 75 years old and explained Lasso to us for an hour without actually explaining it. But this is a war I can win :), thanks to your efforts.

    • @statquest
      @statquest  5 years ago +3

      I love it!!! Glad my video is helpful! :) p.s. I got the joke too. Nice! ;)

    • @ak-ot2wn
      @ak-ot2wn 4 years ago

      Why is this scenario so often the reality? Also, I check StatQuest's vids very often to really understand things. Thanks @StatQuest

  • @JeanOfmArc
    @JeanOfmArc 2 months ago +9

    (Possible) Fact: 78% of people who understand statistics and machine learning attribute their comprehension to StatQuest.

  • @qiaomuzheng5800
    @qiaomuzheng5800 2 years ago +11

    Hi, I can't thank you enough for explaining the core concepts in such a short amount of time. Your videos help a lot! My appreciation is beyond words.

  • @chrisg0901
    @chrisg0901 5 years ago +22

    Don't think your Monty Python reference went unnoticed
    (Terrific and very helpful video, as always)

    • @statquest
      @statquest  5 years ago +2

      Thanks so much!!! :)

    • @ajha100
      @ajha100 4 years ago +1

      Oh it absolutely did. And it was much loved!

  • @patrickwu5837
    @patrickwu5837 4 years ago +26

    That "Bam???" cracks me up. Thanks for your work!

  • @Phobos11
    @Phobos11 5 years ago +263

    Good video, but it didn't really explain how LASSO gets to make a variable zero. What's the difference between squaring a term and using the absolute value for that?

    • @statquest
      @statquest  5 years ago +140

      Intuitively, as the slope gets close to zero, its square becomes insignificant compared to the increase in the sum of squared errors, so Ridge stops shrinking before the slope reaches zero: the squared penalty gets asymptotically close to 0 and can never outweigh the increase in the sum of squared errors. In contrast, the absolute value keeps adding a fixed amount to the regularization penalty for every unit of slope, so it can overcome the increase in the sum of squared errors and push the slope all the way to 0.
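
      A minimal numeric sketch of that intuition (Python with numpy; the one-slope objective and the numbers are illustrative assumptions, not the video's example):

        import numpy as np

        beta = np.linspace(-2, 2, 40001)   # candidate values for a single slope
        a, lam = 0.3, 1.0                  # least-squares optimum and penalty weight

        sse = (beta - a) ** 2              # stand-in for the sum of squared residuals
        ridge = sse + lam * beta ** 2      # L2 penalty: flattens out near 0
        lasso = sse + lam * np.abs(beta)   # L1 penalty: fixed rate all the way to 0

        print(beta[np.argmin(ridge)])      # ~0.15, i.e. a / (1 + lam): shrunk, never exactly 0
        print(beta[np.argmin(lasso)])      # 0.0, i.e. sign(a) * max(|a| - lam/2, 0)

      For this 1-D objective the Ridge minimum is a / (1 + lam), nonzero for any finite lam, while the Lasso soft threshold snaps to exactly 0 whenever |a| <= lam/2.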

    • @statquest
      @statquest  5 years ago +34

      @@theethatanuraksoontorn2517 Maybe this discussion on stack-exchange will clear things up for you: stats.stackexchange.com/questions/151954/sparsity-in-lasso-and-advantage-over-ridge-statistical-learning

    • @programminginterviewprep1808
      @programminginterviewprep1808 5 years ago +25

      @@statquest Thanks for reading the comments and responding!

    • @statquest
      @statquest  5 years ago +29

      @@programminginterviewprep1808 I'm glad to help. :)

    • @Phobos11
      @Phobos11 5 years ago +9

      @@statquest I didn't reply before, but the answer really helped me a lot, first with basic machine learning and now with artificial neural networks. Thank you very much for the videos and the replies :D

  • @perrygogas
    @perrygogas 5 years ago +173

    Some video ideas to better explain the following topics:
    1. Monte Carlo experiments
    2. Bootstrapping
    3. Kernel functions in ML
    4. Why ML is a black box

    • @statquest
      @statquest  5 years ago +89

      OK. I'll add those to the to-do list. The more people ask for them, the higher priority they will get.

    • @perrygogas
      @perrygogas 5 years ago +5

      @@statquest That is great! keep up the great work!

    • @gauravms6681
      @gauravms6681 5 years ago +3

      @@statquest yes we need it, please do! plsssssssss

    • @InfinitesimallyInfinite
      @InfinitesimallyInfinite 5 years ago +10

      Bootstrapping is explained well in the Random Forest video.

    • @miguelsaravia8086
      @miguelsaravia8086 5 years ago +1

      Do it for us... thanks good stuff

  • @Jenna-iu2lx
    @Jenna-iu2lx 1 year ago +2

    I am so happy to easily understand these methods after only a few minutes (after spending so many hours studying without really understanding what it was about). Thank you so much, your videos are incredibly helpful! 💯☺

  • @Jan-oj2gn
    @Jan-oj2gn 5 years ago +2

    This channel is pure gold. This would have saved me hours of internet search... Keep up the good work!

  • @gonzaloferreirovolpi1237
    @gonzaloferreirovolpi1237 5 years ago +3

    Hi man, I really LOVE your videos. Right now I'm studying data science and machine learning, and more often than not your videos are the light at the end of the tunnel, so thanks!

  • @anuradhadas8795
    @anuradhadas8795 4 years ago +37

    The difference between BAM??? and BAM!!! is hilarious!!

    • @statquest
      @statquest  4 years ago +1

      :)

    • @SaiSrikarDabbukottu
      @SaiSrikarDabbukottu 10 months ago

      @@statquest Can you please explain how the irrelevant parameters "shrink"? How does Lasso go to zero when Ridge doesn't?

    • @statquest
      @statquest  10 months ago

      @@SaiSrikarDabbukottu I show how it all works in this video: ruclips.net/video/Xm2C_gTAl8c/видео.html

  • @takedananda
    @takedananda 4 years ago +1

    Came here because I didn't understand it at all when my professor lectured about LASSO in my university course... I have a much better understanding now thank you so much!

    • @statquest
      @statquest  4 years ago

      Awesome!! I'm glad the video was helpful. :)

  • @alexei.domorev
    @alexei.domorev 1 year ago +2

    Josh - as always your videos are brilliant in their simplicity! Please keep up your good work!

  • @markparee99
    @markparee99 5 years ago +1

    Every time I think your video subject is going to be daunting, I find your explanation dispels that thought pretty quickly. Nice job!

  • @ayush612
    @ayush612 5 years ago +1

    Yeahhhh!!! I was the first to express gratitude to Josh for this awesome video!! Thanks Josh for posting this, and man, your channel is growing... 4 months ago it was 12k. You have the better stats ;)

    • @statquest
      @statquest  5 years ago +1

      Hooray! Yes, the channel is growing and that is very exciting. It makes me want to work harder to make more videos as quickly as I can. :)

    • @akashdesarda5787
      @akashdesarda5787 5 years ago

      @@statquest please keep on going... You are our saviour

  • @naomichin5347
    @naomichin5347 2 years ago +1

    I am eternally grateful to you. You've helped immensely with my last assessment at uni to finish my bachelor's.

    • @statquest
      @statquest  2 years ago +1

      Congratulations!!! I'm glad my videos were helpful! BAM! :)

  • @sanyuktasuman4993
    @sanyuktasuman4993 4 years ago +8

    Your intro songs remind me of Phoebe from the TV show "Friends", and the songs are amazing for starting the videos on a good note, cheers!

    • @statquest
      @statquest  4 years ago +2

      You should really check out the intro song for this StatQuest: ruclips.net/video/D0efHEJsfHo/видео.html

  • @clementbourgade2487
    @clementbourgade2487 3 years ago +2

    NOBODY IS GOING TO TALK ABOUT THE EUROPEAN / AFRICAN SWALLOW REFERENCE???? Are you all dummies or something? It made my day. Plus, the video is top notch, congratulations. BAMM!

  • @kitkitmessi
    @kitkitmessi 2 years ago +2

    Airspeed of swallow lol. These videos are really helping me a ton, very simply explained and entertaining as well!

  • @ajha100
    @ajha100 4 years ago +3

    I really appreciated the inclusion of swallow airspeed as a variable above and beyond the clear-cut explanation. Thanks Josh. ;-)

  • @alecvan7143
    @alecvan7143 4 years ago +13

    The beginning songs are always amazing hahaha!!

  • @jasonyimc
    @jasonyimc 4 years ago +5

    So easy to understand. And I like the double BAM!!!

  • @atiqkhan7803
    @atiqkhan7803 5 years ago +1

    This is brilliant. Thanks for making it publicly available

  • @hareshsuppiah9899
    @hareshsuppiah9899 3 years ago +2

    Statquest is like Marshall Eriksen from HIMYM teaching us stats. BAM? Awesome work Josh.

  • @praveerparmar8157
    @praveerparmar8157 3 years ago +1

    Just love the way you say 'BAM?'.....a feeling of hope mixed with optimism, anxiety and doubt 😅

  • @joaocasas4
    @joaocasas4 1 year ago +1

    My friend and I are studying together. When the first BAM came, we laughed for about 5 minutes. Then the DOUBLE BAM would have caused catastrophic laughter if we hadn't stopped it. I want you to be my professor, please!

  • @tymothylim6550
    @tymothylim6550 3 years ago +2

    Thank you, Josh, for this exciting and educational video! It was really insightful to learn both the superficial difference (i.e. how the coefficients of the predictors are penalized) and the significant difference in terms of application (i.e. some useless predictors may be excluded through Lasso regression)!

  • @hsinchen4403
    @hsinchen4403 4 years ago +2

    Thank you so much for the video!
    I have watched several of your videos, and I prefer to watch your video first and then look at the real math formulas. When I do that, the formulas become much easier to understand!
    For instance, I didn't even know what a 'norm' was, but after watching your video it was very easy to understand!

    • @statquest
      @statquest  4 years ago

      Awesome! I'm glad the videos are helpful. :)

  • @add6911
    @add6911 1 year ago +1

    Excellent video Josh! An amazing way to explain statistics. Thank you so much! Regards from Querétaro, México

  • @user-ur2en1zq4f
    @user-ur2en1zq4f 2 years ago +1

    Great people notice subtle differences that are not visible to common eyes.
    Love you, sir!

  • @luispulgar7515
    @luispulgar7515 5 years ago +1

    Bam! I appreciate the pace of the videos. Thanks for doing this.

  • @TeXtersWS
    @TeXtersWS 5 years ago +1

    Explained in a very simple yet very effective way! Thank you for your contribution Sir

    • @statquest
      @statquest  5 years ago

      Hooray! I'm glad you like my video. :)

  • @RussianSUPERHERO
    @RussianSUPERHERO 2 years ago +1

    I came for the quality content, fell in love with the songs and bam.

  • @AnaVitoriaRodriguesLima
    @AnaVitoriaRodriguesLima 4 years ago +2

    Thanks for posting, my new favourite youtube channel absolutely !!!!

  • @ryanzhao3502
    @ryanzhao3502 5 years ago +2

    Thanks very much. A clear explanation of these similar models. A great video I will keep forever.

  • @arpitqw1
    @arpitqw1 5 years ago +3

    Why can't Ridge reduce a weight/parameter to 0 like Lasso can?

  • @joshuamcguire4832
    @joshuamcguire4832 4 years ago +1

    a man of his word...very clearly explained!

  • @TheBaam100
    @TheBaam100 4 years ago +1

    Thank you so much for making these videos! I had to give a presentation about LASSO at university.

    • @statquest
      @statquest  4 years ago +1

      I hope the presentation went well! :)

    • @TheBaam100
      @TheBaam100 4 years ago +1

      @@statquest Thx. It did :)

  • @kyoosik
    @kyoosik 5 years ago

    The other day, I had homework to write about Lasso and I struggled.. wish I had seen this video a few days earlier.. Thank you as always!

  • @mrknarf4438
    @mrknarf4438 4 years ago +2

    Great video, clear explanation, loved the Swallows reference! Keep it up! :)

    • @statquest
      @statquest  4 years ago +1

      Awesome, thank you!

  • @pratiknabriya5506
    @pratiknabriya5506 4 years ago +4

    A StatQuest a day, keeps Stat fear away!

  • @simsim2159
    @simsim2159 3 years ago +1

    Your videos make it so easy to understand. Thank you!

  • @PythonArms
    @PythonArms 11 months ago

    Harvard should hire you. Your videos never fail me!
    Thank you for such great content!

    • @statquest
      @statquest  11 months ago

      Thank you very much!!!

  • @mahajanpower
    @mahajanpower 4 years ago +1

    Hi Josh! I am a big fan of your videos, and they are clearly the best way to learn machine learning. I would like to ask whether you will be uploading videos on deep learning and NLP as well. If so, that would be awesome. BAM!!!

    • @statquest
      @statquest  4 years ago +1

      Right now I'm finishing up Support Vector Machines (one more video), then I'll do a series of videos on XGBoost and after that I'll do neural networks and deep learning.

    • @mahajanpower
      @mahajanpower 4 years ago +1

      StatQuest with Josh Starmer Thanks Josh for the updates. I'll send you a request on LinkedIn.

  • @cloud-tutorials
    @cloud-tutorials 5 years ago +1

    One more use case for Ridge/Lasso regression: 1) when there are few data points, and 2) when there is high multicollinearity between the variables.
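
    A hedged sketch of the multicollinearity point, assuming scikit-learn (the synthetic data, seed, and alpha are made-up illustrations):

      import numpy as np
      from sklearn.linear_model import LinearRegression, Ridge

      rng = np.random.default_rng(0)
      x = rng.normal(size=(20, 1))                              # few data points
      X = np.hstack([x, x + 0.01 * rng.normal(size=(20, 1))])   # two nearly collinear predictors
      y = 3 * x[:, 0] + 0.1 * rng.normal(size=20)

      print(LinearRegression().fit(X, y).coef_)   # typically large, offsetting coefficients
      print(Ridge(alpha=1.0).fit(X, y).coef_)     # stabilized, roughly splitting the effect

    The ridge penalty discourages the huge positive/negative coefficient pairs that plain least squares can produce when two predictors carry nearly the same information.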

  • @gregnelson8148
    @gregnelson8148 5 years ago +1

    You have a gift for teaching! Excellent videos!

  • @stefanomauceri
    @stefanomauceri 5 years ago +3

    I prefer the intro where it is firmly claimed that StatQuest is bad to the bone. And yes, I think this is fundamental.

    • @statquest
      @statquest  5 years ago

      That’s one of my favorite intros too! :)

    • @statquest
      @statquest  5 years ago

      But I think my all time favorite is the one for LDA.

    • @stefanomauceri
      @stefanomauceri 5 years ago +1

      Yes I agree! Together these two could be the StatQuest manifesto summarising what people think about stats!

    • @statquest
      @statquest  5 years ago

      So true!

  • @tiborcamargo5732
    @tiborcamargo5732 5 years ago +5

    That Monty Python reference though... good video btw :)

    • @statquest
      @statquest  5 years ago

      Ha! I'm glad you like the video. ;)

  • @Azureandfabricmastery
    @Azureandfabricmastery 4 years ago +1

    Hi Josh, thanks for the clear explanation of regularization techniques. Very exciting. God bless you for your efforts.

    • @statquest
      @statquest  4 years ago

      Glad you enjoyed it!

  • @Endocrin-PatientCom
    @Endocrin-PatientCom 5 years ago +1

    Incredibly great explanations of regularization methods, thanks a lot.

  • @khanhtruong3254
    @khanhtruong3254 5 years ago +2

    Hi. Your videos are so helpful. I really appreciate the time you spend making them.
    I have one question related to this video: is the result of Lasso regression sensitive to the units of the variables?
    For example, in the model: size of mice = B0 + B1*weight + B2*High Fat Diet + B3*Sign + B4*AirSpeed + epsilon
    Suppose the original unit of weight in the data is grams. If we divide the weight by 1,000 to get the unit in kilograms, is the Lasso regression different?
    As I understand it, the least-squares estimate B1-kilogram should be 1,000 times larger than B1-gram. Therefore, B1-kilogram is more likely to vanish in Lasso, isn't it?
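
    A small sketch of this question, assuming scikit-learn (the data, seed, and alpha are arbitrary; only the gram-to-kilogram rescaling matters):

      import numpy as np
      from sklearn.linear_model import Lasso
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(1)
      weight_g = rng.normal(25, 5, size=100)                 # weight in grams
      size = 0.02 * weight_g + rng.normal(0, 0.5, size=100)

      X_g = weight_g.reshape(-1, 1)                  # grams
      X_kg = X_g / 1000                              # kilograms: B1 must be 1,000x larger
      print(Lasso(alpha=0.1).fit(X_g, size).coef_)   # survives, only slightly shrunk
      print(Lasso(alpha=0.1).fit(X_kg, size).coef_)  # zeroed out here: yes, unit-sensitive

      X_std = StandardScaler().fit_transform(X_g)    # standardizing makes the penalty unit-free
      print(Lasso(alpha=0.1).fit(X_std, size).coef_)

    So the penalty does depend on each predictor's units, which is why predictors are usually standardized before fitting Lasso or Ridge.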

  • @hanadiam8910
    @hanadiam8910 2 years ago +1

    Million BAM for this channel 🎉🎉🎉

  • @flyingdutchmanSimon
    @flyingdutchmanSimon 3 years ago +1

    Seriously the best videos ever!!

  • @luisakrawczyk8319
    @luisakrawczyk8319 5 years ago +7

    How do Ridge or Lasso know which variables are useless? Won't they also shrink the parameters of important variables?

    • @suriahselvam9066
      @suriahselvam9066 5 years ago +1

      I am also looking for the answer to this. I'm just using my intuition here, but here's what I think. The least important variables have terrible predictive value, so the residuals along those dimensions are the highest. If we create a penalty for including these variables (especially with a lambda that is large, or similar in magnitude, compared to the squared residuals), decreasing the coefficient of a "bad predictor" causes a comparatively small increase in residuals relative to the decrease in the penalty, because of the randomness of that predictor. In contrast, decreasing the coefficient of a "good predictor" (which is less random) causes a significant increase in residuals, so that coefficient only has to change a little to offset the penalty. This is why the minimisation reduces the coefficients of "bad predictors" faster than those of "good predictors". I take it this would be especially true when cross-validating.

    • @orilong
      @orilong 4 years ago

      If you draw the curves of y=|x| and y=x^2, you will find that the gradient of y=x^2 vanishes near the origin, so the coefficient is very hard to push all the way to zero with an optimization approach like SGD, while the gradient of y=|x| stays constant.

  • @corneliusschramm5791
    @corneliusschramm5791 5 years ago +1

    Dude you are an absolute lifesaver! keep it up!!!

    • @statquest
      @statquest  5 years ago

      Hooray! I'm glad I could help. :)

  • @ainiaini4426
    @ainiaini4426 2 years ago +1

    Hahaha.. That moment you said BAM??? I laughed out loud 🤣🤣🤣

  • @abdulazizalhaidari7665
    @abdulazizalhaidari7665 3 months ago +1

    Great work, thank you Josh.
    I'm trying to connect ideas from different perspectives/angles. Is the lambda here somehow related to a Lagrange multiplier?

  • @yuzaR-Data-Science
    @yuzaR-Data-Science 5 years ago +2

    Thanks a lot! Amazing explanation! Please continue the great work and add more on statistics and probability in general and machine learning in particular. Since data science is supposed to have a great future, I am certain that your channel will also prosper a great deal!

  • @rishabhkumar-qs3jb
    @rishabhkumar-qs3jb 3 years ago +1

    Amazing video, explanation is fantastic. I like the song along with the concept :)

  • @davidmantilla1899
    @davidmantilla1899 2 years ago +1

    Best YouTube channel

  • @kd1415
    @kd1415 3 years ago +1

    Love the work. I remember reading books about linear regression that spent like 5 pages on these 2 topics, and I still had no clue what they really do =))

    • @statquest
      @statquest  3 years ago

      Glad it was helpful!

    • @kd1415
      @kd1415 3 years ago +1

      Love the fact that you reply to every single comment here on YT haha

  • @pomegranate8593
    @pomegranate8593 1 year ago +1

    me: watching these videos in full panic
    video: plays calming music
    me: :)

  • @shyamparmar983
    @shyamparmar983 5 years ago +1

    I'm sorry, but I can't figure out why (regardless of the approach, Ridge or Lasso) the 'good' parameters, slope and diet difference, behave differently from the other two silly ones. I don't understand this, since you apply the same lambda and absolute value to all 4 parameters. It'd be really kind of you to clear up my silly doubt. Thanks!

  • @apekshaagrawal6696
    @apekshaagrawal6696 4 years ago +1

    Thanks for the videos. They make difficult concepts seem really easy.

  • @pelumiobasa3104
    @pelumiobasa3104 4 years ago +1

    This is awesome, thank you so much; you explained it so well. I will recommend this video to everyone I know who is interested. I also watched your Lasso video and it was just as good. Thank you!

    • @statquest
      @statquest  4 years ago

      Thank you very much! :)

  • @RenoyZachariah
    @RenoyZachariah 2 years ago +1

    Amazing explanation. Loved the Monty Python reference :D

  • @whispers191
    @whispers191 2 years ago +1

    Thank you once again Josh!

  • @TheHerbert4321
    @TheHerbert4321 4 years ago

    I love your style of explaining! You leave enough time to take in all the information as you talk. Sometimes it feels like you are trying to teach little kids, but it just works. I often watch other teaching videos and can't remember most of them afterwards, but I can remember almost everything you say after the first viewing. Amazing job!
    I have one question though. You said that the regression model at the beginning had low bias and high variance. Does it not have high bias? As far as I know, bias represents the expected generalization (or test) error if we were to fit a very large training set. If we fit that simple model to a lot of data, the generalization error would be rather high, because it could not capture the true patterns in the data.

    • @statquest
      @statquest  3 years ago

      I'm glad you like the videos! In ML, there are specific meanings for bias and variance that are a little bit different from what you are using and I explain in this StatQuest: ruclips.net/video/EuBBz3bI-aA/видео.html

  • @emmanueluche3262
    @emmanueluche3262 1 year ago +1

    Wow! so easy to understand this! Thanks very much!

  • @longkhuong8382
    @longkhuong8382 5 years ago +1

    Hooray!!!! excellent video as always
    Thank you!

    • @statquest
      @statquest  5 years ago +1

      Hooray, indeed!!!! Glad you like this one! :)

  • @rezaroshanpour971
    @rezaroshanpour971 8 months ago +1

    Great.... please continue with other models... thank you so much.

  • @abdullahmoiz8151
    @abdullahmoiz8151 5 years ago +1

    Brilliant explanation;
    didn't need to check out any other video.

  • @arthurus8374
    @arthurus8374 2 years ago +1

    so incredible, so well explained

  • @sophie-ev1mr
    @sophie-ev1mr 4 years ago

    Thank you so much for these videos you are a literal godsend. You should do a video on weighted least squares!!

  • @zebralemon
    @zebralemon 1 year ago +1

    I enjoy the content and your jam so much! '~Stat Quest~~'

  • @ninakumagai22
    @ninakumagai22 5 years ago +1

    My favourite youtuber!

  • @adwindtf
    @adwindtf 4 years ago +1

    Love your videos.... extremely helpful and crystal clear.... but your songs..... let's say you have a very promising career as a statistician... no question.

  • @user-zi5zh5pg5w
    @user-zi5zh5pg5w 1 year ago +1

    Finally, I found 'The One'!

  • @anjalisetiya3149
    @anjalisetiya3149 2 years ago +1

    Thank you Josh for another amazing StatQuest! I want to know whether the L1 and L2 regularization terms should be used together or separately. What should be the standard approach? I'm specifically using boosting algorithms.

    • @statquest
      @statquest  2 years ago

      Usually they are used together. See: ruclips.net/video/1dKRdX9bfIo/видео.html

    • @anjalisetiya3149
      @anjalisetiya3149 2 years ago +1

      @@statquest thank you

    • @RAJIBLOCHANDAS
      @RAJIBLOCHANDAS 2 years ago

      Good question! That combination is called Elastic Net regularization.
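
      A minimal sketch of that combination, assuming scikit-learn's ElasticNet (the data, seed, and alpha are made up):

        import numpy as np
        from sklearn.linear_model import ElasticNet

        rng = np.random.default_rng(2)
        X = rng.normal(size=(50, 4))
        y = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=50)  # last 2 columns are noise

        # l1_ratio mixes the penalties: 1.0 is pure Lasso (L1), 0.0 is pure Ridge (L2)
        model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
        print(model.coef_)  # the useless coefficients shrink toward (often exactly) 0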

  • @UncleLoren
    @UncleLoren 4 years ago

    Both the Ridge and Lasso videos made me want to cry. (Know you aren't alone if anyone else feels the same.) Also noteworthy: Ridge regression avoids problems introduced by having more predictor variables than observations, or when multicollinearity is an issue. This example has neither condition. Triple Bam. (Obviously, I am taking the definition too literally. It's a relative statement, re: the variables-to-observations ratio.) ...Nevertheless, there's no end to my confusion. I was approaching "understanding" using the ISLR book... but you can actually get two different perspectives on the same topic and then be worse off due to variance in how the concepts are presented. That said, you're still awesome, StatQuest, and you are invited to play guitar at my funeral when I end things from trying to learn ML.

    • @UncleLoren
      @UncleLoren 4 years ago

      (gonna check the StatsExchange link down below that you provided. Thank you, sir!!!)

  • @luigineri4364
    @luigineri4364 5 years ago

    Hi, thanks. I think the videos are great, and I also like your songs. You are very talented.
    Watching this video, I was thinking that Lasso regression can be used as a form of variable selection. Is this a good idea? So basically you first include all the predictors, and then the Lasso tells you which variables you need to get rid of. Does this make sense?

  • @lanchen5034
    @lanchen5034 5 years ago +2

    Thanks very much for this video; it really helped me with the concepts of Ridge regression and Lasso regression. I have a silly question: why can the parameters in Lasso shrink to zero, while in Ridge they cannot?

    • @statquest
      @statquest  5 years ago +1

      That's not a silly question at all, and there are lots of websites that dive into that answer. I'd just do a google search and you should find what you're looking for.

    • @jordanhe5852
      @jordanhe5852 2 years ago

      This also confuses me.

  • @hUiLi8905
    @hUiLi8905 4 years ago

    I have seen some articles mentioning that Ridge regression handles multicollinearity between variables better than Lasso, but I am not sure why, since the only difference between Lasso and Ridge is the way they penalize the coefficients.

  • @johnholbrook1447
    @johnholbrook1447 5 years ago +1

    Fantastic videos - very well explained!

  • @MrArunavadatta
    @MrArunavadatta 4 years ago +1

    wonderfully explained

  • @lucianotarsia9985
    @lucianotarsia9985 3 years ago +1

    Great video! The topic is really well explained

  • @simrankalra4029
    @simrankalra4029 5 years ago +2

    Thank you, Sir! A great help.

  • @yilinxie2457
    @yilinxie2457 5 years ago

    Thanks! I finally understand how they shrink parameters!

  • @DonnyDonowitz22
    @DonnyDonowitz22 5 years ago

    The best explanation ever.

  • @jamalshah2210
    @jamalshah2210 2 years ago

    Thank you Josh for sharing this video. Could you please do a video on Bayesian statistics and Monte Carlo methods?

    • @statquest
      @statquest  2 years ago

      I hope to do Bayesian statistics soon.

  • @rakeshk6799
    @rakeshk6799 1 year ago +1

    Is there a more detailed explanation as to how some feature weights become zero in the case of Lasso, and why that cannot happen in Ridge? Thanks.

    • @statquest
      @statquest  1 year ago

      Yes, see: ruclips.net/video/Xm2C_gTAl8c/видео.html

    • @rakeshk6799
      @rakeshk6799 1 year ago

      @@statquest Thanks! I watched the video, but I am still not sure why there is a kink in the case of Lasso. What exactly creates that kink?

    • @statquest
      @statquest  1 year ago

      @@rakeshk6799 The absolute value function.

  • @qorbanimaq
    @qorbanimaq 5 years ago +1

    Ah! A triple THANKSSSS!!!! I finally got what they are really doing.

  • @gdivadnosdivad6185
    @gdivadnosdivad6185 10 months ago +1

    You are the best! I understand it now!

  • @yellowburros123
    @yellowburros123 5 years ago

    Awesome video! Follow up question - in your examples, the initial models all overestimated the impact of weight on size, and ridge/lasso regression corrected it by picking a new model with lower coefficients for weight. What happens if the initial model under-estimates the coefficient for weight? Would ridge/lasso introduce more variance by lowering the coefficients further?

    • @statquest
      @statquest  5 years ago +1

      Lasso and Ridge Regression both use cross validation, so that means that they use a wide variety of "training" data to make the estimates - meaning that they will sometimes underestimate and sometimes overestimate. The Lasso and Ridge regression help them find the happy medium.

    • @yellowburros123
      @yellowburros123 5 years ago +1

      ​@@statquest Got it! So it seems that Lasso and Ridge regressions can actually increase the coefficient values sometimes?

    • @statquest
      @statquest  5 years ago +1

      @@yellowburros123 Depending on the model, Ridge and Lasso can increase the coefficients for some variables. This is illustrated in the Introduction to Statistical Learning in R www-bcf.usc.edu/~gareth/ISL/ISLR%20Seventh%20Printing.pdf on page 216.

    • @yellowburros123
      @yellowburros123 5 years ago +1

      @@statquest , really appreciate it!

  • @user-lq1on3ls8b
    @user-lq1on3ls8b 10 months ago +1

    At the end of the last video [Regularization Part 1: Ridge (L2) Regression], you mentioned the problem of how to estimate 10,000 parameters from only 500 samples and said you would talk about it in the next one, but after finishing this video I am still wondering how to deal with it... 🤣🤣 Am I watching these videos in the wrong order, or what?

    • @statquest
      @statquest  10 months ago

      You have the correct order. Unfortunately, all I have had time to do is provide a general intuition on how cross validation is used to find an optimal line, even when we don't have enough data.
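
      A hedged sketch of that 10,000-parameter, 500-sample setting, assuming scikit-learn's cross-validated Lasso (the sizes come from the thread; everything else is illustrative):

        import numpy as np
        from sklearn.linear_model import LassoCV

        rng = np.random.default_rng(3)
        X = rng.normal(size=(500, 10_000))   # 500 samples, 10,000 candidate parameters
        beta = np.zeros(10_000)
        beta[:5] = 2.0                       # only 5 parameters truly matter
        y = X @ beta + rng.normal(size=500)

        model = LassoCV(cv=5).fit(X, y)      # cross-validation picks lambda (sklearn's alpha)
        print((model.coef_ != 0).sum())      # far fewer than 10,000 nonzero coefficients

      Because the L1 penalty can zero out coefficients, Lasso can produce a fit even when there are far more parameters than samples.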

  • @annd6920
    @annd6920 5 years ago +1

    Hi Josh, could you please make video(s) on Correspondence Analysis, Chi-Square distance and multiple contingency tables?
    Learning from your videos is more efficient than learning from several books combined :)!

    • @statquest
      @statquest  5 years ago +1

      I'm working on those, but the bad news is that they will not be ready for a long time.... :(