Machine Learning Tutorial Python - 17: L1 and L2 Regularization | Lasso, Ridge Regression

  • Published: Oct 25, 2024

Comments • 196

  • @codebasics
    @codebasics 2 years ago +6

    Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced

  • @bharathis9295
    @bharathis9295 3 years ago +212

    Statquest theory+Codebasics Practical implementation=😍😍😍

    • @codebasics
      @codebasics 3 years ago +55

      ha ha .. nice :) Yes I also like statquest.

    • @gokkulkumarvd9125
      @gokkulkumarvd9125 3 years ago +2

      Exactly!

    • @ItsSantoshTiwari
      @ItsSantoshTiwari 3 years ago +2

      Same😂👌

    • @abhinavkaul7187
      @abhinavkaul7187 3 years ago +7

      @@codebasics BAM!! :P Btw, the way you explained Yolo that was superb, bro!

    • @sandydsa
      @sandydsa 3 years ago +1

      Yes! Minor comment, kindly please switch age and matches won. Got confused at first 😂

  • @AlonAvramson
    @AlonAvramson 3 years ago +8

    I have been following all 17 ML videos you have provided so far, and I find this the best resource to learn from. Thank you!

  • @ManigandanThangaraj
    @ManigandanThangaraj 1 year ago +3

    Nice explanation. Adding to that:
    L2 (Ridge): the goal is to control the magnitude of the coefficients and to cope with multicollinearity.
    Coefficients of highly correlated features are shrunk toward zero, but not exactly to zero, which improves stability and generalization.
    L1 (Lasso): the goal is to induce sparsity in the model by shrinking some coefficients exactly to zero, which makes it useful for feature selection as well as for preventing overfitting.

    • @r0cketRacoon
      @r0cketRacoon 3 months ago

      so, in what cases should we use L1 and L2?
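
To make the Ridge/Lasso contrast in this thread concrete, here is a minimal sketch for the simplest possible case: a single coefficient whose unregularized least-squares value is b. The function names and the simplified one-coefficient objectives in the docstrings are assumptions chosen for illustration, not code from the video.

```python
def ridge_shrink(b, lam):
    """Minimizer of 0.5*(theta - b)**2 + 0.5*lam*theta**2."""
    return b / (1.0 + lam)


def lasso_shrink(b, lam):
    """Minimizer of 0.5*(theta - b)**2 + lam*abs(theta) (soft-thresholding)."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


# Hypothetical unregularized coefficients: one large, one small.
coefs = [3.0, 0.4]
lam = 0.5

ridge = [ridge_shrink(b, lam) for b in coefs]
lasso = [lasso_shrink(b, lam) for b in coefs]

print("ridge:", ridge)  # both coefficients shrink, neither becomes zero
print("lasso:", lasso)  # the small coefficient is set exactly to zero
```

As a rule of thumb, Lasso tends to be chosen when only a few features are expected to matter and automatic feature selection is wanted, while Ridge tends to be preferred when many correlated features each carry some signal.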

  • @Hari-xr7ob
    @Hari-xr7ob 3 years ago +59

    you should probably change the X and Y axes. Matches won is a function of Age. So, Age should be on X axis and Matches won on Y axis

  • @DrizzyJ77
    @DrizzyJ77 5 months ago

    Bro, you don't know how you've helped me in my computer vision journey. Thank you❤❤❤

  • @gyanaranjanbal10
    @gyanaranjanbal10 1 year ago

    Clean, crisp and crystal clear. I had been struggling to understand this for a long time; your 20-minute video cleared it up in one attempt. Thanks a lot💌💌

  • @haintuvn
    @haintuvn 3 years ago +6

    Thank you for your interesting video. As far as I understand from the video, L1 and L2 regularization help to overcome the overfitting problem in linear regression. What about other algorithms (support vector machines, logistic regression, ...)? How can we overcome the overfitting problem there?
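
The same penalty idea carries over to other models. As a minimal sketch, assuming a plain NumPy gradient-descent implementation (the function `fit_logistic_l2` and the synthetic data are hypothetical, not from the video), here is logistic regression with an L2 penalty added to the log-loss:

```python
import numpy as np


def fit_logistic_l2(X, y, lam, lr=0.1, steps=2000):
    """Gradient descent on mean log-loss + lam * ||w||^2 (no intercept, for brevity)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))          # predicted probabilities
        grad = X.T @ (p - y) / len(y) + 2.0 * lam * w
        w -= lr * grad
    return w


# Synthetic, noisy binary labels driven by the first feature only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(float)

w_weak = fit_logistic_l2(X, y, lam=0.001)   # almost no regularization
w_strong = fit_logistic_l2(X, y, lam=1.0)   # strong regularization

print(np.linalg.norm(w_weak), np.linalg.norm(w_strong))
```

In scikit-learn, `LogisticRegression` and `SVC` expose this through the `C` parameter, which is the inverse of the regularization strength, so a smaller `C` means a stronger penalty.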

  • @tusharsethi2801
    @tusharsethi2801 3 years ago

    One of the best videos out there for Regularization.

  • @bors1n
    @bors1n 3 years ago +1

    Thank you a lot. I'm from Russia and I'm a student. I watch your videos about ML and they help me understand it better.

  • @shashankdhananjaya9923
    @shashankdhananjaya9923 2 years ago

    Couldn't have explained it any simpler. Perfect tutorial.

  • @ajaykushwaha-je6mw
    @ajaykushwaha-je6mw 3 years ago

    Best tutorial on L1 and L2 regularization.

  • @DarkTobias7
    @DarkTobias7 3 years ago +3

    These are the videos we like!!!

    • @codebasics
      @codebasics 3 years ago

      Thanks DarkTobias. Good to see your comment.

  • @nastaran1010
    @nastaran1010 9 months ago

    best learning with very good explanation. Thanks

  • @atulupadhyay1542
    @atulupadhyay1542 3 years ago +3

    machine learning concepts and practicals made easy, Thank you so much Sir

    • @codebasics
      @codebasics 3 years ago +1

      I am happy this was helpful to you.

  • @nationhlohlomi9333
    @nationhlohlomi9333 1 year ago

    I really love your content….. You change lives❤❤❤

  • @javiermarchenahurtado7013
    @javiermarchenahurtado7013 2 years ago +1

    Such a great video!! I was struggling to understand regularization and now it's crystal clear to me!

  • @yash422vd
    @yash422vd 3 years ago +2

    As per the equation y = mX + c, you interchanged the y and X axes, if I'm not wrong.
    You are trying to predict matches won (yhat), which is on your horizontal axis, while age (X) is on the vertical axis.
    Using something unconventional may mislead new learners.
    X on the horizontal axis and y on the vertical axis is what we have learned since school.
    Assigning X and y to the axes accordingly would be a great help to learners.
    I hope you are not taking this personally. My apologies if so!

  • @phil97n
    @phil97n 1 month ago

    Awesome explanation, thanks.

  • @amruth3
    @amruth3 3 years ago

    Sir, all your videos are really helpful. Now I am giving you feedback on this video: it is also a beautiful video, and the hyperparameter tuning video is one of the very best too. God bless you. You work hard to make things easy to understand.

  • @bhavikjain1077
    @bhavikjain1077 3 years ago +1

    A good video to understand the practical implementation of L1 and L2. Thank You

  • @NeekaNeeksz
    @NeekaNeeksz 6 months ago

    Clear introduction. Thanks

  • @nexthome1445
    @nexthome1445 3 years ago +7

    Kindly make video on Feature selection for Regression and classification problem

  • @piyushlanjewar6274
    @piyushlanjewar6274 2 years ago

    That's a really great explanation, Anyone can use this method in real use cases now. Keep it up.

  • @kibesamuel697
    @kibesamuel697 1 year ago

    The best of two worlds wow!

  • @koustavbanerjee8195
    @koustavbanerjee8195 3 years ago +3

    Please do videos about XGBoost, LGBoost!! Your videos are pure GOLD!!

  • @anvarshathik784
    @anvarshathik784 2 years ago

    Machine learning concepts and practicals made easy. Thank you so much, Sir

  • @kaizen52071
    @kaizen52071 2 years ago

    Nice video....good lesson......funny enough i see my house address in the dataset

  • @RadioactiveChutney
    @RadioactiveChutney 2 years ago

    Note for myself: This is the guy... his videos can clear doubts with codes.

    • @codebasics
      @codebasics 2 years ago +1

      ha ha .. thank you 🙏

  • @mukeshkumar-kh2fh
    @mukeshkumar-kh2fh 2 years ago

    thank you for helping the DS community

  • @ambujbaranwal9351
    @ambujbaranwal9351 1 month ago +1

    00:04 L1 and L2 regularization help address overfitting in machine learning
    02:12 Balancing between underfitting and overfitting is crucial for effective model training.
    04:26 Regularization shrinks parameters for better prediction function
    06:47 L2 regularization adds a penalty on large parameters to the overall error and leads to simpler equations.
    09:14 Filtering and handling NA values in a dataset
    12:02 Dropping NA values and converting categorical features into dummies for machine learning in Python.
    14:28 Understanding the issues of overfitting in linear regression model
    17:00 Regularization techniques like L1 and L2 improve model accuracy.
    19:16 Encouraging viewers to like and share the video

  • @vyduong276
    @vyduong276 1 year ago

    I can understand it now, thanks to you 🥳

  • @rohantalaviya136
    @rohantalaviya136 5 months ago

    Really great video

  • @phuonglethithanh8498
    @phuonglethithanh8498 1 year ago

    Thank you for this video. Very straightforward and comprehensive ❤

  • @nehareddy4619
    @nehareddy4619 2 years ago

    I really liked your way of explanation sir

  • @ankitmaheshwari7310
    @ankitmaheshwari7310 3 years ago

    Good. The model representation is good. Hoping for some deeper knowledge in the next video.

  • @king1_one
    @king1_one 8 months ago

    Good explanation, sir. You deserve appreciation, and I am here to give it.

  • @joehansie6014
    @joehansie6014 3 years ago +1

    All your videos are totally great. Keep working on it

  • @Ultimate69664
    @Ultimate69664 2 years ago

    Thank you! This video saved my exam :)

  • @leonardomenar55
    @leonardomenar55 2 years ago

    Excellent Tutorial, Thanks.

  • @bruh-jr6wj
    @bruh-jr6wj 9 months ago +1

    I believe the most appropriate imputing method here is to group by the similar type of houses and then fill with the mean value of the group. For example, if the average is, say, 90 m^2, and the home is only a flat, the building area is incorrectly imputed.
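
The group-wise imputation suggested in this comment can be sketched with pandas as follows. The tiny DataFrame and the column name `BuildingArea` are assumptions made for illustration; `Type` is the column named elsewhere in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the housing data ("h" = house, "u" = unit).
df = pd.DataFrame({
    "Type": ["h", "h", "h", "u", "u", "u"],
    "BuildingArea": [150.0, 170.0, np.nan, 60.0, np.nan, 80.0],
})

# Mean area per property type, aligned row by row with the original frame.
group_mean = df.groupby("Type")["BuildingArea"].transform("mean")

# Fill each missing area with the mean of its own property type.
df["BuildingArea"] = df["BuildingArea"].fillna(group_mean)

print(df)
```

With a single global mean, the missing unit would have been filled with a value pulled upward by the houses; grouping keeps the fill value in the range of similar properties.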

  • @davuthdy876
    @davuthdy876 3 years ago

    Thanks for sharing your video with the world.

  • @bryteakpakpavi637
    @bryteakpakpavi637 3 years ago +1

    You are the best.

  • @mohammadrasheed9247
    @mohammadrasheed9247 2 years ago

    Nice Explanation. Also Recommended to play on 2X

  • @PollyMwangi-cp3jn
    @PollyMwangi-cp3jn 7 months ago

    Thanks so much sir. Great content

  • @gouravsapra8668
    @gouravsapra8668 2 years ago +1

    Hi. 1) Shouldn't the equation be θ0 + θ1·x1 + θ2·x1² + θ3·x1³ rather than θ0 + θ1·x1 + θ2·x2² + θ3·x3³, because we have only one feature x?
    2) In the regularization expression (the lambda part), my understanding is that we should not use "i and n" but rather "j and m" or similar. The reason is that in the first half of the equation "i and n" index the rows, whereas in the second half the sum runs over the features, so different indices should be used.
    Please correct me if my understanding is wrong.
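
The notation this comment proposes can be written out as follows. This is a sketch of the usual L2-regularized cost for a cubic fit on a single feature x, with i running over the n training rows and j over the m penalized coefficients (here m = 3); the exact scaling constants used in the video may differ.

```latex
h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3

J(\theta) = \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - h_\theta(x_i) \bigr)^2
          + \lambda \sum_{j=1}^{m} \theta_j^2
```

For L1 regularization the penalty term becomes \lambda \sum_{j=1}^{m} |\theta_j|; the intercept \theta_0 is conventionally left out of the penalty.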

  • @vishvam1307
    @vishvam1307 3 years ago +1

    Nice explanation

  • @dylanloh5327
    @dylanloh5327 2 years ago

    Thank you vm for this video. This is straight-forward and simple to understand!

  • @ravikumarrai7325
    @ravikumarrai7325 3 years ago

    Awesome video... really awesome.

  • @ALLINONETV1
    @ALLINONETV1 3 years ago +3

    Please continue ....

  • @marthanyarkoa9007
    @marthanyarkoa9007 11 months ago

    Thanks so simple ❤😊

  • @anseljanson5171
    @anseljanson5171 3 years ago +4

    Thank you for this video.
    Why did you drop the rows with NA values in the Price column, even though there were more than 7000 of them? Won't it affect the prediction?

    • @mkt4941
      @mkt4941 3 years ago +2

      You cannot accurately make an assumption as to what the price is based on the available data, so you have to drop it.

    • @anseljanson5171
      @anseljanson5171 3 years ago

      @@mkt4941 Thanks :)
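
Price is the value being predicted, so a row without it carries no label to learn from or to score against. A minimal pandas sketch of dropping only those rows, using a hypothetical four-row frame rather than the real dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the dataset: Price is the prediction target.
df = pd.DataFrame({
    "Rooms": [2, 3, 4, 3],
    "Price": [850000.0, np.nan, 1200000.0, np.nan],
})

print(df["Price"].isna().sum())   # number of rows with no target value

# Keep only the rows where the target is known; feature columns are untouched.
df = df.dropna(subset=["Price"])
print(len(df))
```

Filling the missing targets with the mean instead would feed the model labels that were never observed, which a later reply in this comment section reports as the cause of a badly wrong test score.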

  • @ayenewyihune
    @ayenewyihune 2 years ago

    Cool video

  • @jongcheulkim7284
    @jongcheulkim7284 2 years ago

    Thank you. This is very helpful.

  • @priyankshekhar2454
    @priyankshekhar2454 3 years ago

    Very good videos by you on each topic..thanks !!

  • @tanishsadhwani730
    @tanishsadhwani730 2 years ago

    Amazing sir thank you so much

  • @ayusharora2019
    @ayusharora2019 3 years ago

    Very well explained !!

  • @sanooosai
    @sanooosai 9 months ago

    thank you great work

  • @aadityashukla8535
    @aadityashukla8535 2 years ago

    good theory!

  • @nikolinastojanovska
    @nikolinastojanovska 2 years ago

    great video, thanks!

  • @joehansie6014
    @joehansie6014 3 years ago

    Simple but powerful😎👍

  • @denisvoronov6571
    @denisvoronov6571 2 years ago

    Nice example. Thank you so much!

  • @MrMadmaggot
    @MrMadmaggot 2 years ago

    First, when you apply Lasso, you fit it separately from the first linear regression model you made, right?
    In other words, is scikit-learn's Lasso a linear regression with regularization built in, or is it applied to the linear regression from the cell above?
    So what if I use a KNN or a forest?
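
In scikit-learn, `Lasso` is a separate estimator: it fits its own linear model with the L1 penalty built into the objective and does not modify a previously fitted `LinearRegression`. A minimal sketch on synthetic data (the data and the `alpha` value are arbitrary choices for illustration, not the values from the video):

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data: only the first two of 20 features influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=100)

# Two independent estimators, each fitted from scratch on the same data.
ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

print("non-zero OLS coefficients:  ", int(np.sum(ols.coef_ != 0)))
print("non-zero Lasso coefficients:", int(np.sum(lasso.coef_ != 0)))
```

Lasso and Ridge are linear models, so they do not wrap KNN or random forests. Those models have no coefficients to penalize and are usually kept from overfitting through their own hyperparameters, such as `n_neighbors` or `max_depth`.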

  • @kouider76
    @kouider76 3 years ago

    Just came across this video accidentally simply great thank you

  • @unifarzor7237
    @unifarzor7237 2 years ago

    Always excellent lessons, thank you

  • @analuciademoraislimalucial6039
    @analuciademoraislimalucial6039 3 years ago +2

    Thank you so much teacher

  • @alielakroud1786
    @alielakroud1786 3 years ago +3

    Hi Sir,
    Thanks for all these ML tutorials.
    I tried the code below, but when I fit my model the score on the training data is 0.68, whereas the score on the test data is just weird: reg.score(X_test, Y_test) = -17761722756.9913
    dummies = pd.get_dummies(df[['Suburb', 'Type', 'Method', 'SellerG', 'CouncilArea', 'Regionname']])
    Merge = pd.concat([df, dummies], axis='columns')
    final = Merge.drop(['Suburb', 'Type', 'Method', 'SellerG', 'CouncilArea', 'Regionname'], axis='columns')
    final
    The second part of my question: when I use L1 and L2 regularization, the scores seem correct, 0.66 and 0.67.
    I would also mention that when I used LabelEncoder I got a score of 0.44 on the test data and 0.42 on the training data.
    Thanks in advance for your answers.

    • @ajgameboy6930
      @ajgameboy6930 1 year ago

      Same here, I really don't know what went wrong...

    • @ajgameboy6930
      @ajgameboy6930 1 year ago +1

      Hey, quick update, I found out the problem in my scenario... I had filled NaN values of price with mean, which caused the problem... Now that I have dropped 'em, it's working fine... Hope you had also solved the problem (you must've, ur comment is from 2 years back XD)
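
A huge negative test score like the one reported in this thread is possible because the `score` method of scikit-learn regressors returns R², which has no lower bound. A minimal pure-Python sketch of the formula, with made-up numbers:

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]

good = r2_score(y_true, [1.1, 1.9, 3.2, 3.8])      # close predictions
baseline = r2_score(y_true, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
wild = r2_score(y_true, [1.0, 2.0, 3.0, 400.0])    # one prediction far off

print(good, baseline, wild)
```

A single wildly wrong prediction is enough to push R² far below zero. That is one plausible way an unregularized model with hundreds of dummy columns can score reasonably on training data and absurdly on test data, while Ridge and Lasso keep the coefficients, and hence the predictions, in check.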

  • @lazzy5173
    @lazzy5173 1 month ago

    Summary:
    - L1 regularization helps in feature selection.
    - L2 regularization helps in preventing overfitting.

  • @tjbwhitehea1
    @tjbwhitehea1 3 years ago +2

    Hey, great video thank you. Quick question - what's the best way to find the optimal alpha? Do you do a grid search?

    • @codebasics
      @codebasics 3 years ago +1

      Yes doing grid search would be a way
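
The grid search mentioned in this reply could look roughly like the sketch below. The synthetic data and the list of candidate `alpha` values are assumptions for illustration; on the real dataset the grid would need to be adapted.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for the housing features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + rng.normal(size=200)

# Candidate regularization strengths (an arbitrary illustrative grid).
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]

# 5-fold cross-validation over the grid; the default scorer for Ridge is R^2.
search = GridSearchCV(Ridge(), param_grid={"alpha": alphas}, cv=5)
search.fit(X, y)

print("best alpha:", search.best_params_["alpha"])
print("mean cross-validated R^2:", search.best_score_)
```

scikit-learn also ships `RidgeCV` and `LassoCV`, which perform this kind of cross-validated search over `alpha` inside a single estimator.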

  • @swL1941
    @swL1941 1 year ago

    Great video.
    However, it would have been better if you had provided the justification for assigning zeros to a few NaN values and the mean to a few other records. I know "it's safest to assume", but then I believe in real-world projects we cannot just assume things.

  • @nikhilsingh1296
    @nikhilsingh1296 1 year ago +1

    I really love learning from your videos, they are pretty awesome.
    Just one concern: in line 11 we ran the missing-value count, which showed 7610 missing values for Price, and in the next line, line 12, we dropped those 7610 rows, didn't we?
    Also, what was the alternative if we had not dropped them? Could we not split the dataset, impute the mean for 50 percent of the missing Price values and use those rows as training data, then run the test on the remaining missing Price values?
    I am not sure whether this is even a valid question, but I am a bit curious.
    Also, what was the scope for PCA here?

    • @slainiae
      @slainiae 7 months ago

      I agree. The missing 'Price' values could have been estimated using one of the previously presented algorithms.

  • @HA-bj5ck
    @HA-bj5ck 10 months ago +1

    Appreciate the efforts, but there were issues with the foundational understanding. Additionally, the inclusion of dummy variables expanded the columns to 745 without acknowledgement or communication regarding its potential adverse effects to viewers was not expected.

  • @EngineerNick
    @EngineerNick 2 years ago

    Thank you for this, it was very useful :)

  • @adia9791
    @adia9791 2 years ago

    I think one must not use those imputations(mean) before train test split as it leads to data leakage, correct me if I am wrong.
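
The leakage concern raised here can be avoided by splitting first and learning the fill value from the training rows only. A minimal NumPy sketch with made-up numbers (the variable names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
area = rng.normal(loc=100.0, scale=20.0, size=10)
area[[1, 4, 8]] = np.nan            # simulate missing values

# Split first ...
train, test = area[:7].copy(), area[7:].copy()

# ... then learn the fill value from the training rows only.
train_mean = np.nanmean(train)
train[np.isnan(train)] = train_mean
test[np.isnan(test)] = train_mean   # reuse the training statistic on the test rows

print(train_mean, train, test)
```

In scikit-learn the same effect is usually obtained by placing a `SimpleImputer` inside a `Pipeline`, so that the mean is recomputed from the training portion on every fit.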

  • @anjalipatel9028
    @anjalipatel9028 8 months ago

    Is L1/L2 regularization valid for regression algorithms only?

  • @armghan2312
    @armghan2312 1 year ago

    is there any algorithm using which we can determine the unimportant features in our datasets?

  • @victorbenedict8743
    @victorbenedict8743 3 years ago

    Great tutorial, sir. It's a privilege to be a fan of yours. Could you please do a video on the steps to carry out when cleaning big data? Thank you.

  • @sunzarora
    @sunzarora 3 years ago

    Nice video, my question is what will u do so accuracy will jump on this dataset from 67 to 90+?

  • @surbhigulati9350
    @surbhigulati9350 1 year ago

    Hello Sir,
    why did you not fill the distance parameter with the mean value?

  • @SahilAnsari-gl3xu
    @SahilAnsari-gl3xu 3 years ago +1

    Thanks a lot, Sir❤️ Very good teaching style (theory+practical)👍

  • @rash_mi_be
    @rash_mi_be 2 years ago +1

    In L2 regularization, how can theta reduce when lambda increases, and increase when lambda decreases?
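
One way to see the effect asked about here is to solve the simplest case by hand. The derivation below is a sketch for a model with a single coefficient and no intercept, y ≈ θx, using a mean-squared-error cost plus an L2 penalty; the scaling constants are an assumption and may differ from the video.

```latex
J(\theta) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \theta x_i)^2 + \lambda \theta^2

\frac{dJ}{d\theta} = -\frac{2}{n} \sum_{i=1}^{n} x_i (y_i - \theta x_i) + 2 \lambda \theta = 0

\theta = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2 + n \lambda}
```

λ appears only in the denominator, so a larger λ pulls θ toward zero, while λ → 0 recovers the ordinary least-squares value.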

  • @ajaysaroha2539
    @ajaysaroha2539 3 years ago +1

    Sir, I am a fresher and want to build a career as a data analyst in the finance domain, but I have no experience in finance. How can I gain knowledge of the finance domain? Please give some suggestions.

  • @m.shiqofilla4246
    @m.shiqofilla4246 3 years ago

    Very nice video sir, but I had hoped you would show a scatter plot of the data together with the fitted L1/L2 regression curves...

  • @furkansalman7108
    @furkansalman7108 2 years ago +1

    I tried Linear Regression on the same dataset, but it scored the same as Ridge and Lasso. Why?

  • @arjunbali2079
    @arjunbali2079 2 years ago

    thanks sir

  • @daretoschool4113
    @daretoschool4113 3 years ago +1

    Please make video for genetic algorithm

  • @OceanAlves23
    @OceanAlves23 3 years ago +1

    👨‍🎓👏✔, from Brazil-Teresina-PI

    • @codebasics
      @codebasics 3 years ago +1

      Thanks Ocean. I wish you visit Brazil one day (especially Amazon rain forest :) )

  • @soumyopattnaik6787
    @soumyopattnaik6787 3 years ago

    Is it OK to impute such a large number of records with the mean without any justification? Shouldn't the column be dropped altogether?

  • @junaidlatif2881
    @junaidlatif2881 2 years ago

    Amazing. But how to select best alpha value?

  • @SohamPaul-xy9jw
    @SohamPaul-xy9jw 1 year ago

    When I am creating dummies, it is showing that the Suburb column is of type NoneType() and no dummies are getting created. What can be the problem?

  • @SGandhi
    @SGandhi 3 years ago

    Can you make a video of ensemble model of using decision tree,knn and svm code

  • @MrCentrax
    @MrCentrax 2 years ago

    So are l1 and l2 polynomial regression models?

  • @JAVIERHERNANDEZ-wp6qj
    @JAVIERHERNANDEZ-wp6qj 1 year ago

    Maybe in the Cost formula, the indices for summation should be different (in general): for the MSE term the sum should be over the entire training dataset (in this case n), and the sum for the regularization term should run over the number of features or columns in the dataset

  • @bhoomi5398
    @bhoomi5398 2 years ago

    What is the dual parameter? Please explain what the primal and dual formulations are.

  • @raufurrahmankhan1284
    @raufurrahmankhan1284 3 years ago

    Can we use Lasso for feature selection on classification problems?

  • @nomanshaikhali3355
    @nomanshaikhali3355 3 years ago +1

    Kindly explain Boosting algos!!

  • @Microraptorofmillinea
    @Microraptorofmillinea 3 years ago +1

    what about alpha value and other two parameters ?

  • @nilanjanbanik7509
    @nilanjanbanik7509 3 years ago

    @15:18 Did you mean underfitting? Since if it was overfitting then the score for the training data set should have been 1?

    • @creative2z
      @creative2z 3 years ago +2

      No, he did mean overfitting. On the training set we get a much higher score; the model is almost memorizing the data. But on the testing data the score is bad.

    • @nilanjanbanik7509
      @nilanjanbanik7509 3 years ago

      @@creative2z Thanks I see what you mean. But I was expecting the score to be much higher almost close to 1, if the model was overfit, i.e., it's passing through all the training points. I guess, the relative increase in score is the key here.
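
The train/test gap discussed in this thread can be reproduced on a toy problem. The sketch below is an assumption-laden illustration, not the video's code: it fits a degree-9 polynomial to 10 noisy points with NumPy and compares R² on the training points with R² on fresh points.

```python
import numpy as np
from numpy.polynomial import Polynomial


def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
x_test = np.linspace(0.05, 0.95, 10)
y_train = np.sin(2 * np.pi * x_train) + 0.3 * rng.normal(size=10)
y_test = np.sin(2 * np.pi * x_test) + 0.3 * rng.normal(size=10)

# A degree-9 polynomial has enough freedom to pass through all 10 training points.
model = Polynomial.fit(x_train, y_train, deg=9)

train_r2 = r2(y_train, model(x_train))
test_r2 = r2(y_test, model(x_test))
print(train_r2, test_r2)
```

Here the model is flexible enough to reach a training score of essentially 1. A linear model on the housing data cannot interpolate every point, so its training score stays well below 1, but the gap between training and test score is still the signal of overfitting.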

  • @haneulkim4902
    @haneulkim4902 3 years ago

    Don't we have to one-hot encode Postcode, Propertycount as well since they are actually categorical values instead of continuous values?
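
If a numeric-looking column such as Postcode is treated as categorical, pandas needs to be told so explicitly. A minimal sketch with hypothetical rows:

```python
import pandas as pd

# Hypothetical rows; Postcode is numeric in the file but is really a label.
df = pd.DataFrame({
    "Postcode": [3067, 3042, 3067],
    "Rooms": [2, 3, 4],
})

# get_dummies skips numeric columns by default, so they must be listed explicitly.
encoded = pd.get_dummies(df, columns=["Postcode"])
print(encoded.columns.tolist())
```

Whether this is worthwhile is a judgment call: Postcode is a label, so encoding it is defensible at the cost of many extra columns, whereas Propertycount reads like a genuine count that can arguably stay numeric.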

  • @gefett
    @gefett 3 years ago

    Thanks for the class, it's very clear to me.
    But I had a problem creating a submission file from my code for Kaggle. Please help me.