Discussing All The Types Of Feature Transformation In Machine Learning

  • Published: 27 Nov 2024

Comments • 64

  • @krishnaik06
    @krishnaik06 3 years ago +46

    Please take care everyone.

    • @sarangtamrakar8723
      @sarangtamrakar8723 3 years ago +2

      you too sir.....

    • @shivu.sonwane4429
      @shivu.sonwane4429 3 years ago +5

      Yes, take care, you and your team.
      For people in home isolation 👇🏻
      I've almost recovered from COVID in home isolation. I'm sharing what helped me recover in case it helps someone.
      • Steam at least 3 times a day
      • Plenty of fluids: Water (preferably warm), lemonade, coconut water
      • Salt water gargles
      • Vitamin C supplement
      • Plenty of rest
      • Meditation for peace of mind
      • Balanced diet
      • Regain smell: Smell ajwain, kapoor and cloves
      • Lie on your stomach periodically
      Monitor oxygen every 2 hours. Seek medical assistance if it's 92 or below.
      Pls add if I missed anything.
      Add ajwain and kapoor to the water while taking steam, and drink malvani kadha (tulsi, adrak, jaggery, lavang, black pepper, ajwain, gavti cha, dalchini).

    • @shivaragiman
      @shivaragiman 3 years ago +1

      @@shivu.sonwane4429 how can we monitor oxygen levels at home?

    • @sunilsharanappa7721
      @sunilsharanappa7721 3 years ago

      Using a pulse oximeter. It measures the oxygen level (oxygen saturation). It's not very accurate, but it's good enough for home use.

  • @teegnas
    @teegnas 3 years ago +2

    a very important video to review all the important feature techniques in one go ... thanks for uploading!

  • @SALESENGLISH2020
    @SALESENGLISH2020 3 years ago +19

    Pray your team members recover quickly. India needs good teachers.

  • @mostafakhazaeipanah1085
    @mostafakhazaeipanah1085 2 years ago

    What a useful and informative video.
    Most ML courses focus on algorithms and forget the importance of data preparation.

  • @shivaragiman
    @shivaragiman 3 years ago +2

    Get well soon, we need you people more 👍👍👍👍👍

  • @pseudounknow5559
    @pseudounknow5559 3 years ago +2

    Greetings from Poland

  • @bhargavikoti4208
    @bhargavikoti4208 3 years ago +3

    As usual neatly explained..👍👍thank you for uploading 🙏

  • @giandenorte
    @giandenorte 3 years ago +1

    I was looking for these, master Krish! Take care too

  • @poojapatil7128
    @poojapatil7128 3 years ago +1

    I have completed my 1-year post-graduation program in data science from a leading institute, but the various techniques I learned for free from your videos were not even mentioned in the curriculum.
    Thank you for your easy and detailed explanation.

  • @dheerendrasinghbhadauria9798
    @dheerendrasinghbhadauria9798 3 years ago +5

    krish bhai....please upload a PDF of notes of video summary.... along with each video...

  • @nagrajkaranth123
    @nagrajkaranth123 3 years ago +4

    Sir, Sudhanshu sir tested positive, my god. Please, I hope he gets well soon

  • @kiyotube222
    @kiyotube222 3 years ago +4

    Get well soon Sudh!!

  • @ashiqhussainkumar1391
    @ashiqhussainkumar1391 3 years ago

    Tbh I don't prefer any lecture series except nptel. But seeing your 20-25 I personally feel this channel is a better resource for practical implementation of ML...
    Initially I didn't subscribe bcz I felt ur profile is looking young and u might not be knowing the way u taught 😁😁😁... Subscribed
    Thanks to you and to Nptel

  • @shubhamkondekar5382
    @shubhamkondekar5382 3 years ago +1

    Krish Naik is best

  • @sandipansarkar9211
    @sandipansarkar9211 3 years ago +2

    great explanation

  • @alihaiderabdi9939
    @alihaiderabdi9939 3 years ago +5

    praying for employees of ineuron, inshallah everyone will get well soon.

  • @mdadilhussain2967
    @mdadilhussain2967 3 years ago +2

    I guess that you should first do fit_transform and then train_test_split.
    If you split first, the mean is calculated from the train data only.
    Then the same mean is applied to the test data, so the test data won't have a mean of zero.
    Please clear this doubt.

    • @fintech5816
      @fintech5816 2 years ago

      Hi Adil, did you find the answer to your question? If yes, please share.

    • @70ME3E
      @70ME3E 1 year ago

      from an SO answer:
      "Normalization across instances should be done after splitting the data between training and test set, using only the data from the training set.
      This is because the test set plays the role of fresh unseen data, so it's not supposed to be accessible at the training stage. Using any information coming from the test set before or during training is a potential bias in the evaluation of the performance."
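
A minimal sketch of what the quoted answer describes, in plain Python so the arithmetic stays visible. The feature values and the helper names `fit_standardizer` and `standardize` are made up for illustration. scikit-learn's `StandardScaler` exposes the same two steps as `fit` and `transform`.

```python
from statistics import mean, pstdev


def fit_standardizer(values):
    """Learn the mean and (population) standard deviation from training values."""
    return mean(values), pstdev(values)


def standardize(values, mu, sigma):
    """Apply previously learned statistics to any list of values."""
    return [(v - mu) / sigma for v in values]


# Hypothetical feature values, already split into train and test.
train_age = [22.0, 35.0, 41.0, 58.0, 64.0]
test_age = [30.0, 50.0]

# The statistics come from the training data only.
mu, sigma = fit_standardizer(train_age)

train_scaled = standardize(train_age, mu, sigma)
# The test data reuses the training statistics; nothing is re-fitted on it.
test_scaled = standardize(test_age, mu, sigma)

print("train mean after scaling:", mean(train_scaled))
print("test mean after scaling:", mean(test_scaled))
```

The scaled test set generally does not have a mean of exactly zero. That is expected: it is being treated the way new, unseen data would be treated at prediction time.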

  • @satviksaxena3868
    @satviksaxena3868 3 years ago +2

    Hope the team will recover soon, Take Care !!

  • @ValliammaiMuthaiyah
    @ValliammaiMuthaiyah 5 months ago

    Excellent Sir!

  • @captainmustard1
    @captainmustard1 1 year ago

    thank you sir, it is just an amazing video!!

  • @prakashkafle454
    @prakashkafle454 3 years ago +4

    I pray for your team's speedy recovery, Krish. We are also getting worse news day by day here in Nepal ...

  • @ajaykushwaha-je6mw
    @ajaykushwaha-je6mw 2 years ago +3

    Hi Krish, while doing the transformation, why are we not dividing our data into train and test?

  • @imtiazali-xu8gw
    @imtiazali-xu8gw 7 months ago

    Sir, please make a video on the Box-Cox transformation

  • @write2ruby
    @write2ruby 2 years ago

    Very Informative

  • @wahabali828
    @wahabali828 2 years ago

    thank you very much sir

  • @yashpandey5484
    @yashpandey5484 3 years ago +1

    Sir, is scaling required after performing a log transformation?

  • @sandipansarkar9211
    @sandipansarkar9211 3 years ago

    finished watching

  • @tanujajoshi1901
    @tanujajoshi1901 3 years ago +2

    Hey Krish, Can you explain Generative Adversarial Networks (GANs) especially the coding part for a dataset other than an image dataset?? It would be of great help.

  • @SomeoneElsesSomeoneElse
    @SomeoneElsesSomeoneElse 2 years ago +1

    With respect to StandardScaler(): if you split the dataset prior to scaling the features, then don't you risk having skewed features? Put differently, if you train your model to learn that values of 1 get a certain weight, and in your test set the data isn't standardized around the same mean as the train set, then the model will invariably have worse accuracy unless the train set and test set features have the same mean, right? Shouldn't the test set be samples of the full dataset, removed only to serve as an "out-of-sample" test? Not two separate datasets?

  • @ayushsingh-qn8sb
    @ayushsingh-qn8sb 3 years ago +1

    If I have applied some encoding technique, do I have to scale the encoded features?

  • @mayurgupta4004
    @mayurgupta4004 3 years ago

    When we use a Gaussian transformation, does it convert our distribution to a Gaussian distribution where mean = median, or to a standard Gaussian distribution where mean = 0 and variance = 1?

  • @ashutoshtiwari5222
    @ashutoshtiwari5222 3 years ago +2

    Sir, please take care of yourself. 🥺😢

  • @mosart03
    @mosart03 3 years ago +2

    Are we supposed to scale categorical features along with continuous features?

    • @sunilsharanappa7721
      @sunilsharanappa7721 3 years ago +3

      No, you shouldn't scale categorical data.
      If the feature is categorical, it means that each value has a separate meaning, so normalizing will turn the feature into something different.
      There are several ways to deal with categorical data:
      a) Integer Encoding: Where each unique label is mapped to an integer.
      b) One Hot Encoding: Where each label is mapped to a binary vector.
      c) Learned Embedding: Where a distributed representation of the categories is learned.
      --Sunil Sharanappa
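
A small sketch of options (a) and (b) from the list above, written in plain Python. The `colors` column is a made-up example. Option (c) needs a trainable model and is left out.

```python
# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]

# Fix an order for the unique labels so the encoding is reproducible.
categories = sorted(set(colors))  # ['blue', 'green', 'red']

# a) Integer encoding: each unique label is mapped to an integer.
label_to_int = {label: index for index, label in enumerate(categories)}
int_encoded = [label_to_int[c] for c in colors]

# b) One-hot encoding: each label is mapped to a binary vector
#    with a single 1 in the position of its category.
one_hot = [[1 if c == label else 0 for label in categories] for c in colors]

print(int_encoded)  # [2, 1, 0, 1]
print(one_hot)      # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Integer codes imply an order (blue < green < red here) that the labels themselves may not have, which is one common reason to prefer one-hot vectors for unordered categories.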

  • @abhishek_dataman6348
    @abhishek_dataman6348 3 years ago

    Do we need to check these transformation techniques in all binary classification problems?!

  • @sarthakphatate4595
    @sarthakphatate4595 3 years ago +1

    good

  • @priyayadav3990
    @priyayadav3990 3 years ago

    In transformation we transform the distribution into a normal distribution. Then, after the transformation, we also need to perform standardisation (scale down). Please tell me if I am wrong.
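
A rough sketch of why these are two separate steps, using made-up right-skewed values and a log transform as the example. The transform changes the shape of the distribution. Standardisation only shifts and rescales it. Whether the second step is needed depends on the model being trained.

```python
import math
from statistics import mean, pstdev

# Hypothetical right-skewed feature values.
fare = [5.0, 8.0, 12.0, 30.0, 75.0, 500.0, 2000.0]

# Step 1: log transform. This changes the shape of the distribution,
# but the result is not centred on 0 and does not have unit variance.
log_fare = [math.log(v) for v in fare]

# Step 2: standardisation. This shifts and rescales the transformed values.
mu = mean(log_fare)
sigma = pstdev(log_fare)
scaled = [(v - mu) / sigma for v in log_fare]

print("after log only:    mean =", mean(log_fare), "std =", pstdev(log_fare))
print("after log + scale: mean =", mean(scaled), "std =", pstdev(scaled))
```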

  • @ishantyagi2701
    @ishantyagi2701 2 years ago +1

    should standardization be applied to whole dataset or after we split into train test data?

    • @Craeson1
      @Craeson1 1 year ago +1

      It is generally best to apply standardization to the training set only, and then apply the same scaling to the test set. This is because the test set should represent unseen data, and you want to evaluate the model's performance on the test set as closely as possible to how it would perform on new, unseen data. Applying standardization to the entire dataset before splitting it into training and test sets could result in information leakage, as the model could learn about the test set during training.

  • @Sivaramakrishnanv7
    @Sivaramakrishnanv7 3 years ago

    On the Join button, I can see a (6 months: ₹283.20) plan. You have not mentioned this plan in the join video. Can you please explain here, sir?

  • @umaanil3344
    @umaanil3344 3 years ago

    Sir, what about that 'df_scaled' term?
    I am getting an error at that point saying that df_scaled is not defined... Can you please explain?

  • @nishanthviswajith1496
    @nishanthviswajith1496 3 years ago +3

    I know python programming. And I'm learning data science by self-study .. My problem is I have 4 years gap in employment. Will I get job in data science field? Need your suggestions.. I'm 26 yrs old

    • @anandbihari3135
      @anandbihari3135 3 years ago +1

      Same story bro , yes u will get job as data scientist just focus on prep and projects. I took gap for preparation for upsc and rbi. In 2016 I got campus placement in amazon as sde . But after 4 year break and covid scene i started preparing for ds and was fortunate enough to start with Sky as data engineer for 10lpa. So sure u will also get placed

    • @nishanthviswajith1496
      @nishanthviswajith1496 3 years ago

      @@anandbihari3135 skills required for a data engineer??

    • @208gamer4
      @208gamer4 1 year ago +1

      @@nishanthviswajith1496 did you get a job, bro?

    • @208gamer4
      @208gamer4 1 year ago

      @@nishanthviswajith1496 I'm doing an MCA, is there any scope, bro?

  • @vidulakamat6564
    @vidulakamat6564 3 years ago

    While doing the transformation, do we need to transform both numerical and categorical (encoded) features or only numerical ones? If target is continuous, do we need to transform that as well?

    • @sunilsharanappa7721
      @sunilsharanappa7721 3 years ago

      No, you shouldn't scale categorical data.
      If the feature is categorical, it means that each value has a separate meaning, so normalizing will turn the feature into something different.
      There are several ways to deal with categorical data:
      a) Integer Encoding: Where each unique label is mapped to an integer.
      b) One Hot Encoding: Where each label is mapped to a binary vector.
      c) Learned Embedding: Where a distributed representation of the categories is learned.
      If the target is continuous: yes, you do need to scale the target variable if it has a large spread of values.
      --Sunil Sharanappa

    • @vidulakamat6564
      @vidulakamat6564 3 years ago +1

      @@sunilsharanappa7721 thank you

  • @MdMahmudulHasanSuzan--
    @MdMahmudulHasanSuzan-- 3 years ago

    How can I perform scaling on k-fold data?

  • @foreignworker-2163
    @foreignworker-2163 3 years ago

    Pray for your team!

  • @moonSTAR1893
    @moonSTAR1893 1 year ago

    Hello. Important mistake in this tutorial, so I have to stop watching it.
    Problem: you e.g. use MinMaxScaler on the whole X_train, which contains variables on different scales. Let's assume "age" ranges from 18 to 65 while "fare" ranges from 5 to 2000. Scaling age with the global min/max of the dataset distorts your features. In this case, for age 20 you would get z = (X - Xmin)/(Xmax - Xmin) = (20 - 5)/(2000 - 5) = 15/1995 = 0.0075. With per-feature scaling using just age you would instead get z = (20 - 18)/(65 - 18) = 0.0426, roughly a 5-fold numerical difference. The maximal age of 65 would get z = (65 - 5)/(2000 - 5) = 0.03! Meaning age would have a maximal value of 0.03 instead of 1!
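
The arithmetic in the comment above can be reproduced with a short sketch. The ranges are the commenter's assumed ones and `min_max` is a helper written for this example. Note that scikit-learn's `MinMaxScaler` computes the minimum and maximum per column, so fitting it on a multi-column array gives the per-feature result. The pooled case below only arises if a single global min/max is applied by hand.

```python
def min_max(x, lo, hi):
    """Min-max scale a single value x given a range [lo, hi]."""
    return (x - lo) / (hi - lo)


# Ranges assumed in the comment above.
age_min, age_max = 18, 65
fare_min, fare_max = 5, 2000

# Pooled range across both features (what a single global min/max would use).
global_min = min(age_min, fare_min)
global_max = max(age_max, fare_max)

per_feature = min_max(20, age_min, age_max)           # (20 - 18) / (65 - 18)
pooled = min_max(20, global_min, global_max)          # (20 - 5) / (2000 - 5)
pooled_max_age = min_max(65, global_min, global_max)  # (65 - 5) / (2000 - 5)

print(round(per_feature, 4))     # 0.0426
print(round(pooled, 4))          # 0.0075
print(round(pooled_max_age, 2))  # 0.03
```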

  • @venkatraaman4509
    @venkatraaman4509 3 years ago +2

    Hi, for example I have features for age, height and weight.
    Now I want to apply a Gaussian transformation. In my case:
    ==>logarithm tx makes a good fit for age
    ==>reciprocal tx makes a good fit for height
    The question is: may I use both features (age with log tx and height with reciprocal tx) in my train data? Kindly reply to me, sir

    • @venkatraaman4509
      @venkatraaman4509 3 years ago +1

      @Krish Naik. sir kindly reply me

    • @me_debankan4178
      @me_debankan4178 2 years ago

      Yeah, I have the same question, do you have any solution?

  • @pankajkumarbarman765
    @pankajkumarbarman765 3 years ago +2

    1st view 💞💞❤️