Encoding Categorical Data | Ordinal Encoding | Label Encoding

  • Published: 22 Aug 2024

Comments • 95

  • @dishitvasoliya9033
    @dishitvasoliya9033 1 year ago +50

    I purchased a data science course for around 50k in fees, but even they do not teach at this level. You are such a fabulous person. 👍

    • @indra-zd9zu
      @indra-zd9zu 20 days ago

      The 50K went straight down the drain, splash

    • @nrted3877
      @nrted3877 15 days ago

      Bro, does anyone really buy a course for 50k?

  • @katadermaro
    @katadermaro 3 years ago +27

    wow I was so confused about column transformer and why everyone is using that. I was so confused. People usually include that in the encoding videos without any explanation.
    You are the first person to explain it separately in your series. I am amazed. Thank you Nitish, I will remember you throughout my journey.

  • @mridang2064
    @mridang2064 2 years ago +8

    Never knew about Label encoder and Ordinal encoder, I used to apply label encoder on input features, thanks for this hidden insight Nitish Sir.

  • @paragvachhani4643
    @paragvachhani4643 1 year ago +2

    Sir, what can I say... just this:
    You are doing a great job... with quality conceptual clarity...

  • @hamzayaseen9963
    @hamzayaseen9963 1 month ago

    This is a great channel. I'm glad I found it. Thank you so much, Sir, for making this so simple.

  • @ajaykushwaha4233
    @ajaykushwaha4233 3 years ago +4

    Best explanation ever 🙏🏻

  • @arhaanahmad3953
    @arhaanahmad3953 1 month ago

    Well explained. This really helped me to improve my understanding of ML. Thank you sir.

  • @a_wise_person
    @a_wise_person 1 year ago +2

    The way you teach is amazing, sir. I was trying for months to learn ML, and I am glad that I finally found you.

  • @osho_magic
    @osho_magic 1 year ago

    I am commenting on a channel for the first time, because the info is really precious. Quality means Nitish sir.

  • @sneharj2036
    @sneharj2036 2 years ago +1

    Thank you so much for clearing up the concepts of encoding techniques with examples. Very helpful and informative video.

  • @santanubag358
    @santanubag358 1 year ago

    You And Krish Naik Sir are the Brahma And Bishnu Of Data Science.

  • @marikhalid6474
    @marikhalid6474 26 days ago +1

    you are great bro
    bestest video content

  • @muhammadtayyabtahirqureshi7186
    @muhammadtayyabtahirqureshi7186 1 year ago +1

    explicit and to-the-point 👍

  • @arpittrivedi6636
    @arpittrivedi6636 1 year ago

    Sometimes I wonder what would have become of us if you weren't here. Great explanation

  • @Dsutradhar
    @Dsutradhar 1 year ago

    I don't know why this channel is not famous

  • @devilsworld7299
    @devilsworld7299 2 months ago +2

    One quick question, sir: can we do this instead of using these sklearn functions? This way we can arrange and assign an order to our data, and it's fast, easy to understand, and gives instant output.
    df.education[df['education'] == 'School'] = 0
    df.education[df['education'] == 'UG'] = 1
    df.education[df['education'] == 'PG'] = 2
    df.review[df['review'] == 'Poor'] = 0
    df.review[df['review'] == 'Average'] = 1
    df.review[df['review'] == 'Good'] = 2
    df.purchased[df['purchased'] == 'Yes'] = 1
    df.purchased[df['purchased'] == 'No'] = 0

    • @positivevibes2714
      @positivevibes2714 19 days ago

      Instead of doing this you can use the pandas map function; it'll do the same thing
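
A minimal sketch of the `map` approach suggested in the reply above. The toy DataFrame and the names `education_order`, `review_order`, `purchased_map`, and `encoded` are assumptions for illustration; only the column names and category labels come from the comment.

```python
import pandas as pd

# Toy data (assumed); column names and labels follow the comment above.
df = pd.DataFrame({
    "education": ["School", "UG", "PG", "UG"],
    "review": ["Poor", "Good", "Average", "Good"],
    "purchased": ["Yes", "No", "No", "Yes"],
})

# Explicit category -> integer mappings keep the intended order.
education_order = {"School": 0, "UG": 1, "PG": 2}
review_order = {"Poor": 0, "Average": 1, "Good": 2}
purchased_map = {"No": 0, "Yes": 1}

# Series.map returns a new Series, which avoids chained assignment such as
# df.education[mask] = 0 (pandas may warn about it or not write back).
encoded = df.assign(
    education=df["education"].map(education_order),
    review=df["review"].map(review_order),
    purchased=df["purchased"].map(purchased_map),
)
print(encoded)
```

One trade-off to keep in mind: `map` returns NaN for any label missing from the dictionary, and the mapping is not stored in a fitted object that can be reused inside a scikit-learn pipeline.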

  • @yogeshsapkal2593
    @yogeshsapkal2593 2 years ago +1

    Sir, even after taking classes, nobody taught us this concept... thank you sir

  • @siyays1868
    @siyays1868 2 years ago

    Thank you so much for clearing up the encoding concepts. Very good explanation with examples.

  • @zkhan2023
    @zkhan2023 3 years ago +2

    Sir, you are doing a great job

  • @saumyashah6622
    @saumyashah6622 3 years ago +4

    "Whenever we are doing a project, instead of train_test_split, we should always do k-fold cross validation." Sir, is my thinking correct? If it is wrong, please correct me.

  • @alimuiz5328
    @alimuiz5328 1 month ago

    Thank you for the great video, sir.
    I wanted to ask wouldn't it be better to encode the data before splitting it? This way we don't have to transform the train and test sets individually.

  • @sumitb2015
    @sumitb2015 2 years ago +1

    Excellent explanation 👍

  • @debasissahoo7559
    @debasissahoo7559 9 months ago

    Great effort 👌 I appreciate you, God bless ❤

  • @_Mahesh-nh7xv
    @_Mahesh-nh7xv 2 months ago

    Best explanation ever

  • @user-wk8fh2ub8b
    @user-wk8fh2ub8b 10 months ago

    You Are Really Great Sir

  • @narendraparmar1631
    @narendraparmar1631 8 months ago

    Great Content
    Thank You😀

  • @SACHINKUMAR-px8kq
    @SACHINKUMAR-px8kq 1 year ago

    Thanks Sir for this Amazing Session

  • @kushagalashravanthi-go3sg
    @kushagalashravanthi-go3sg 1 year ago

    Super explanation sir❤

  • @heetbhatt4511
    @heetbhatt4511 11 months ago

    Thank you sir

  • @arman_shekh97
    @arman_shekh97 3 years ago

    I thought there wouldn't be a video today, but thank you

  • @talkswithRishabh
    @talkswithRishabh 2 years ago

    Too good content sir, it is helping me a lot

  • @aditirawat9841
    @aditirawat9841 2 years ago

    recommend these tutorials to aspiring data scientist

  • @sandipansarkar9211
    @sandipansarkar9211 1 year ago

    finished watching and coding

  • @user-px7de6up2m
    @user-px7de6up2m 7 months ago

    Sir, please make a video on high-cardinality categorical values

  • @meenalpande
    @meenalpande 1 year ago

    Nice explanation

  • @chetanchavan647
    @chetanchavan647 1 year ago

    Best

  • @HimanshuSharma-we5li
    @HimanshuSharma-we5li 2 years ago +1

    It would be great if there were a dataset link in all the videos.

  • @Sumitrawat112
    @Sumitrawat112 1 year ago +1

    Can we perform label encoding and ordinal encoding before the train test split?

  • @lol-ki5pd
    @lol-ki5pd 1 month ago

    oe = OrdinalEncoder(categories=[['Poor','Average','Good'],['School','UG','PG']])
    When we already have this defined, why do we need to call oe.fit(X_train)? I mean, how will it actually help when all the calculation was done on oe in the first line?
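
A small sketch related to the question above, assuming a toy `X_train` with the same two columns. It suggests that the constructor only stores the `categories` parameter, while `fit` checks it against the data and creates the fitted attribute `categories_` that `transform` relies on.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Toy training data (assumed), columns in the same order as the category lists.
X_train = pd.DataFrame({
    "review": ["Poor", "Good", "Average"],
    "education": ["UG", "School", "PG"],
})

oe = OrdinalEncoder(categories=[["Poor", "Average", "Good"],
                                ["School", "UG", "PG"]])

# Before fit: the parameter is stored, but no fitted state exists yet.
fitted_before = hasattr(oe, "categories_")

# fit() validates the lists against the columns of X_train and builds
# categories_, which transform() uses for the actual lookup.
oe.fit(X_train)
fitted_after = hasattr(oe, "categories_")

encoded = oe.transform(X_train)
print(fitted_before, fitted_after)
print(encoded)
```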

  • @geethanshr
    @geethanshr 2 months ago

    At 16:29 why didn't we convert our transformed numpy array to dataframe?

  • @saakshidikshit
    @saakshidikshit 6 months ago

    Can somebody explain to me what order should be followed in an ML project? For example, should feature scaling be applied first, or should categorical data be encoded first? I would be extremely grateful if someone could clarify. Thanks.

  • @evergreenonce5456
    @evergreenonce5456 1 year ago

    11:18 *Encoding to Categorical Features*

  • @MuhammadJunaid-yr8jd
    @MuhammadJunaid-yr8jd 1 year ago

    thank you so much

  • @ajitchaturvedi4052
    @ajitchaturvedi4052 1 year ago

    Please make a video on neural architecture search

  • @kingR-p6n
    @kingR-p6n 4 days ago

    ValueError: Shape mismatch: if categories is an array, it has to be of shape (n_features,). I'm getting this error after I run oe.fit(X_train). Can anyone help me solve this problem?
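
One likely cause of this error, sketched under the assumption that `X_train` holds more columns than there are category lists: `categories` needs exactly one inner list per column passed to `fit`, in column order. The toy frame below is hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Toy data (assumed): three columns, but only two of them are ordinal.
X_train = pd.DataFrame({
    "age": [30, 45, 22],
    "review": ["Poor", "Good", "Average"],
    "education": ["UG", "School", "PG"],
})

oe = OrdinalEncoder(categories=[["Poor", "Average", "Good"],
                                ["School", "UG", "PG"]])

# Two category lists but three features: fit() rejects the mismatch.
try:
    oe.fit(X_train)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Passing only the columns the lists describe, in the same order, works.
encoded = oe.fit_transform(X_train[["review", "education"]])
print(encoded)
```

The same error also appears when `categories` is a flat list such as `['Poor', 'Average', 'Good']` rather than a list of lists.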

  • @sid_x_18
    @sid_x_18 9 months ago

    Why do we even do Label Encoding on the target column? I mean, that is essentially just 0s and 1s, right? So why can't we just create dummies? What's the logic behind using Label Encoding here?

  • @mohitkushwaha8974
    @mohitkushwaha8974 1 year ago

    Doubt
    1. Can't we use ordinal encoding and label encoding before the X_train and X_test split?
    It would have been easier to do the encoding before the split.
    2. Can't we use the replace function of pandas, e.g. replace yes and no with 1 and 0, and replace poor, avg and good with values like 0, 1, 2?

  • @user-qp9fj3vv8n
    @user-qp9fj3vv8n 6 months ago

    Hello sir, which lecture has the introduction to sk learn library?

  • @manikantareddy298
    @manikantareddy298 1 year ago

    What if there are null values in education column and then how should we start the process?
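
One possible way to handle this, sketched with assumed toy data: impute the missing labels first, then encode. Using `SimpleImputer` with `strategy="most_frequent"` is an illustrative choice here, not something stated in the video.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder

# Toy column (assumed) with one missing value.
X_train = pd.DataFrame({"education": ["UG", np.nan, "PG", "UG", "School"]})

# Step 1 fills NaN with the most frequent label ("UG" in this toy data);
# step 2 encodes the completed column with an explicit order.
pipe = make_pipeline(
    SimpleImputer(strategy="most_frequent"),
    OrdinalEncoder(categories=[["School", "UG", "PG"]]),
)
encoded = pipe.fit_transform(X_train)
print(encoded.ravel())
```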

  • @tusharkhatri5795
    @tusharkhatri5795 1 year ago

    I have one doubt. After the train test split we fit on the training data but transform both the training and test data. Suppose this were standardization: if we fit on the train data we get its mean and variance, so how can we transform the test data using the train mean and variance? I mean, the test data should be independent of the train data; there shouldn't be any relationship between them, to prevent data leakage. So shouldn't we calculate a separate mean and variance for train and test, and fit-transform each individually? Please clarify.
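
A short sketch of the usual convention, with synthetic data as an assumption. Leakage is information from the test set influencing training; reusing the train mean and variance on the test set goes the other way and mirrors deployment, where new rows arrive without statistics of their own.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic single-feature data (assumed for illustration).
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(100, 1))

X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)

scaler = StandardScaler()
# The mean and variance are learned from the training rows only.
X_train_scaled = scaler.fit_transform(X_train)
# The test rows are scaled with those same training statistics.
X_test_scaled = scaler.transform(X_test)

print(scaler.mean_, scaler.var_)
```

Fitting a second scaler on the test set would put train and test on different scales, so a model trained on one would see shifted inputs on the other.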

  • @arshad1781
    @arshad1781 3 years ago

    I understood all of this, but we need a video on how to do analysis after encoding, and how to change the final result back into male/female or yes/no. After 2 or 3 videos, please also make a practical project video on them. The problem is that the data has been transformed; now how do we analyze it? How do we find out from the final output that this is male?

  • @promitdutta3029
    @promitdutta3029 10 months ago

    Why can't label encoding be used to transform input columns?

  • @ParallelUniverse550
    @ParallelUniverse550 7 months ago

    In label encoding, how would the object know whether to map 0 to 'No' and 1 to 'Yes', as we didn't specify?
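
A quick sketch on this point, with toy labels as an assumption: `LabelEncoder` sorts the unique labels and numbers them in that order, so 'No' comes before 'Yes'. The learned mapping is exposed as `classes_`, and `inverse_transform` converts codes back to labels.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
# Labels are sorted during fit, so 'No' -> 0 and 'Yes' -> 1.
y_encoded = le.fit_transform(["Yes", "No", "No", "Yes"])

# classes_ holds the sorted labels; the position of each label is its code.
mapping = {label: code for code, label in enumerate(le.classes_)}

# inverse_transform recovers the original labels from integer codes.
decoded = le.inverse_transform([0, 1])

print(mapping)
print(y_encoded, decoded)
```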

  • @user-wj8my7hw9x
    @user-wj8my7hw9x 8 months ago

    Does it matter if the output column is ordinal or nominal before applying label encoding? How to do encoding of categorical feature column with high cardinality? Please help me

  • @maramreddysrikanth5464
    @maramreddysrikanth5464 10 months ago

    When ordinal encoding or one-hot encoding is done using a column transformer, the column order of the output array changes. I mean, encoding was done on the 5th column, but after the transformation it appears as the 1st column in the array. Is there any solution?
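
A sketch of why the order changes and one way to restore it, assuming a toy two-column frame. `ColumnTransformer` concatenates outputs in the order the transformers are listed, and passthrough columns from `remainder` are appended at the end.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder

# Toy data (assumed): 'age' comes first, 'review' second.
df = pd.DataFrame({
    "age": [30, 45, 22],
    "review": ["Poor", "Good", "Average"],
})

ct = ColumnTransformer(
    transformers=[
        ("ord", OrdinalEncoder(categories=[["Poor", "Average", "Good"]]), ["review"]),
    ],
    remainder="passthrough",
)

# Output order: transformed 'review' first, passthrough 'age' last.
out = ct.fit_transform(df)

# Label the array in its actual output order, then reorder as needed.
restored = pd.DataFrame(out, columns=["review", "age"])[["age", "review"]]
print(out)
print(restored)
```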

  • @Star-xk5jp
    @Star-xk5jp 7 months ago

    day2-date:10/1/24

  • @yashjain6372
    @yashjain6372 1 year ago

    loved it

  • @piyushnirwan6298
    @piyushnirwan6298 3 years ago +1

    Don't we have to convert the array output into a DataFrame after the transformation is done?

    • @campusx-official
      @campusx-official 3 years ago

      Not required

    • @shreejanshrestha1931
      @shreejanshrestha1931 3 years ago

      I think sir did that in previous videos just to help us visualize the NumPy array in a better form.
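
As the replies note, scikit-learn estimators accept NumPy arrays directly, so the conversion is optional. For anyone who wants labeled output for inspection, a minimal sketch with assumed toy data:

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Toy training data (assumed) with a non-default index.
X_train = pd.DataFrame(
    {"review": ["Poor", "Good", "Average"], "education": ["UG", "School", "PG"]},
    index=[10, 11, 12],
)

oe = OrdinalEncoder(categories=[["Poor", "Average", "Good"],
                                ["School", "UG", "PG"]])
arr = oe.fit_transform(X_train)  # plain NumPy array, no labels

# Reattach the original column names and row index for readability.
X_train_encoded = pd.DataFrame(arr, columns=X_train.columns, index=X_train.index)
print(X_train_encoded)
```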

  • @kamilshaikh1602
    @kamilshaikh1602 2 years ago

    What should we do if the number of ordinal features is high? I have 40 such features.

  • @taruchitgoyal3735
    @taruchitgoyal3735 1 year ago

    Hello Sir,
    Thank you for the session. Can we extend concept of ordinal encoding on numeric column such as Age?
    Like in your dataset at 11:45, the values of column Age are: 98, 16, 53, 69, 77.
    With more number of records we will have more number of distinct values under the column and at maximum we can have 100 values.
    Thus, if we classify the numeric values into categories will that not help to make our data analysis and ML model better?
    For example: We can have a category Teenager for all Age values from 13 to 19, College students: 20 to 23, Young professionals: 24 to 30, Mid age: 31 to 65 and Senior citizen: 66 to 99. And then finally apply Ordinal encoding on these categories since now we will have order among the classified values.
    It would be very helpful sir to seek your views on the above.
    Thank you
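
The binning described above can be sketched with `pd.cut`. The bin edges are taken from the ranges in the comment; the variable names are assumptions. Whether binning actually helps depends on the model, since it discards the within-bin differences in age.

```python
import pandas as pd

# Age values quoted in the comment above.
ages = pd.Series([98, 16, 53, 69, 77], name="age")

# Right-inclusive bins: (12, 19], (19, 23], (23, 30], (30, 65], (65, 99].
bins = [12, 19, 23, 30, 65, 99]
labels = ["Teenager", "College student", "Young professional",
          "Mid age", "Senior citizen"]

# pd.cut with labels returns an ordered categorical.
age_group = pd.cut(ages, bins=bins, labels=labels)

# The category codes follow the label order, giving an ordinal encoding.
codes = age_group.cat.codes
print(age_group.tolist())
print(codes.tolist())
```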

  • @akshatbhoir1072
    @akshatbhoir1072 1 year ago

    Sir, if there is yes/no data in the dataset, which encoding should be used?
    Please clear this doubt of mine.

  • @MRAgundli
    @MRAgundli 3 months ago

    done

  • @tejaskamble8731
    @tejaskamble8731 6 months ago

    ❤🔥🔥

  • @monikrayu2546
    @monikrayu2546 1 month ago

    You can say that, sir 3:02

  • @tarunchauhan2339
    @tarunchauhan2339 1 year ago

    In ordinal encoding an error is raised: Shape mismatch: if categories is an array, it has to be of shape (n_features,)
    Can anyone resolve this, please?

  • @satyampandey8650
    @satyampandey8650 3 years ago

    Sir, then which encoder should we apply to features that are not ordinal?

  • @subhajitdey4483
    @subhajitdey4483 1 year ago

    Sir, what happens if the output is categorical but nominal? Should I apply Label Encoding there as well? What I mean is: if the output data is categorical, whether nominal or ordinal, should I apply Label Encoding in both cases?
    Thank you for this video🙂

  • @aj_ai
    @aj_ai 1 year ago +1

    👾👾👾

  • @darshedits1732
    @darshedits1732 9 months ago

    Sir, the CSV file is not downloading.
    Please help me, it's urgent.

  • @user-vh2pd7us9z
    @user-vh2pd7us9z 1 year ago

    How do I download the dataset from your GitHub? It shows "raw file download" but does not download.
    Please help, anyone.

  • @ajaykushwaha-je6mw
    @ajaykushwaha-je6mw 3 years ago

    I got the concept, but all the information is in arrays. Do we need to convert them into a DataFrame and merge them to proceed further?

  • @annyd3406
    @annyd3406 2 years ago

    11:20 to 12:10 - why column transformer

  • @osho_magic
    @osho_magic 1 year ago

    No amount of praise is enough...

  • @tradingbrothers1126
    @tradingbrothers1126 1 year ago

    Can't find it on Kaggle

  • @harshkondkar3193
    @harshkondkar3193 2 years ago

    How to deal with the situation where there are unseen categories in the test data?

    • @rachitsingh4913
      @rachitsingh4913 1 year ago

      It's always good to apply encoding without the train test split.

    • @anjushac9307
      @anjushac9307 10 months ago

      The encoders have additional parameters that you can set to decide what to do in case unseen categories are encountered in the test data. You can check the documentation for more details.

    • @harshkondkar3193
      @harshkondkar3193 10 months ago

      @@anjushac9307 will check the doc. Thanks!!
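
One such parameter, sketched with assumed toy data: in recent scikit-learn versions `OrdinalEncoder` accepts `handle_unknown="use_encoded_value"` together with `unknown_value`, so unseen test categories get a sentinel code instead of raising an error. The label 'PhD' and the sentinel -1 are illustrative choices.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Toy data (assumed): 'PhD' appears only in the test set.
X_train = pd.DataFrame({"education": ["School", "UG", "PG"]})
X_test = pd.DataFrame({"education": ["UG", "PhD"]})

oe = OrdinalEncoder(
    categories=[["School", "UG", "PG"]],
    handle_unknown="use_encoded_value",  # do not raise on unseen labels
    unknown_value=-1,                    # sentinel code for unseen labels
)
oe.fit(X_train)

encoded_test = oe.transform(X_test)
print(encoded_test)
```

`OneHotEncoder` offers a comparable `handle_unknown="ignore"` option, which encodes an unseen category as all zeros.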

  • @tradingbrothers1126
    @tradingbrothers1126 1 year ago

    Sir, please upload the dataset

  • @harshmishra7774
    @harshmishra7774 2 years ago

    Engg branch should be the example of ordinal data 🤣

  • @user-vh2pd7us9z
    @user-vh2pd7us9z 1 year ago

    Please help anyone

  • @1981Praveer
    @1981Praveer 2 years ago

    Q. If we have a big dataset. let's say Housing_price.csv(from Kaggle), then how would I know which column has ordinal data? is there any API to check? @CampusX #CampusX

  • @Ganeshjadhav2808
    @Ganeshjadhav2808 2 years ago

    thank you sir