Live Discussion On Handling Imbalanced Dataset- Machine Learning

  • Published: 27 Nov 2024

Comments • 91

  • @nothing_to_love
    @nothing_to_love 9 months ago +1

    Appreciate your sharing, sir! It's 2024 and your video is still extremely helpful. Thanks, sir.

  • @darant
    @darant 4 years ago +7

    Hi Krish,
    Thank you so much for everything that you are offering us free of cost!

  • @oo_wais
    @oo_wais 2 years ago

    Thank you, Krish. I am working on a project that has an imbalanced target variable. This video really helped me out.

  • @mahmoudbabiker75
    @mahmoudbabiker75 1 year ago

    We are ...greatly indebted to you

  • @yogenderkushwaha5523
    @yogenderkushwaha5523 4 years ago +14

    Thank you sir, always feel motivated after seeing your enthusiasm for data science. Learning a lot from you ❤️

  • @DeadTalkLive
    @DeadTalkLive 4 years ago +1

    Good video ♥♥! As a current YouTuber, I am on the lookout for creative ideas! Nice job!

  • @abhinavmahajan448
    @abhinavmahajan448 2 years ago

    Thanks for the informative video

  • @eduhomebyshubh5445
    @eduhomebyshubh5445 4 years ago +1

    Best machine learning tutorials sir

  • @str7749
    @str7749 2 years ago

    Thank you v.much !

  • @bhavindedhia9968
    @bhavindedhia9968 4 years ago +1

    I always feel motivated and encouraged when a new video comes out.

  • @SUPRIYASUMAN-qg7qk
    @SUPRIYASUMAN-qg7qk 1 year ago

    Hello Krish,
    Could you please share your video on handling data imbalance in deep learning models? It would indeed be a great help.

  • @ankitg200
    @ankitg200 4 years ago

    Very nice video. The greatest thing is that we get to know what is currently used in industry, not just what is bookish.

  • @sandipansarkar9211
    @sandipansarkar9211 3 years ago

    finished practicing code

  • @niranjannahak89
    @niranjannahak89 2 years ago +1

    Due to an update in the imblearn version, fit_sample throws an error. So I used X_train_ns, Y_train_ns = ns.fit_resample(X_train, Y_train) and it works fine for me.

    • @ammar46
      @ammar46 2 years ago

      Hey, is SMOTETomek taking a long time?

  • @diycollection96
    @diycollection96 2 years ago

    Really helpful video. Sir, I just want to know how you selected arange(-2,3)?

  • @sambitnath9853
    @sambitnath9853 3 years ago

    Always a pleasure to watch your videos sir 👍

  • @sandipansarkar9211
    @sandipansarkar9211 3 years ago

    finished watching

  • @tejaswinimanne7539
    @tejaswinimanne7539 1 year ago

    Sir can you upload video about how to predict earthquake using Naive bayes

  • @cliffordtarimo1511
    @cliffordtarimo1511 3 years ago

    Learned a lot from this!! Thanks man

  • @jupinsgil
    @jupinsgil 2 years ago

    godlike!

  • @lovejazzbass
    @lovejazzbass 4 years ago +1

    Thank you so much Krish. I have two teachers on YouTube. Krish and Harshit!

  • @lujeinalsheikh8316
    @lujeinalsheikh8316 2 years ago

    Hi Krish, would it be possible to make a video about how class weights are used to perform a node split in a weighted decision tree?

  • @talhasaleem8333
    @talhasaleem8333 2 years ago

    The precision is gone #UnderSampling 🤣 That laugh

  • @sucharitha9365
    @sucharitha9365 4 years ago +1

    Nicely explained, sir

  • @InovateTechVerse
    @InovateTechVerse 3 years ago

    Thank you so much for the session.

  • @natarajanlalgudi
    @natarajanlalgudi 4 years ago

    Great lecture thanks as always Krish

  • @shylashreedev2685
    @shylashreedev2685 2 years ago

    Don't we have a GPay option for joining the membership? It is asking for a card number.

  • @seeutube8860
    @seeutube8860 2 years ago

    After applying 'ns', is resplitting of undersampled dataset necessary or not? Here, after balancing, model was used on original train and test dataset.

  • @sameerkumar6431
    @sameerkumar6431 4 years ago

    Hi Krish, can you please make a video on multivariate time series forecasting models?

  • @tirthadatta7368
    @tirthadatta7368 2 years ago

    Sir, is there any video of imbalanced image dataset handling in CNN? or can we use the basic of this live tutorial for image classification purpose??

    • @new_edition3149
      @new_edition3149 2 years ago

      Did you get good accuracy (precision and recall) on the imbalanced image dataset?

    • @new_edition3149
      @new_edition3149 2 years ago

      Actually, I'm also looking for the answer to the same question.

  • @riteshmukhopadhyay6922
    @riteshmukhopadhyay6922 2 years ago +2

    Hello Krish,
    I have been following your data analytics videos throughout. I have completed the Live EDA and Feature Engineering playlists, then I started following this playlist. I am quite overwhelmed by the sudden introduction of ML and other models which I have no clue about.
    Can you please tell me the playlist I should follow first to get the basic understanding of what you are teaching here?
    Thanks for your effort,

  • @simranade1106
    @simranade1106 3 years ago

    If the data is imbalanced in a regression problem, how do we handle it?

  • @heyrobined
    @heyrobined 3 years ago

    random-forest class weights example: 40:00
    undersampling : 43:25

  • @rambaldotra2221
    @rambaldotra2221 3 years ago

    Even after installing imblearn, it's giving me a "Module not found" error. Can someone help me please? I am stuck because of this error.

  • @AmitSingh-co7ci
    @AmitSingh-co7ci 4 years ago

    Hi Sir, I am getting a similar kind of error while using the oversampling method.
    from imblearn.over_sampling import RandomOverSampler
    os = RandomOverSampler(0.5)
    x_train_ns, y_train_ns = os.fit_sample(x_train, y_train)
    print("The number of classes before fit{}".format(Counter(y_train)))
    print("The number of classes after fit{}".format(Counter(y_train_ns)))
    error:
    ---------------------------------------------------------------------------
    AttributeError Traceback (most recent call last)
    in
    2
    3 os = RandomOverSampler(0.5)
    ----> 4 x_train_ns, y_train_ns = os.fit_sample(x_train, y_train)
    5 print("The number of classes before fit{}".format(Counter(y_train)))
    6 print("The number of classes after fit{}".format(Counter(y_train_ns)))
    c:\users\asing053\appdata\local\programs\python\python38-32\lib\site-packages\imblearn\base.py in fit_resample(self, X, y)
    75 check_classification_targets(y)
    76 arrays_transformer = ArraysTransformer(X, y)
    ---> 77 X, y, binarize_y = self._check_X_y(X, y)
    78
    79 self.sampling_strategy_ = check_sampling_strategy(
    c:\users\asing053\appdata\local\programs\python\python38-32\lib\site-packages\imblearn\over_sampling\_random_over_sampler.py in _check_X_y(self, X, y)
    77 def _check_X_y(self, X, y):
    78 y, binarize_y = check_target_type(y, indicate_one_vs_all=True)
    ---> 79 X, y = self._validate_data(
    80 X, y, reset=True, accept_sparse=["csr", "csc"], dtype=None,
    81 force_all_finite=False,
    AttributeError: 'RandomOverSampler' object has no attribute '_validate_data'
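
An AttributeError about `_validate_data` like the one above often points to mismatched imbalanced-learn and scikit-learn versions, or to a notebook kernel that was not restarted after installing (a reply further down this thread suggests the restart). As a hedged first diagnostic, the sketch below prints the installed versions using only the standard library; `installed_version` is a hypothetical helper name, and it assumes the packages are registered as `imbalanced-learn` and `scikit-learn`.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print both versions side by side so a mismatch is easy to spot.
for name in ("imbalanced-learn", "scikit-learn"):
    print(name, installed_version(name))
```

If either line prints None, the package is not installed in the environment the notebook is actually using.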

  • @md.muntasirulhoque8563
    @md.muntasirulhoque8563 3 years ago

    best sir

  • @Thedark.i
    @Thedark.i 4 years ago +3

    Hey sir, about the one question you asked in your virtual interview: "A case study which has a 0 or 1 dependent variable, and they are further divided into subcategories." Can you please answer that question and show how we can do it?

  • @netrachinival6249
    @netrachinival6249 3 years ago +2

    Hello sir, I am new to machine learning. I don't know where I should start. Can you please give a list or syllabus? I have seen all your vt classes; they are great, sir. Thank you so much.

    • @adipurnomo5683
      @adipurnomo5683 3 years ago

      Start by learning statistics and probability first.

  • @ahiyaahammed3643
    @ahiyaahammed3643 2 years ago

    Hi Krish,
    Thank you for your videos can you please do a session on Handling Imbalanced Image Dataset (Medical dataset if possible)?

  • @virtuous_views
    @virtuous_views 4 years ago +1

    In fraud classification, false negatives should be more important right and that means we should focus on our recall score. Am I correct??
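
The link between false negatives and recall raised in this comment can be made concrete with a small calculation. This is a minimal sketch with made-up counts, not results from the video: recall = TP / (TP + FN) falls as more frauds are missed, while precision = TP / (TP + FP) falls as more false alarms are raised.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: 80 frauds caught, 120 false alarms, 20 frauds missed.
precision, recall = precision_recall(tp=80, fp=120, fn=20)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case the model catches 80% of the frauds (recall 0.80), although only 40% of its alerts are real frauds (precision 0.40).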

  • @nagnathsatav9978
    @nagnathsatav9978 4 years ago +3

    Hi Krish, I want to know how to handle an imbalanced problem with more than 2 classes.

  • @taniyabanerjee2609
    @taniyabanerjee2609 3 years ago +1

    I really appreciate what you are trying to do. But it would be much better if you actually answered the questions raised and explained why and how something is happening. You ask if it's clear, and I see people asking questions, but sadly you avoid all the questions. Sometimes I have the same questions raised by others, but we have to go to some other tutor to learn about it. Nevertheless, you give us content to follow through, thank you :)

  • @ammarkhan2611
    @ammarkhan2611 4 years ago +1

    What are the default parameters used by a Random Forest Classifier (tree depth, number of trees used, number of variables used at each step) in Python?

  • @sajidchoudhary1165
    @sajidchoudhary1165 4 years ago

    Sir, please make a video on the mathematics behind SVM Regression, AdaBoost Regression, and Gradient Boost Classification.

  • @joelbraganza3819
    @joelbraganza3819 4 years ago

    When exactly in the pipeline should the imbalanced data be balanced?
    Is it before we begin any feature analysis, feature selection and other pre-processing techniques? Because many time Outlier analysis and removal methods will rule out some good data points in a variable, counting them as outliers when in fact they are just unbalanced data-points associated with the minority class.
    OR, should we balance the data just before predictive modelling for the sake of getting unbiased models & result? Let me know, please.

    • @nischalneupane3
      @nischalneupane3 3 years ago

      By “handling” imbalanced dataset, you are not really “transforming” the dataset as you would in Feature Engineering pipeline. It is part of Model Training/Tuning. You do not “clean” an imbalanced dataset, and it’s perfectly natural for datasets to be imbalanced.

  • @ARUNADEVIRUIT
    @ARUNADEVIRUIT 4 years ago +2

    Hi sir, my dataset has 13 lakh (1.3 million) records for class a and 2k records for class b. I tried SMOTE, random undersampling, grid search, random forest, decision tree, everything, but the columns have little correlation with the target and I'm getting a very low score for class b.

  • @ammumammu6677
    @ammumammu6677 3 years ago

    Hi sir ,please do data processing using CLI in machine learning

  • @AlgoTribes
    @AlgoTribes 4 years ago +1

    Krish bhai, please upload this to Kaggle with the random forest model. This model would rank very high on the Kaggle score. Please do it and show us, it's a request.

  • @hailayteklehaymanot2658
    @hailayteklehaymanot2658 3 years ago

    If the training data has only “negative class”, whereas testing data has both the classes; “negative” and “positive”. what kind of algorithm shall we apply ?

  • @rich007p
    @rich007p 3 years ago

    (y) Great

  • @2728jay
    @2728jay 3 years ago +1

    How to handle imbalance data for multi class classification problem which has only text column as feature

  • @SatyamKumar-dj3jo
    @SatyamKumar-dj3jo 4 years ago +1

    ValueError: Logistic Regression supports only penalties in ['l1', 'l2', 'elasticnet', 'none'], got 11.

  • @SnehaSingh-ts7oi
    @SnehaSingh-ts7oi 3 years ago

    Hi Sir. Why have you not used SVM? I have read that it's a very popular algorithm.

  • @rajputjay9856
    @rajputjay9856 4 years ago +2

    AttributeError: 'NearMiss' object has no attribute '_validate_data' .... The error comes from a version difference, sir.

    • @rajputjay9856
      @rajputjay9856 4 years ago +1

      This error generally comes when we have freshly installed the library. You need to shut everything down and start Jupyter Notebook once again, and that solves the error.

  • @devarakondahimaja8423
    @devarakondahimaja8423 3 years ago

    Difference between smote sampling and adasyn sampling?

  • @bdrcmym
    @bdrcmym 3 years ago

    Sir can you explain,how to create our own model

  • @Bunny2.O
    @Bunny2.O 3 years ago

    My model is predicting a high number of false alerts and a low number of true alerts. I am new to ML and my data is imbalanced. Can you please suggest which models are good for my data? When I check the model live, the positive class count is also low, about 15, while the negative class count is high, about 3665.

  • @priyankamehta2827
    @priyankamehta2827 4 years ago

    I have 1244 obs in class a and 244 obs in class b. My algorithm is classifying everything in one class. How should i rectify it? I tried logistic regression, svm, random forest.. Same problem

  • @etsutina9594
    @etsutina9594 3 years ago

    I have a different number of rows for each year of my 17 years of data: one year has 800 rows and another has 300. How can I make the row counts equal for my time series prediction?

  • @priyankamehta2827
    @priyankamehta2827 4 years ago

    I have 1244 obs in class a and 244 obs in class b. My algorithm is classifying everything in one class. How should i rectify it?

  • @MechiShaky
    @MechiShaky 4 years ago +1

    Krish sir, if our targets are regression values instead of a classification problem, how do we check whether the data is imbalanced?

    • @vineet094
      @vineet094 4 years ago +1

      Why will a regression problem have imbalance ? Give me an example? Regression problems do not have imbalance!!!! They are real values.

  • @avibitm
    @avibitm 3 years ago

    @krish: what if there is imbalance in the data in a single attribute rather than the class attribute?

  • @vijaykumar-yq7sf
    @vijaykumar-yq7sf 4 years ago +1

    Don’t go to usa on H1 or F1.
    Reason
    1. Forget about green card
    2. Consultancy and tax will take 50% of ur salary.
    3. Very low savings and expensive.
    4. I have seen people working in software and being homeless for years.
    5. Big problem u cannot bring ur parents for sure.
    6. Parents health insurance is big problem.
    7. Do any small business in India don’t come to usa.
    Sorry for hard words, but it’s truth.

  • @Naveenkumar-xc9ms
    @Naveenkumar-xc9ms 4 years ago

    Hello sir, where can I find videos for a spam detection project?

  • @srikeshhp7541
    @srikeshhp7541 4 years ago

    Hi Sir,Please upload a video on detailed explanation to crack Google Summer of Code,Please Sir...Thank You..

  • @omkarsutar578
    @omkarsutar578 4 years ago

    I don't understand, Why this video in feature engineering playlist

  • @deutschvalley3574
    @deutschvalley3574 3 years ago

    How We can save the data after under sampling or over sample ?? If possible kindly sir give me code 👩‍💻

  • @samarendrakumarsinha8898
    @samarendrakumarsinha8898 4 years ago

    LOGIC FOR 1000 MILES /HOUR RAILWAY
    ENGINE

  • @tahirullah4786
    @tahirullah4786 3 years ago

    I got this Error how can I solve it.....ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.
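
The error quoted in this comment says the lbfgs solver does not accept an l1 penalty. One hedged way around it is to search only over solver and penalty pairs that are compatible. The sketch below builds such a grid in plain Python; `SOLVER_PENALTIES` and `build_param_grid` are hypothetical names, the table lists only a subset of scikit-learn's solvers, and the C values are illustrative assumptions.

```python
# Penalties accepted by a few LogisticRegression solvers (a partial table;
# lbfgs is limited to l2, as the quoted error message indicates).
SOLVER_PENALTIES = {
    "lbfgs": ["l2"],
    "liblinear": ["l1", "l2"],
    "saga": ["l1", "l2"],
}


def build_param_grid(solvers, c_values):
    """Build one grid dict per solver so invalid pairs are never tried."""
    return [
        {"solver": [s], "penalty": SOLVER_PENALTIES[s], "C": list(c_values)}
        for s in solvers
    ]


# Illustrative C values; any positive floats would do.
grid = build_param_grid(["lbfgs", "liblinear"], [0.01, 0.1, 1.0, 10.0, 100.0])
print(grid)
```

A list of dicts in this shape can be passed as `param_grid` to scikit-learn's `GridSearchCV`. For a single model, passing `solver="liblinear"` together with `penalty="l1"` to `LogisticRegression` avoids the same error.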

  • @utkarshupadhyay1485
    @utkarshupadhyay1485 3 years ago

    why you have uploaded this video on Feature Engineering ->

  • @yajnabopaiah8616
    @yajnabopaiah8616 3 years ago

    The concepts could have been explained better rather just focusing on hands on session with the technique