Implementing machine learning in Python|How to Implement Machine Learning In Python

  • Published: Jan 8, 2025

Comments • 80

  • @rameshmamilla5392
    @rameshmamilla5392 5 years ago +7

    Very good explanation, Aman, thanks! I have some doubts; could you please explain?
    1. Why does the data have missing values and outliers? How can these end up in the data?
    2. So, whichever outliers are present in the data, will they always be replaced with the 99th percentile only?
    3. In the formula for finding outliers, why do we use the value 1.5 and not some other value?

    • @UnfoldDataScience
      @UnfoldDataScience  5 years ago +8

      +ramesh mamilla Hi Ramesh, these are all good questions. Thanks for asking.
      1. There can be any number of reasons for missing data and outliers. For example, say you capture data from a sensor or IoT device. The device might not work properly at times, so the captured data can be missing or wrong. Another example is data that was never entered in the system: when you are given a form to fill in, you often fill only the mandatory fields :)
      2. It can be the 95th and 5th percentiles, or other values as well; it depends on the distribution, etc.
      3. 1.5 is a conventional boundary defined by statisticians and is widely accepted for outlier detection.
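
A minimal sketch of the 1.5 × IQR rule from point 3, assuming NumPy is available. The `ages` values and the `iqr_bounds` helper are hypothetical and not taken from the video. For roughly normal data the 1.5 multiplier places the fences about 2.7 standard deviations from the mean, so around 99.3% of points fall inside; other multipliers (for example 3.0 for "extreme" outliers) are also used.

```python
import numpy as np


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Hypothetical ages; 125 stands in for an obviously wrong entry.
ages = np.array([22, 25, 27, 30, 31, 33, 35, 38, 40, 42, 45, 48, 52, 125])

lower, upper = iqr_bounds(ages)
# Keep only the values that fall outside the fences.
outliers = ages[(ages < lower) | (ages > upper)]
print(lower, upper, outliers)
```

Passing a different `k` to `iqr_bounds` shows how the choice of multiplier widens or narrows the fences.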

    • @rameshmamilla5392
      @rameshmamilla5392 5 years ago +5

      Yeah, agreed, Aman! Thank you for taking the time on this.

  • @milisneha7068
    @milisneha7068 5 years ago +1

    Very nice and clear explanation of model building, Aman. Waiting for the next video :-D Thank you. :)

    • @UnfoldDataScience
      @UnfoldDataScience  5 years ago +1

      +Mili Sneha Thank you, Sneha. Yes, the next steps of model building and deployment are planned.

  • @diwakermishra1749
    @diwakermishra1749 1 year ago

    Thanks continues from my side.....

  • @GSds657
    @GSds657 1 year ago

    very good

  • @abhishekgautam231
    @abhishekgautam231 5 years ago +1

    Aman - I am eagerly looking forward to the questions you raised at the end of the video. Thanks!

    • @UnfoldDataScience
      @UnfoldDataScience  5 years ago +1

      +Abhishek Gautam Thanks Abhishek. Keep me posted with doubts as well.

    • @abhishekgautam231
      @abhishekgautam231 5 years ago

      @@UnfoldDataScience Absolutely, Aman.

  • @kartikbera3799
    @kartikbera3799 4 years ago

    Great explanation

  • @GopiKumar-ny3xx
    @GopiKumar-ny3xx 5 years ago

    Useful information through nice presentation...

  • @gopiramankumar2231
    @gopiramankumar2231 5 years ago

    Good show and all the best, Aman... waiting for the next video

    • @UnfoldDataScience
      @UnfoldDataScience  5 years ago

      +GOPI RAMAN KUMAR Thanks a lot. Yes next video is on the way :)

  • @sudeeplabh52
    @sudeeplabh52 5 years ago

    Useful Information. Thank you .

  • @bbrocks5530
    @bbrocks5530 4 years ago +2

    Could you please add all your videos to their respective playlists? That will help us better.. thanks for good work.. :)

    • @UnfoldDataScience
      @UnfoldDataScience  4 years ago

      Sure, please visit the Playlists tab on my channel, where you will find video playlists under various topics. Let me know if you are looking for anything in particular. Thank you.

    • @bbrocks5530
      @bbrocks5530 4 years ago

      @@UnfoldDataScience yes, but there are videos that are not linked to any playlist, so I am not sure about their order

  • @sudeepkumar8518
    @sudeepkumar8518 5 years ago

    Very useful video

  • @kuppuswamyr4360
    @kuppuswamyr4360 4 years ago +1

    Thank you so much, sir

  • @preranatiwary7690
    @preranatiwary7690 5 years ago

    Great content again! 👌👌

    • @UnfoldDataScience
      @UnfoldDataScience  5 years ago

      +Prerana Tiwary Thanks a lot for your great feedback always.

  • @domnikridershorts5873
    @domnikridershorts5873 5 years ago

    Great things Aman.
    Please create the deployment also. Simply deployment. Small request

    • @UnfoldDataScience
      @UnfoldDataScience  5 years ago +1

      +Nikhil Reddy Hi Nikhil, yes, next are the model building steps and then deployment of the same model. Stay tuned.

  • @kirtisardana8479
    @kirtisardana8479 4 years ago

    Hi Aman... at 6:43 you said the distance in age values between the 75th and 100th percentile is high. Isn't the same true of expense, which also doubles?

    • @UnfoldDataScience
      @UnfoldDataScience  4 years ago

      Hi Kirti, good observation. Yes, if it doubles then the gap is relatively large. We should also look at other percentiles, for example the 25th, 50th, etc.
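
One way to look at several percentiles side by side, as suggested in this reply, is sketched below, assuming pandas. The `age` values are hypothetical and not the dataset from the video; the point is only to compare the gap between neighbouring percentiles.

```python
import pandas as pd

# Hypothetical ages; the last value stands in for a suspicious entry.
age = pd.Series(
    [22, 25, 27, 30, 31, 33, 35, 38, 40, 42, 45, 48, 52, 125], dtype=float
)

# 25th, 50th, 75th percentiles and the maximum (100th percentile).
levels = [0.25, 0.50, 0.75, 1.00]
pcts = age.quantile(levels)

# Difference between each percentile and the one before it.
gaps = pcts.diff().dropna()
print(pcts)
print(gaps)
```

A gap between the 75th percentile and the maximum that is much larger than the earlier gaps is a hint that the upper tail deserves a closer look.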

  • @LahariJakkireddy
    @LahariJakkireddy 1 year ago

    It is easy to understand. Can I know which Python libraries, like numpy, we mostly use in machine learning?

    • @UnfoldDataScience
      @UnfoldDataScience  1 year ago

      Sure, you will use packages like pandas, numpy, matplotlib, seaborn, tensorflow, pytorch, sklearn, scipy, etc.

  • @ehtashamnaseer7380
    @ehtashamnaseer7380 4 years ago +1

    It's an excellent lecture. Can you give a tutorial on geographically weighted regression model implementation in python? It will be a great lecture

  • @ravneetkaur7278
    @ravneetkaur7278 2 years ago +1

    Hi Sir,
    Firstly, you are doing a great job creating and delivering apt content. I am definitely going to recommend this channel to my peers.
    A few queries:
    1. Is NaN always replaced by the median? If not, please discuss the different scenarios and how to choose the best way to treat NA.
    2. Discovering an outlier just by eyeballing the boxplot and then finding the outlier boundaries seems an unreliable approach. Shouldn't we just blindly pick a predictor column, apply the whole process of finding the upper and lower limits (UL and LL), and replace outliers with the 99th/1st quantile? There is no harm in that: if there is an outlier it gets replaced, and if not the data remains unchanged.
    3. Why didn't we check for outliers in the Income and Expense columns? I plotted the boxplot and found an outlier there as well.
    Kindly answer these so that I can scaffold my learning in data science. Thanks!

    • @UnfoldDataScience
      @UnfoldDataScience  2 years ago +1

      Love it when you ask questions. Thanks a lot.
      1. Median is just one way; there are many ways to impute missing values. Search for "missing value imputation unfold data science" on RUclips.
      2. Your approach may work; here I plotted the boxplot for demonstration.
      3. That was just to keep the video short; the idea was to show an approach. We should check those columns and do the necessary treatment.
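
To illustrate point 1, here is a small sketch of three common imputation choices, assuming pandas. The DataFrame, its column names, and its values are hypothetical; which choice suits a given column depends on the data.

```python
import pandas as pd

# Hypothetical data: one numeric and one categorical column with gaps.
df = pd.DataFrame({
    "Age": [25, 30, None, 45, 125],
    "City": ["Pune", None, "Delhi", "Pune", "Pune"],
})

# Median is robust to the extreme value 125.
df["Age_median"] = df["Age"].fillna(df["Age"].median())

# Mean is pulled upward by the same extreme value.
df["Age_mean"] = df["Age"].fillna(df["Age"].mean())

# For a categorical column, the most frequent value (mode) is a common choice.
df["City_mode"] = df["City"].fillna(df["City"].mode()[0])

print(df)
```

Here the median fill is 37.5 while the mean fill is 56.25, which shows why the median is often preferred when outliers are present.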

    • @ravneetkaur7278
      @ravneetkaur7278 2 years ago

      Thank you, Sir, for replying.
      1. OK
      2. Okay, so that means we should make outlier treatment a practice for every predictor?
      3. OK
      Thanks again!

  • @sumeetpansari
    @sumeetpansari 3 years ago

    Very well explained, sir, thanks. Do you have any examples of how to find out whether a problem is classification or regression?

  • @kalam_indian
    @kalam_indian 3 years ago

    Please make a video on the minimum system requirements for implementing a machine learning program.
    Please include hardware configuration, operating system, visualisation software, the best Python framework, the best Python library, etc.
    I will be very grateful to you, Aman sir.

  • @sadhnarai8757
    @sadhnarai8757 5 years ago

    Nice explanation aman.....

  • @ajaynimmala2494
    @ajaynimmala2494 3 years ago

    Hi Aman, thanks for the video. Very well explained...
    1. When we say the 10th, 25th, or 50th percentile of the data, how is this calculated? Are we taking the max age, 125 in this case, dividing it into 100 parts, and calculating the percentile?

  • @arunpandey4710
    @arunpandey4710 5 years ago

    Thanks, Aman, for such a useful video; it is really easy to understand.
    Can we also have a function for cleaning the data?

    • @UnfoldDataScience
      @UnfoldDataScience  5 years ago +1

      Hello, data cleaning requires different steps for different types of data. I have explained some of these in my model training videos. Please watch the end-to-end implementation videos in my playlist.

  • @NehaYadav-hs1po
    @NehaYadav-hs1po 3 years ago

    Hello sir, how do we fill missing values if the column contains a string datatype? How can we apply the median?

    • @UnfoldDataScience
      @UnfoldDataScience  3 years ago

      You can change the datatype to float and find the median.
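
A hedged sketch of that conversion, assuming pandas and assuming the strings actually hold numbers (the sample values are hypothetical). `pd.to_numeric` with `errors="coerce"` turns anything unparseable into NaN, which is then filled with the median. For genuinely categorical text a median is not defined, and the most frequent value is a more usual fill.

```python
import pandas as pd

# Hypothetical column read in as strings, with a gap and a junk entry.
raw = pd.Series(["10", "12", None, "15", "n/a"])

# Convert to numbers; None and "n/a" both become NaN.
numeric = pd.to_numeric(raw, errors="coerce")

# Fill the missing entries with the median of the valid numbers.
filled = numeric.fillna(numeric.median())
print(filled.tolist())
```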

    • @NehaYadav-hs1po
      @NehaYadav-hs1po 3 years ago

      @@UnfoldDataScience okay Sir Got it! thanks

  • @gangadharappa8938
    @gangadharappa8938 3 years ago

    Hello Aman, it's a great video explanation. It clarified so many of my doubts.
    It would be great if you could share the code here.

    • @UnfoldDataScience
      @UnfoldDataScience  3 years ago

      Thanks Ganga.
      drive.google.com/drive/folders/1XdPbyAc9iWml0fPPNX91Yq3BRwkZAG2M

  • @richasharma5949
    @richasharma5949 4 years ago

    Nice video. Could you please explain what determines whether we should choose the 99th or 95th percentile to remove outliers? Any examples?

    • @UnfoldDataScience
      @UnfoldDataScience  4 years ago +1

      Thanks Richa, there is no hard rule for it. It depends, case by case, on what works well for your model.
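
Since there is no hard rule, one practical way to compare the two options is to cap at both and see how much of the data each one changes. The sketch below assumes NumPy; the data and the `cap_at_percentiles` helper are hypothetical.

```python
import numpy as np


def cap_at_percentiles(values, lower_pct, upper_pct):
    """Clip values to the given lower and upper percentiles."""
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, lo, hi)


# Hypothetical data: 1..99 plus one extreme value.
values = np.arange(1, 101, dtype=float)
values[-1] = 1000.0

capped_99 = cap_at_percentiles(values, 1, 99)
capped_95 = cap_at_percentiles(values, 5, 95)

# Count how many points each choice modifies.
changed_99 = int((capped_99 != values).sum())
changed_95 = int((capped_95 != values).sum())
print(changed_99, changed_95)
```

Capping at the 95th/5th percentiles alters more ordinary points than capping at the 99th/1st, so the tighter cut is a trade-off between taming extremes and distorting valid data.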

  • @bbrocks5530
    @bbrocks5530 4 years ago

    Are we getting all the algorithms of supervised and unsupervised learning?

  • @abhishekraut6027
    @abhishekraut6027 3 years ago

    Hello Sir, you have explained this very well,
    but I have one doubt:
    1. Why do you take 0.75 and 0.25 as the quantiles? Can we use other percentiles instead?

  • @nipuquayum3409
    @nipuquayum3409 2 years ago

    Why is the count 14 ? Could you please explain?

  • @xyz1235746
    @xyz1235746 2 years ago

    Shouldn't you replace the outliers with the IQR upper and lower limit?

    • @UnfoldDataScience
      @UnfoldDataScience  2 years ago +1

      We should, as standard ML practice; I probably missed it here because I wanted to show more things in Python in limited time.
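
A minimal sketch of replacing outliers with the IQR limits themselves, assuming pandas. The `expense` values are hypothetical, and this is one possible way to do it rather than the code from the video.

```python
import pandas as pd

# Hypothetical expenses with one extreme value at the end.
expense = pd.Series([10, 12, 13, 14, 15, 16, 18, 60], dtype=float)

q1 = expense.quantile(0.25)
q3 = expense.quantile(0.75)
iqr = q3 - q1

# Standard 1.5 * IQR limits.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Values beyond a limit are replaced by that limit; others are untouched.
capped = expense.clip(lower=lower, upper=upper)
print(lower, upper, capped.tolist())
```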

  • @Rihannaluvs
    @Rihannaluvs 6 months ago

    Unable to find path to my desktop

  • @kashmirachawan4148
    @kashmirachawan4148 4 years ago

    Please post the link to the next part of this video in the replies; I cannot find it. Thank you

    • @UnfoldDataScience
      @UnfoldDataScience  4 years ago

      ruclips.net/video/8PFt4Jin7B0/видео.html.
      Hi Karishma, you can go to playlist section and start watching.

  • @NehaYadav-hs1po
    @NehaYadav-hs1po 3 years ago

    Please share the link to the next video after this one! I am unable to find it, sir

    • @UnfoldDataScience
      @UnfoldDataScience  3 years ago

      Please check here:
      drive.google.com/drive/folders/1XdPbyAc9iWml0fPPNX91Yq3BRwkZAG2M

    • @NehaYadav-hs1po
      @NehaYadav-hs1po 3 years ago

      @@UnfoldDataScience thanks a ton!

  • @NehaYadav-hs1po
    @NehaYadav-hs1po 3 years ago

    Sir, I didn't understand the "finding and treating outliers - both upper and lower end" part!! :(

    • @UnfoldDataScience
      @UnfoldDataScience  3 years ago +1

      Number of hours person X spent on Netflix each week over the last year:
      8, 9, 7, 10, 8, 7, 9, 10, 8, 0, 8, 20, ... (52 numbers in total)
      Here 0 is a lower-end outlier and
      20 is an upper-end outlier.
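
The Netflix example can be checked with the 1.5 × IQR rule using only the Python standard library. This sketch assumes just the 12 values listed in the reply (the remaining weeks are not given), so the limits are illustrative.

```python
from statistics import quantiles

# Only the 12 weekly values quoted above; the other 40 are unknown.
hours = [8, 9, 7, 10, 8, 7, 9, 10, 8, 0, 8, 20]

# Quartiles with linear interpolation over the observed range.
q1, _median, q3 = quantiles(hours, n=4, method="inclusive")
iqr = q3 - q1

lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

low_outliers = [h for h in hours if h < lower]
high_outliers = [h for h in hours if h > upper]
print(lower, upper, low_outliers, high_outliers)
```

With these values the limits come out at 5.5 and 11.5, so 0 is flagged at the lower end and 20 at the upper end.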

    • @NehaYadav-hs1po
      @NehaYadav-hs1po 3 years ago

      @@UnfoldDataScience oh got it thanks

  • @DaughterOfGodJG
    @DaughterOfGodJG 4 months ago

    Python Code and data set?

  • @KumR
    @KumR 1 year ago

    Can u share the code too?

  • @hi_10svideos86
    @hi_10svideos86 4 years ago

    why 0.25 and 0.75 for IQR?, not anything else.

  • @bharathkumar-fh2pi
    @bharathkumar-fh2pi 3 years ago

    Can you please share the code?

    • @UnfoldDataScience
      @UnfoldDataScience  3 years ago

      drive.google.com/drive/folders/1XdPbyAc9iWml0fPPNX91Yq3BRwkZAG2M