Build Your First Machine Learning Project [Full Beginner Walkthrough]

  • Published: Jul 22, 2024
  • We'll learn how to build an end-to-end machine learning project. We'll cover the main steps in building a machine learning project, then walk you through writing the Python code to create the project.
    In the project, we'll try to predict how many medals each country will win in the Olympics using a linear regression model.
    At the end, you'll have a full machine learning project that you can continue working on.
    You can find the README and code here: github.com/dataquestio/projec...
    Chapters
    00:00 Introduction
    00:40 7-step project process
    10:15 Loading the data
    12:10 Data exploration
    18:05 Building our model
    22:30 Measuring error
    26:30 Is the model good?
    34:20 Wrap-up and next steps
    ---------------------------------
    Join 1M+ Dataquest learners today!
    Master data skills and change your life.
    Sign up for free: bit.ly/3O8MDef
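
The workflow described above (train a linear regression on earlier Games, test on later ones, then measure the error) might look roughly like the sketch below. The rows are invented, and the `athletes` and `prev_medals` columns and the 2012 cut-off are assumptions made for illustration; they are not taken from the project's README.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Invented rows standing in for teams.csv; real values will differ.
teams = pd.DataFrame({
    "team": ["USA", "FRA", "ALB"] * 4,
    "year": [2004] * 3 + [2008] * 3 + [2012] * 3 + [2016] * 3,
    "athletes": [530, 310, 7, 590, 320, 11, 530, 330, 10, 550, 390, 6],
    "prev_medals": [97, 38, 0, 101, 33, 0, 112, 43, 0, 104, 35, 0],
    "medals": [101, 33, 0, 112, 43, 0, 104, 35, 0, 121, 42, 0],
})

predictors = ["athletes", "prev_medals"]  # assumed feature columns
target = "medals"

# Train on earlier Games and test on later ones, so the model never sees the future.
train = teams[teams["year"] < 2012]
test = teams[teams["year"] >= 2012].copy()

reg = LinearRegression()
reg.fit(train[predictors], train[target])

# Medal counts cannot be negative or fractional, so clip and round the raw output.
test["predictions"] = reg.predict(test[predictors]).clip(min=0).round()

mae = mean_absolute_error(test[target], test["predictions"])
print(f"Mean absolute error: {mae:.2f} medals")
```

Splitting by year rather than at random keeps the test honest for this kind of data, since the goal is to predict Games the model has not seen.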

Comments • 92

  • @michaelmitchell155 10 months ago +1

    A very comprehensive and well-explained intro to the workings of the project. I got a lot out of it. Thank you.

  • @kumelachewmaru2225 1 year ago +5

    Love the simplicity of your step-by-step method. I am absorbing a lot in just one pass. Thank you and well done.

  • @Fakipo 1 year ago +21

    I had been studying the concepts for so long and getting overwhelmed. This video definitely helped me get the bigger picture.

  • @hiteshallakki1740 2 years ago +5

    Great video. Really liked the way you explained it first instead of diving straight into the code. Thanks.

  • @allahjoseph 9 months ago +2

    Thank you for providing such a great resource and making ML so digestible! YOU are who introduced me to machine learning, and I love it. I'm looking forward to applying everything I learn to my own projects!!!

  • @Prathmesh_salve 3 months ago +2

    First person I've seen who explains things just perfectly, in a way a student can understand. Thanks and keep it up, sir.

  • @pluderr3947 4 months ago +11

    @ 12:14
    teams.corr()["medals"]
    didn't work for me so I did
    corr = teams.drop(["team", "country"], axis=1).corr()["medals"]
    print(corr)
    for those who are also running into the same issues as me :)

    • @kayo5011 3 months ago

      It worked thanks

    • @user-kj6vz1qo4h 3 months ago +2

      Or you can use teams.corr(numeric_only=True)["medals"]

    • @rodo2220 2 months ago

      @user-kj6vz1qo4h Thank you!

    • @JoshuaStorm-zi1wy 1 month ago

      @user-kj6vz1qo4h Thanks!

    • @kavinesh4470 1 month ago

      Thanks🙏🙇
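
The two fixes suggested in this thread can be compared side by side. The likely reason the plain `teams.corr()` call worked in the video but fails for newer installs is that pandas 2.0 changed the default of `numeric_only` to `False`, so string columns are no longer skipped silently. The sketch below uses a tiny invented table in place of teams.csv; the `athletes` and `prev_medals` columns are assumed names.

```python
import pandas as pd

# Hypothetical miniature stand-in for teams.csv; the values are made up.
teams = pd.DataFrame({
    "team": ["AFG", "ALB", "USA", "FRA"],
    "country": ["Afghanistan", "Albania", "United States", "France"],
    "athletes": [5, 9, 600, 400],
    "prev_medals": [0, 0, 100, 40],
    "medals": [1, 0, 110, 42],
})

# Option 1: ask pandas to skip the non-numeric columns.
corr_numeric_only = teams.corr(numeric_only=True)["medals"]

# Option 2: drop the string columns explicitly before correlating.
corr_dropped = teams.drop(columns=["team", "country"]).corr()["medals"]

print(corr_numeric_only)
```

Both options give the same correlations here. Option 1 has the advantage of not needing a list of column names that must be kept up to date.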

  • @sm-pz8er 5 months ago

    Perfect. Best video I’ve found so far: precious and easy to understand. Thank you.

  • @maxivy 1 year ago +2

    You are a very good teacher and deserve more subs.

  • @user-wp6lj6xl5z 11 months ago +1

    excellent video sir ji.... thanks a lot for such concepts... and your English fluency is amazing Indian

  • @bumohamed624 1 year ago +1

    Thanks a lot, it helps to understand ML with basic steps.

  • @DEDE-ix9lg 10 months ago

    Amazing. This was simple and great. Very, very, very well done!

  • @josearmandovivero408 1 year ago

    Thanks! This video is exactly what I needed 😀

  • @sushantshankar8477 3 months ago

    Loved it! Superb explanation 😍

  • @chessconfused6528 1 year ago

    Your courses are awesome!!!

  • @scarlettran- 8 months ago

    Thank you so much!!! You are a really good teacher.

  • @rosemaryonondje7953 2 years ago

    A great video!
    This answered some of my questions. Thanks

  • @TT-oy8bq 3 months ago

    Incredible teaching!

  • @willII0522 1 year ago +1

    I like how you showed to use the later data to test the model, but do you have a video that shows how to use the data to predict the future Olympics?

  • @viewpoint8976 12 days ago

    This video really shows how things are done.

  • @ranahuzaifa147 10 months ago

    Thank you for the video.

  • @IsoAktiv 11 months ago +2

    Albania was in the Olympics in 1992, and I guess the other countries in that CSV were too. They just did not win any medals, which is why there are missing values. So setting them to zero instead of dropping them is actually more accurate. In theory you would prefer first- or second-party data; in this case you would have to do some research to clarify the reason for the missing values in the data set.
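
The difference between dropping incomplete rows and filling the gaps with zero can be sketched as follows. The rows are invented, and treating `prev_medals` as the column with gaps is an assumption; filling with zero is only appropriate if a missing value really does mean "no medals", which would need to be checked against the data's source.

```python
import pandas as pd

# Invented rows; None marks a missing previous-medal count.
teams = pd.DataFrame({
    "team": ["ALB", "FRA", "USA"],
    "year": [1992, 1992, 1992],
    "prev_medals": [None, 16.0, 94.0],
    "medals": [0, 29, 108],
})

# Approach 1: drop every row that has a missing value.
dropped = teams.dropna()

# Approach 2: keep the rows and treat a missing count as zero medals.
filled = teams.fillna({"prev_medals": 0})

print(len(dropped), len(filled))
```

Dropping loses the whole row, including its valid `medals` value, while filling keeps it at the cost of asserting something about the data that may not be true.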

  • @tajinjahan7446 1 year ago +1

    Hi, how do I sort the Excel data into integer values?

  • @crispineda4630 1 year ago

    Is there a reason why the plots disappear after running the code a second time in Jupyter Notebook? They don't show anything anymore.

  • @luqmanjuzaili5213 1 month ago +1

    Thank you for the amazing video! However, when I tried running this, I received a ValueError:
    teams.corr()["medals"]
    This seems to be because the "team" and "country" columns hold strings, which makes it impossible to compute a correlation for them. So I removed them just to obtain the corr values. But it seems to work for you without filtering out the string columns. Any ideas why?

  • @sabuein 1 year ago

    Thank you.

  • @ShivendraParmar-dp3rp 26 days ago

    Guys, why does the shape change after test['predictions'] = predictions? Instead of 405*8 it comes out as 405*413. Can anyone help me out with it?

  • @muradbayr9900 1 month ago

    Please help me, guys. I got stuck on the first step: I cannot import the CSV, some kind of problem with pandas.

  • @rayr268 5 months ago

    Would love a math course that relates directly to ML and that I can take to get up to speed, for someone who is self-taught in tech with only a high school education.

  • @deepakkumaracid4529 4 months ago

    Where can I get the data?

  • @hanazhafirahhanifah8175 1 year ago +1

    Thank you for such a nice video! I have a question, though, about the error_ratio. You said countries like FRA, CAN, and RUS get a lot of medals in the Olympics, and it showed that their error ratio is low.
    What should I compare the value of error_ratio with?

    • @1622roma 1 year ago

      What a good question! I hope he responds to you.

    • @yayasssamminna 1 year ago

      Why do you want to compare it?
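
One way to read an error ratio is as the typical prediction error divided by the typical medal count for a team, so it can be compared against 1.0 (error as large as the medal count itself) and between teams. The sketch below shows one possible way to compute it under that reading; the numbers are invented and the exact calculation used in the video may differ.

```python
import pandas as pd

# Invented test-set results; "predictions" would come from the fitted model.
test = pd.DataFrame({
    "team": ["FRA", "FRA", "ALB", "ALB"],
    "medals": [35, 42, 1, 0],
    "predictions": [38.0, 39.0, 0.0, 2.0],
})

errors = (test["medals"] - test["predictions"]).abs()

# Average absolute error and average medal count for each team.
error_by_team = errors.groupby(test["team"]).mean()
medals_by_team = test["medals"].groupby(test["team"]).mean()

# Ratio of typical error to typical medal count; lower means a tighter fit.
error_ratio = error_by_team / medals_by_team
print(error_ratio)
```

In this toy data the team with many medals gets a small ratio even though its absolute error is larger, which matches the pattern described in the question: a miss of a few medals matters little for a team that wins dozens.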

  • @user-lw8zw5lq8l 10 months ago +2

    Sir, that was really simple, very well explained, and excellently organised. Yet I struggled at one point: I couldn't convert string (teams) to float while computing the correlation. If you see this, I hope you reply.

    • @SiddheshRajale 9 months ago

      Did you find the solution?

    • @darrentan271 4 months ago

      @SiddheshRajale df.corr(numeric_only=True)

  • @user-cb6dm1qd4v 8 months ago

    Great video! What coding software did you end up using for this? (I haven't seen this Python software before, which is why I ask.)

  • @raanonyms7926 6 months ago

    My like turned this to 2K 😊

  • @nagrotte 7 months ago

    best

  • @haythamroshdy4189 1 year ago +2

    I love your English
    Your English is so perfect as indian

  • @prashanthbabu1397 1 year ago +2

    Hi, I really loved your video. I was trying to follow along, but got an error and can't move forward. I would love it if you could help me fix it. I got an error for predictions = reg.predict(test[predictors]). It kept saying: ValueError: The feature names should match those that were passed during fit.
    Feature names unseen at fit time:
    - age
    - country
    - medals
    - team
    - year
    What do I do?

    • @moyinoluwaanoma 9 months ago

      Hello,
      Did you get this resolved yet? Having the same issue now.
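
A likely cause of this error is that the columns passed to `predict` are not the same ones passed to `fit`, for example because the whole `test` frame was passed or `predictors` was redefined between the two calls. The sketch below reproduces that situation on invented numeric rows; `athletes` and `prev_medals` are assumed feature names, and the exact error message depends on the scikit-learn version.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Invented numeric rows; "athletes" and "prev_medals" are assumed column names.
train = pd.DataFrame({
    "year": [2004, 2004, 2008, 2008],
    "athletes": [530, 7, 590, 11],
    "prev_medals": [97, 0, 101, 0],
    "medals": [101, 0, 112, 0],
})
test = pd.DataFrame({
    "year": [2012, 2012],
    "athletes": [530, 10],
    "prev_medals": [112, 0],
    "medals": [104, 0],
})

predictors = ["athletes", "prev_medals"]

reg = LinearRegression()
reg.fit(train[predictors], train["medals"])

# Passing the whole frame hands the model columns it never saw during fit.
try:
    reg.predict(test)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Selecting the same columns used in fit() avoids the mismatch.
predictions = reg.predict(test[predictors])
```

Defining `predictors` once and reusing it for both `fit` and `predict` keeps the two calls in step; in a notebook, re-running the cell that defines it before predicting rules out a stale value.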

  • @raja.57 1 year ago +1

    It would be good practice to use x_test, y_test, x_train, y_train instead of predictors, target,
    and it would also be good practice to use x and y for the independent and dependent variables instead of test, and so on.

  • @bumohamed624 1 year ago +1

    It gives an error when I run the correlation step, complaining about the data type of team. How can I handle this?

    • @Mynamegeoph 1 year ago

      I have this too, were you able to fix it?

    • @lalithsai5392 1 year ago +12

      @Mynamegeoph Use this: teams[teams.columns[2:]].corr()["medals"]

    • @noisysod7330 11 months ago

      @lalithsai5392 Thanks lalithsai5392, would have been stuck without you!

    • @gmfPimp 9 months ago

      I bet this is an issue with doing it locally and not using a Jupyter Notebook because I had this problem as well.
      The best way around this is:
      teams.corr(numeric_only=True)["medals"]
      That computes correlations only for the numeric columns.

    • @allahjoseph 9 months ago

      Code community!! @lalithsai5392

  • @paaviethranjayabalan6735 8 months ago

    Why seaborn and not matplotlib?

    • @bhu0091 5 months ago +1

      You can use whatever you like, it's all about experimenting ;)

  • @baeche 8 months ago

    Great video. What Python interpreter are you using?

    • @eduardtoronto 8 months ago

      Maybe you meant IDE (integrated development environment)? Python's standard interpreter, CPython, is built in, and it compiles and interprets the code. I'm pretty sure the IDE he is using in the video is Project Jupyter (an interactive development environment), which is pretty much a standard environment in machine learning, data analytics, statistical analysis, etc.

    • @baeche 8 months ago

      @eduardtoronto Sorry, of course I meant IDE. What differs from mine (PyCharm) is that the code gets executed immediately and the results are shown. I have to use the print command for that. Or is the video just edited?

    • @eduardtoronto 8 months ago

      @baeche In Jupyter, ENTER inserts a new line and SHIFT+ENTER executes the code. Everything gets executed immediately. It depends on the functions he's using, e.g. copy() gets executed but won't print any output, whereas something like 'shape' will output the result to the console like print.

    • @baeche 8 months ago

      @eduardtoronto Thank you very much. I moved to Google Colab, where the last command gets printed too. I find Google Colab handy as I can work in the browser. Where I experience problems is accessing a SQL Server (not SQLite, MySQL). Any idea where I can look for help? ChatGPT could not.

  • @PRO-to7il 1 day ago

    25:45

  • @ShortLessonsDaily 2 months ago

    Do it in Telugu, bro.

  • @praskatti 8 months ago +2

    Great video. Thanks for sharing your knowledge and expertise. I ran into an issue in the "corr()" step.
    teams.corr()["medals"]
    ValueError: could not convert string to float: 'AFG'. Maybe I can remove this column before making the corr() call.

    • @h4ytham268 5 months ago

      I had the same issue. What did you do to solve it?

    • @hongyangtan9897 5 months ago +1

      teams.drop(["country", "team"], axis=1).corr()["medals"]
      This code works.

    • @abhijeet800 5 months ago +3

      Add numeric_only=True to the call: teams.corr(numeric_only=True)["medals"]

    • @user-ew4jp1fk3p 4 months ago

      @hongyangtan9897 Thank you.

  • @daudisraf5564 3 months ago +1

    There seems to be a problem when I run teams.corr()["medals"]. It keeps throwing the error "ValueError: could not convert string to float: 'AFG'". I checked the unique values and NaNs. Confused!

    • @liewkangzhen157 1 month ago +2

      I faced the same problem as well, but managed to solve it. The error is due to some columns in teams being non-numeric, like team and country, so I created a new table, i.e. teams = teams.drop(columns=['team', 'country']), and it should work. Hope this helps.

  • @vanshikatripathi2579 4 months ago

    teams=pd.read_csv("teams.csv")
    This line is giving me a huge error.
    How do I correct it, or what did I do wrong?

    • @user-ew4jp1fk3p 4 months ago

      Go to the file you downloaded, copy its path, and put that in place of teams.csv.
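
The question does not say which error was raised, but if it was a `FileNotFoundError`, the usual cause is that `teams.csv` is not in the folder Python is running from. Below is a minimal sketch of a loader that reports where it looked; `load_teams` is a hypothetical helper name, not part of pandas or the project.

```python
from pathlib import Path

import pandas as pd


def load_teams(csv_path):
    """Read the CSV, raising a clearer error if the file is not where we expect."""
    path = Path(csv_path)
    if not path.is_file():
        # Report the working directory so a wrong relative path is easy to spot.
        raise FileNotFoundError(
            f"{path} not found; current working directory is {Path.cwd()}"
        )
    return pd.read_csv(path)
```

Calling `load_teams("teams.csv")` works when the file sits next to the notebook or script; otherwise the full path to the downloaded file can be passed instead.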

  • @user-jg1bk9sd4r 5 months ago

    I like how you showed to use the later data to test the model, but do you have a video that shows how to use the data to predict the future Olympics?