Predict NBA Games With Python And Machine Learning

  • Published: Jun 7, 2024
  • We'll predict the winners of basketball games in the NBA using Python. We'll start by reading in box score data that we scraped in the last video. If you didn't watch the last video, you can still download the file (link below) and follow along.
    We'll do feature selection to identify good predictors, and train a machine learning model to make predictions. We'll end by computing rolling predictors and improving the model. We'll discuss how you can keep improving the model and predict future games.
    Links
    Full code and description of the project - github.com/dataquestio/projec...
    Dataset if you missed the previous video - drive.google.com/uc?export=do...
    Previous video where we did web scraping - • Web Scraping NBA Games...
    Chapters
    00:00 Introduction
    01:00 Reading in box score data
    06:10 Preparing data for machine learning
    16:10 Selecting the best features for the model
    25:31 Creating a baseline model
    36:06 Improving performance with rolling averages
    41:54 Add in opponent information
    51:11 Train a more accurate model
    55:08 Improving the model and making future predictions
    ---------------------------------
    Join 1M+ Dataquest learners today!
    Master data skills and change your life.
    Sign up for free: bit.ly/3O8MDef

Comments • 75

  • @alexandermackenzie5891
    @alexandermackenzie5891 1 year ago +5

    Really enjoyed this! I'd love to see another video on how to predict future games. Thank you for the tutorial

  • @Qhorin
    @Qhorin 1 year ago +5

    I would also love to see actually predicting future games. Thanks for the content!

  • @ByBraiiaN
    @ByBraiiaN 1 year ago +4

    Wow. Would love to see one on predicting future games. Great video.

  • @mizew9149
    @mizew9149 1 year ago +33

    Would love to see one on predicting future games. Great video. Very well done

  • @matthewmoore8445
    @matthewmoore8445 1 year ago

    GREAT video! I will be coding this and implementing it in my personal work. I would love to see a future video on how you go about predicting future games. I would also love to see something just like this for player performance, at a game-by-game level.

  • @pelumiadeleke-ademola2813
    @pelumiadeleke-ademola2813 1 year ago +2

    another great vid, would love one on future predictions

  • @beraviousargentino9865
    @beraviousargentino9865 1 year ago +1

    Same for future predictors!!! THX FOR THIS!

  • @carlossoto1466
    @carlossoto1466 1 year ago +3

    I would love to see the future games, please. I enjoy these videos and they help me learn.

  • @mikekennedy7073
    @mikekennedy7073 1 year ago +1

    Great walkthrough. Would love to see how you update the next values for home and away teams.

  • @artyom6230
    @artyom6230 1 year ago +4

    It would be good if you can put a more detailed guide on Dataquest, to include predicting future matches using rolling averages etc. Would happily sign up just for that!

  • @pandithammultilingualcompu1552
    @pandithammultilingualcompu1552 6 months ago

    Awesome explanation, I used this for my class, thank you

  • @tyler_russell
    @tyler_russell 8 months ago

    Really great video. I learned some good ways to use list comprehensions in pandas to help with column names, on top of the scikit-learn fits. Thanks for this.

  • @meechmiliyan8965
    @meechmiliyan8965 1 year ago

    Completed the first video, super awesome, thank you!!! Does this video help with grabbing player stats and using average REB, PTS, AST, etc. to predict stats vs. opponents?

  • @nikolaytoporkov3537
    @nikolaytoporkov3537 1 year ago

    simply awesome, thank you

  • @hodlsportclub
    @hodlsportclub 1 year ago

    Great tutorial 👌🏾 By any chance, did you make the video on how to update the model?

  • @pauld428
    @pauld428 3 months ago

    Great video. Please make one about predicting future games.

  • @haydnwebtech
    @haydnwebtech 5 months ago

    Your videos are brilliant! Horse racing would be an interesting project, using machine learning to predict which horse should win based on the stats for each runner in the race?

  • @greenfootprint2680
    @greenfootprint2680 5 months ago

    Amazing channel, mate! Are you able to demo how to deploy ML models into production and what we could use to fully automate this end to end? Preferably with systems/platforms that are free to use.

  • @jamesmostofi2420
    @jamesmostofi2420 1 year ago +1

    Please do a predictive video for future games 🙏🏻

  • @dylanhaynes275
    @dylanhaynes275 1 year ago

    Great video, would be great if you could do one but that predicts total points scored, not necessarily in basketball.

  • @Dirty69
    @Dirty69 9 months ago

    Great Tutorial!!!!

  • @Cobbtrades
    @Cobbtrades 6 months ago

    When are you doing the one to predict future games, i.e. rows with a value of 2 in the target column? Thanks

  • @TheDruss16
    @TheDruss16 4 months ago +1

    Excellent video that shows you how to use machine learning to identify the correlated factors that determine the outcome using previous games, but is a little misleading because it doesn't actually show you how to predict outcomes of future games. Would love to know where I can find this information, even if I have to pay for it.

  • @norgen4
    @norgen4 1 year ago +2

    Please do one for future games!!

  • @CRKHB
    @CRKHB 1 year ago +1

    Hi, how could I attach the season to the predictions to see how well the model did for each individual season?

  • @ConsistentEV
    @ConsistentEV 3 months ago

    Hi there, just curious: what would you say are the main things to look for when predicting games?

  • @albertlarbi6231
    @albertlarbi6231 1 year ago

    This was a great video but I would be happy if you would do one for the prediction of future games

  • @adamkrasowski9181
    @adamkrasowski9181 1 year ago

    Hey,
    How long would it take to run SequentialFeatureSelector with the same parameters, but using RandomForestClassifier or XGBoost as the model? A couple of hours, or days?

  • @Leon-nc3xk
    @Leon-nc3xk 1 year ago

    Hello, what is the algorithm used by the model and where could I get information on the logic behind the algorithm used by the model?? Thank you.

  • @tenienteale
    @tenienteale 1 year ago

    Here I am, waiting for the video on predicting future games... maybe someday it will come

  • @usernameispassword4023
    @usernameispassword4023 1 year ago

    Hi, I'm curious as to why this only results in a 64% accuracy.
    For example, something as simple as comparing the records of the teams at the time they've played and predicting the one with higher win% to win would result in around a 68% accuracy for the 2021-22 season.
    Is this due to ridge classification?
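
The baseline described in this comment can be sketched in a few lines: before each game, compare each team's win percentage so far and pick the team with the higher one. The game list and the `baseline_accuracy` helper below are made-up illustrations, not the video's data or code, and skipping games where the records are tied or missing is an arbitrary choice.

```python
from collections import defaultdict


def baseline_accuracy(games):
    """games: chronological list of (home, away, home_won) tuples (hypothetical format)."""
    wins = defaultdict(int)
    played = defaultdict(int)
    correct = 0
    predicted = 0
    for home, away, home_won in games:
        # Win percentage BEFORE this game, so the result itself is never used.
        home_pct = wins[home] / played[home] if played[home] else None
        away_pct = wins[away] / played[away] if played[away] else None
        if home_pct is not None and away_pct is not None and home_pct != away_pct:
            predicted += 1
            pick_home = home_pct > away_pct
            correct += int(pick_home == home_won)
        # Update the records only after the prediction has been made.
        played[home] += 1
        played[away] += 1
        wins[home if home_won else away] += 1
    return correct / predicted if predicted else None


# Toy schedule, invented for illustration only.
games = [
    ("A", "B", True),
    ("B", "C", True),
    ("A", "C", True),
    ("A", "B", True),
    ("C", "A", False),
    ("C", "B", True),
]
print(baseline_accuracy(games))
```

Running the same loop over a real season would give a reference accuracy to compare against the model's score; whether it reproduces the 68% quoted above depends on the data and on how ties and early-season games are handled.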

  • @jordanw4822
    @jordanw4822 1 year ago

    Please make a video on how to predict future games!!🙏🙏

  • @eleftherias.3065
    @eleftherias.3065 1 year ago

    Hi. I have two questions.
    a) Where did you find the data to use for your test?
    b) How easy is it for someone who doesn't know programming to learn Python?

  • @brianbutler6672
    @brianbutler6672 3 months ago

    I've been working with this code for about three weeks now and I have successfully scraped all of the player stats too and want to somehow add a 'lineup' feature that looks at the MP of each player and how productive they tend to be to further improve the model. Any chance you would be willing to help me with that?

  • @nishchay89
    @nishchay89 1 year ago

    Hi!
    Why did we use ridge classification?

  • @williamrowe2296
    @williamrowe2296 1 year ago

    How would I filter out rows of games that were in the playoffs so I just have regular season games in the dataframe?

  • @rjvaughn
    @rjvaughn 1 year ago

    When computing the rolling averages, why did you not use the closed='left' parameter, like you did in your football predictor video? Don't your rolling averages include knowledge of the current game you are predicting?

  • @ultieme007
    @ultieme007 1 month ago

    At 38:48 I get an error I cannot resolve when running the find_team_averages function for the last 10 games. Could it have something to do with the FutureWarning shown in the video?

  • @shukkkursabzaliev1730
    @shukkkursabzaliev1730 1 year ago +1

    Hey! Great video (complicated too, gotta watch second time) I would personally benefit very much from a video on how to use this for future matches, pleaseee!

  • @Nerfgunninja
    @Nerfgunninja 8 months ago

    stay strong, Coulibaly is going to be a star

  • @PjFlipStudio
    @PjFlipStudio 4 months ago

    please make video on how to predict future games

  • @AbrarMuhtasim
    @AbrarMuhtasim 1 year ago

    'Customer segmentation in retail using machine learning' please make a video on this topic using real dataset.😥😥🙏🙏

  • @coconutnut21
    @coconutnut21 5 months ago

    would love to see total score predictor sir.

  • @nishchay89
    @nishchay89 1 year ago

    Why do we need player stats which have max in front of them? What is the purpose of max stats ? Can anyone help clarify please?

  • @Cris_the_coder
    @Cris_the_coder 1 year ago +2

    Give this man more views so we get another part!!!!

  • @isi6402
    @isi6402 2 months ago

    Hi sir,
    Where can I deploy this project?

  • @marcyoussef3313
    @marcyoussef3313 1 year ago

    Hello, could you please show how we can select two teams and have the AI choose who wins? Please write the code in the reply.

  • @haimanottiruneh2491
    @haimanottiruneh2491 1 year ago +2

    How did you decide to choose the ridge classifier?

    • @qpe04
      @qpe04 1 year ago

      I wonder why a ridge classifier is used rather than logistic regression in the SequentialFeatureSelector.

  • @Philgob
    @Philgob 1 year ago

    I need some clarification here... The data in the training and test sets contain the points scored by each team, so how can the model not predict exactly whether the game is a win or a loss? It literally just has to check if the team has more points and return true if it does... I am confused

    • @Dataquestio
      @Dataquestio 1 year ago +1

      We're predicting the winner of the next game. The algorithm doesn't know what happened in the next game when it is making predictions.

    • @Philgob
      @Philgob 1 year ago

      @Dataquestio facepalm
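
A minimal sketch of what "predicting the winner of the next game" can look like in pandas, assuming a frame with `team` and `won` columns like those in the code quoted elsewhere in these comments. Each row keeps its own box score, but its label comes from the same team's following game. The toy data is invented, and using 2 as a placeholder for games with no known next result follows the "value of 2 in the target column" mentioned by another commenter.

```python
import pandas as pd

# Toy box scores, invented for illustration; rows are in chronological order per team.
df = pd.DataFrame({
    "team": ["BOS", "BOS", "BOS", "LAL", "LAL", "LAL"],
    "pts": [110, 98, 120, 101, 115, 99],
    "won": [True, False, True, False, True, True],
})

# For every row, look one game ahead within the same team.
df["target"] = df["won"].astype(int).groupby(df["team"]).shift(-1)

# The last game of each team has no next game yet, so it gets the placeholder 2.
df["target"] = df["target"].fillna(2).astype(int)

print(df)
```

With this layout the points in a row belong to a game that has already been played, while the label describes a game that has not, so the model cannot simply read the winner off the score.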

  • @AlphaDoggs
    @AlphaDoggs 1 year ago +1

    predict future nfl games please

  • @bhavyamehra6931
    @bhavyamehra6931 9 months ago

    Doesn't rolling(10) include the current game in the rolling average? Wouldn't that be leakage?

    • @SMK3211
      @SMK3211 1 month ago

      No, because you always predict the next game.
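
To make the difference concrete, here is a small sketch (toy numbers, window of 3 instead of 10) comparing the default rolling window, which includes the current row, with `closed="left"`, which stops one row earlier. Which one is appropriate depends on the label: if the label is the result of the next game, as the reply above says, including the current game is fine; if the label were the current game's own result, the `closed="left"` variant would be the safer choice.

```python
import pandas as pd

# Points scored in five consecutive games (made-up values).
pts = pd.Series([100, 110, 120, 130, 140], name="pts")

# Default window: the value at row i averages rows i-2..i, including game i itself.
incl_current = pts.rolling(3).mean()

# closed="left": the value at row i averages rows i-3..i-1, excluding game i.
excl_current = pts.rolling(3, closed="left").mean()

print(pd.DataFrame({
    "pts": pts,
    "incl_current": incl_current,
    "excl_current": excl_current,
}))
```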

  • @predictoredge_live
    @predictoredge_live 4 months ago

    Excellent video for learning, but it doesn't actually show how you can predict future games. Future games do not have all the box score stats, which makes it difficult to project future outcomes based on what this video demonstrates. Some help or an additional video showing how to actually use this to predict future games (or games that have not yet occurred) would be much appreciated.

  • @kunal6353
    @kunal6353 1 year ago

    Which IDE are you using?

  • @TjSpoonManJacques
    @TjSpoonManJacques 1 year ago

    INSTANT FOLLOW!!!!

  • @kfaslus
    @kfaslus 5 months ago

    Hello brother, can you help me with this code, which is generating the following error?
    Code:
    df_rolling = df[list(selected_columns) + ["won", "team", "season"]]
    def find_team_averages(team):
        rolling = team.rolling(10).mean()
        return rolling
    df_rolling = df_rolling.groupby(["team", "season"], group_keys=False).apply(find_team_averages)
    Error:
    DataError: Cannot aggregate non-numeric type: object

    • @SMK3211
      @SMK3211 5 months ago

      Did you manage to solve that?

    • @bena.9440
      @bena.9440 5 months ago +2

      I believe you need to change that line to rolling = team[selected_columns].rolling(10).mean()

    • @AIMadesy
      @AIMadesy 4 months ago

      @bena.9440 this works

    •  2 months ago

      @bena.9440 Yes it is
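
The fix suggested in this thread can be sketched on a toy frame as follows. The data, the window of 2 (instead of 10), and the contents of `selected_columns` are stand-ins invented for illustration. The key idea is that only numeric feature columns reach `.rolling(...).mean()`, so the string `team` column is never averaged. Selecting the columns on the groupby as well is an extra precaution added here, since recent pandas versions warn when `apply` operates on the grouping columns.

```python
import pandas as pd

# Toy data; column names mirror those quoted in the comments above.
df = pd.DataFrame({
    "team": ["BOS", "BOS", "BOS", "LAL", "LAL", "LAL"],
    "season": [2022] * 6,
    "won": [True, False, True, False, True, True],
    "pts": [110.0, 98.0, 120.0, 101.0, 115.0, 99.0],
    "ast": [25.0, 20.0, 30.0, 22.0, 28.0, 18.0],
})
selected_columns = ["pts", "ast"]  # stand-in for the selected feature names


def find_team_averages(team):
    # Average only the numeric feature columns, as suggested in the reply above.
    return team[selected_columns].rolling(2).mean()


df_rolling = (
    df.groupby(["team", "season"], group_keys=False)[selected_columns]
    .apply(find_team_averages)
)
print(df_rolling)
```

The first row of each team/season group is NaN because the window is not yet full; the averages restart for every group rather than running across teams.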

  • @ramfanintexas
    @ramfanintexas 1 year ago

    Does this for loop need to be updated for my PC?
    for url in standings_pages:
        save_path = os.path.join(STANDINGS_DIR, url.split("/")[-1])
        if os.path.exists(save_path):
            continue

  • @EzraSchroeder
    @EzraSchroeder 1 year ago +1

    We would really like to see a video on predicting future games in the NBA. Even though this would be a horrible use of PyTorch, I would like to see it done with PyTorch, as well as a wide variety of other machine learning models & technologies (sklearn, etc.). It would also be nice to see some work with regards to this done on Kaggle as well, for example using NBA datasets as well as NCAA datasets.

  • @kadbed
    @kadbed 5 months ago +1

    Is anyone else facing a 'Cannot aggregate non-numeric type: object' error while trying to run this?
    df_rolling = df_rolling.groupby(["team", "season"], group_keys=False).apply(find_team_averages)

    • @kfaslus
      @kfaslus 5 months ago +1

      I have the same problem with that error.

    • @akashgahlaut4078
      @akashgahlaut4078 3 months ago

      I have solved this issue.

    • @DaniloKacanski875
      @DaniloKacanski875 3 months ago

      @akashgahlaut4078 How did you solve that?

  • @izchak333
    @izchak333 1 year ago

    The full set of del statements at the beginning should look like this:
    del df['index_opp']
    del df['mp.1']
    del df['mp_opp.1']
    del df['mp_max_opp.1']
    del df['mp_max.1']

  • @mkzzzzzzzzzz1
    @mkzzzzzzzzzz1 1 year ago +1

    Unrelated, but the previous nba score scraper took like 3 days to scrape 2016-2022. OH MY DAYS.

    • @Dataquestio
      @Dataquestio 1 year ago +2

      Yeah, it has to scrape a lot of records (8500), and there is a time.sleep in the loop. Each record should take about 6 seconds to download. There's also a small chance that it will time out after 30 seconds of trying and need to retry. We can guesstimate the runtime with (8500 * 6 + 8500 * .05 * 60) / 3600 = 21.25, so it should take about 21 hours to run.
      You could try reducing the sleep time and timeout times for playwright, but there is a risk of getting banned by the server.
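
The back-of-the-envelope estimate in this reply can be reproduced directly. The variable names are labels added here for readability; reading `.05` as the share of requests that time out and `60` as the extra seconds charged per timeout is an interpretation of the formula, not something the reply spells out.

```python
records = 8500           # box scores to download
seconds_per_record = 6   # typical time per record, per the reply above
retry_rate = 0.05        # assumed meaning of the .05 in the formula
retry_penalty = 60       # assumed meaning of the 60 in the formula (seconds)

total_seconds = records * seconds_per_record + records * retry_rate * retry_penalty
hours = total_seconds / 3600

print(f"Estimated runtime: {hours:.2f} hours")  # about 21.25 hours
```

Changing `seconds_per_record` shows how strongly the sleep time drives the total, which is the trade-off against the risk of being banned that the reply mentions.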

    • @hodlsportclub
      @hodlsportclub 1 year ago

      How do I add this new season, from October to now, to the date selection line?