Time Series Forecasting with XGBoost - Use python and machine learning to predict energy consumption

  • Published: May 16, 2024
  • In this video tutorial we walk through a time series forecasting example in Python, using the XGBoost machine learning model to predict energy consumption. We walk through the project in a Kaggle notebook (linked below) that you can copy and explore while watching.
    Notebook used in this video: www.kaggle.com/code/robikscub...
    Timeline:
    00:00 Intro
    03:15 Data prep
    08:24 Feature creation
    12:05 Model
    15:35 Feature Importance
    17:33 Forecast
    Follow me on twitch for live coding streams: / medallionstallion_
    My other videos:
    Speed Up Your Pandas Code: • Make Your Pandas Code ...
    Speed up Pandas Code: • Make Your Pandas Code ...
    Intro to Pandas video: • A Gentle Introduction ...
    Exploratory Data Analysis Video: • Exploratory Data Analy...
    Working with Audio data in Python: • Audio Data Processing ...
    Efficient Pandas Dataframes: • Speed Up Your Pandas D...
    * YouTube: youtube.com/@robmulla?sub_con...
    * Discord: / discord
    * Twitch: / medallionstallion_
    * Twitter: / rob_mulla
    * Kaggle: www.kaggle.com/robikscube
    #xgboost #python #machinelearning
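A rough sketch of what the "Feature creation" step in the timeline might look like. The function name `create_features`, the exact feature list, and the synthetic demo frame are assumptions for illustration and are not taken from the notebook; only the `PJME_MW` column name is mentioned in the comments.

```python
import pandas as pd


def create_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with calendar features derived from its DatetimeIndex."""
    df = df.copy()
    df["hour"] = df.index.hour
    df["dayofweek"] = df.index.dayofweek  # Monday=0 ... Sunday=6
    df["quarter"] = df.index.quarter
    df["month"] = df.index.month
    df["year"] = df.index.year
    df["dayofyear"] = df.index.dayofyear
    return df


# Synthetic hourly data standing in for the real energy consumption series.
index = pd.date_range("2015-01-01", periods=48, freq=pd.Timedelta(hours=1))
demo = create_features(pd.DataFrame({"PJME_MW": range(48)}, index=index))
print(demo.head())
```

Features like these let a tree-based model pick up daily, weekly, and yearly patterns without being given past target values.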

Comments • 401

  • @casperj4784
    @casperj4784 1 year ago +95

    A comprehensive yet succinct tutorial. And, having only just finished my Data Science degree, I found it very reassuring to see that you do get faster and more proficient with time.

    • @robmulla
      @robmulla  1 year ago +14

      I absolutely love messages like this. Glad to hear you found this helpful and that it gave you the reassurance that things get faster. I can tell you that they do! The goal of my channel is to "spark curiosity in data science." I hope this video did that for you.

    • @RaviKumar-uf3eo
      @RaviKumar-uf3eo 1 year ago

      Yes. It is very reassuring, but most probably he would have kept all the things ready.

    • @amirghorbani7922
      @amirghorbani7922 5 months ago

      It is better to use icdst Ai predict lstm model.

  • @karishmakapoor4285
    @karishmakapoor4285 1 year ago +5

    Amazing flow, comprehensive yet smooth. Detailed yet generic. I love the way you think and how you flow across the entire process. I did this project myself and thoroughly enjoyed it. Can't wait to apply this to other datasets. A big thumbs up👍

  • @naderbazyari2
    @naderbazyari2 1 month ago +1

    Second time watching this and doing every step on my notebook as Rob goes through the task. I am still blown away by the intricacy of his approach and how he investigates the case. fascinating how he makes it look effortless. Many thanks

  • @sevenaac4783
    @sevenaac4783 26 days ago +1

    Thank you for teaching me. It allows me to understand the time series XGBoost in the shortest time.

  • @musicplace9205
    @musicplace9205 4 months ago

    Thanks! One of the best videos I've ever seen. Simple, clear, and above all it explains what each concept is used for.

  • @flel2514
    @flel2514 1 year ago +6

    Hi Rob, I am a fresh data science graduate, and I find this tutorial very well done and very helpful for those that approach TS for the first time as well as for those that want to refresh the topic

  • @rodolfoviegas8504
    @rodolfoviegas8504 10 months ago

    Amazing. We've learnt time series prediction only through statistical methods and/or by making ML models act like ARIMA, creating lags to feed them. This approach is very interesting and intuitive. Thanks, Rob

  • @ADaBaker95
    @ADaBaker95 7 months ago

    Best video on the subject I've found so far!

  • @jelc
    @jelc 1 year ago +3

    Really well focused and clearly explained. Love your work!

    • @robmulla
      @robmulla  1 year ago

      I appreciate the feedback Julian

  • @beckynevin1
    @beckynevin1 2 months ago

    Wow! I'm trying to get up to speed on XGBoost, so I clicked on this video. There are a lot of meh data science tutorials out there, so it was such a treat to come across this one after slogging through youtube. I immediately subscribed and am headed to your channel to watch more videos on time series prediction!

  • @fudgenuggets405
    @fudgenuggets405 1 year ago

    I like this dude's videos. They are informative and to the point.

  • @Singularitarian
    @Singularitarian 1 year ago

    Very illuminating! Learned a whole lot in just 23 minutes.

  • @troy_neilson
    @troy_neilson 1 year ago

    Informative and well-structured. Thanks!

  • @MilChamp1
    @MilChamp1 1 year ago +32

    This was a very nice introduction to this topic. You might consider turning this into a miniseries, since it's such a large topic; the next video might be on how to create the best cross-validation splits for timeseries

    • @robmulla
      @robmulla  1 year ago +9

      Thanks so much. There is so much to cover with time series. I may consider a miniseries; that's a great idea. I'd like to make one on Prophet, which is a great package for time series forecasting too.

  • @22niloc
    @22niloc 9 months ago

    I'm getting to know Time Series and your vid has loads of great starter points.

  • @TrueTalenta
    @TrueTalenta 9 months ago

    I am new to time series and this is by far very informative and quite succinct!

  • @69nukeee
    @69nukeee 6 months ago

    Such an amazing video, thank you Rob and keep 'em coming! ;)

  • @hussamcheema
    @hussamcheema 1 year ago +3

    I love your content. Liked the video before watching it because I know this is gonna be a great tutorial.
    Thanks for making these tutorials. 😊

    • @robmulla
      @robmulla  1 year ago +1

      Thanks! Glad you find it helpful.

  • @JacksonWelch
    @JacksonWelch 1 year ago +12

    Love these videos. As a data engineer I love seeing other people's workflows. Thanks so much for posting.

    • @robmulla
      @robmulla  1 year ago +1

      Glad you liked it. Thanks for watching Jackson.

  • @sandyattcl
    @sandyattcl 1 year ago +3

    what an amazing tutorial! I just had to give a thumbs up even before finishing the video.

    • @robmulla
      @robmulla  1 year ago

      Really appreciate that Sandeep. Please share the link with anyone else you think might also like it.

  • @egermani
    @egermani 1 year ago +1

    Great content! Thanks a lot for the explanations, they are a great incentive to dive deeper into the subject.

    • @robmulla
      @robmulla  1 year ago

      Glad you think so! My hope is that short videos that explain a topic at a high level like this will spark curiosity in people so they dive deeper into the topic, just like you said.

  • @NotesandPens-ro9wx
    @NotesandPens-ro9wx 4 months ago

    Man, I am seeing this after a year and your teaching style is just hell .. now sub done and will follow you on other things :) for sure

  • @inovosystemssoftwarecompan6724
    @inovosystemssoftwarecompan6724 4 months ago

    short and potent, great fluid presentation !!

  • @a.h.s.3006
    @a.h.s.3006 1 year ago +24

    I worked with time series before, and this tutorial is very thorough and well made.
    Additional features you could think about are lag/window features, where you basically let the model cheat from the previous consumption, either by giving it a statistical summary of previous values, let's say the mean consumption within a window of 8 hours, or by outright giving it a previous value (lag), let's say the actual consumption 24 hours ago.
    This will greatly improve performance, because it helps the model follow the expected trend.

    • @robmulla
      @robmulla  1 year ago +5

      Thanks for the comment! Glad you enjoyed the video even though you already have experience with time series. You are 100% correct about the lag features. Check out part 2 where I go over this and a few other topics in detail.
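A minimal sketch of the two features described in this thread: the consumption 24 hours ago and the mean over the previous 8 hours. It assumes hourly, gap-free rows; the function name `add_lag_features`, the new column names, and the synthetic data are illustrative only.

```python
import pandas as pd


def add_lag_features(df: pd.DataFrame, target: str = "PJME_MW") -> pd.DataFrame:
    """Add a 24-hour lag and an 8-hour trailing mean of the target column."""
    df = df.copy()
    # Value observed 24 rows (hours) earlier.
    df["lag_24h"] = df[target].shift(24)
    # Mean of the previous 8 hours; shift(1) keeps the current value out of its own window.
    df["rolling_mean_8h"] = df[target].shift(1).rolling(window=8).mean()
    return df


# Synthetic hourly series: consumption simply counts up 0, 1, 2, ...
index = pd.date_range("2015-01-01", periods=72, freq=pd.Timedelta(hours=1))
demo = add_lag_features(pd.DataFrame({"PJME_MW": [float(i) for i in range(72)]}, index=index))
print(demo.iloc[22:27])
```

Note that a window over the previous 8 hours is only available at prediction time when forecasting about one hour ahead; longer horizons need longer lags.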

  • @PRATEEK30111989
    @PRATEEK30111989 3 months ago

    I have never seen a better data science video. You are a savant at this

  • @lolmatt9
    @lolmatt9 3 months ago

    Very well explained and useful. Thank you!

  • @azizbekurmonov6278
    @azizbekurmonov6278 10 months ago

    Thanks! Love your explanations.

  • @nirbhay_raghav
    @nirbhay_raghav 1 year ago +27

    Hands down, the bestest (if that is a word) video on the entire internet about implementation. No fancy stuff. No too-beginner toy examples. Just the right thing a budding data scientist needs to see. And it is definitely reassuring to see that one can really get better and faster at doing these after a while. It takes me a lot of time to reach what you have done in under 30 min. Debugging things takes a lot of time.

    • @robmulla
      @robmulla  1 year ago +1

      I really appreciate your positive feedback! Glad to hear you find it encouraging that eventually things will get faster.

  • @H99x2
    @H99x2 1 year ago +2

    Incredible content and explanation. You definitely have a knack for this. I subscribed for more videos like this! Thanks :)

    • @robmulla
      @robmulla  1 year ago +1

      Thanks for watching and the feedback!

  • @michaelmebratu2921
    @michaelmebratu2921 11 months ago +1

    What a quality tutorial! Thank you so much

    • @robmulla
      @robmulla  11 months ago

      Glad you learned something new!

  • @user-xr3bc4vn5t
    @user-xr3bc4vn5t 7 months ago

    You have helped me so much with this video, you don't even know!!! Thanks so much :)

  • @zhuoningli
    @zhuoningli 1 year ago +13

    Hi Rob! Your tutorials helped me get a job offer! When I was searching for a job, I received a take-home technical exercise about time series forecasting. I watched this video and finished my exercise. Finally, I got my dream job! Thank you so much!!! I really appreciate your tutorials! 🥰

    • @robmulla
      @robmulla  1 year ago +5

      Whoa, I really love hearing stories like this. That's amazing and I wish you the best in the rest of your career.

  • @adityaraikwar6069
    @adityaraikwar6069 8 months ago +5

    Being a sort of early intermediate data scientist myself, it's very cool watching him do all these things and the most amazing thing is how everybody's mind works differently and how proficient you become in not only coding but also in approach towards a problem. keep that up man

    • @paultvshow
      @paultvshow 5 months ago

      Hey, have you landed a job in data science field?

    • @digitalnomad2196
      @digitalnomad2196 3 months ago

      @paultvshow Also curious to know; recent data science graduate here.

  • @peralser
    @peralser 1 year ago +1

    Great Video ROB, Thanks for sharing with us!!

    • @robmulla
      @robmulla  1 year ago

      Thanks for watching!

  • @evandrogaio7003
    @evandrogaio7003 1 year ago +1

    Such an excellent video. Thanks for sharing!

  • @Arieleyo
    @Arieleyo 1 year ago +1

    Love your videos Rob!! cheers from Argentina ♥

    • @robmulla
      @robmulla  1 year ago

      Sending my ❤ back to Argentina. Thanks for watching!

  • @leo.y.comprendo
    @leo.y.comprendo 1 year ago +1

    This is incredible! Instantly subscribed!! Thanks for your knowledge

    • @robmulla
      @robmulla  1 year ago

      Thanks for watching!

  • @Tonitonichoppa_o
    @Tonitonichoppa_o 1 year ago

    This is the best!! Thank you so much :D 감사합니다!!

  • @demaischta1129
    @demaischta1129 1 year ago

    This is so helpful. Thank You!!

  • @akshaymbhat9144
    @akshaymbhat9144 1 year ago +1

    Thanks for the wonderful video. It's very insightful ❤️ from India .
    Keep inspiring and aspiring always!!

    • @robmulla
      @robmulla  1 year ago

      My pleasure! So happy you liked it!

  • @Burnitall220
    @Burnitall220 2 months ago

    This is incredible!!

  • @anatoliyzavdoveev4252
    @anatoliyzavdoveev4252 6 months ago

    Fantastic video tutorial 👏👏🙏

  • @kvafsu225
    @kvafsu225 1 year ago +1

    Great lesson on machine learning. Thank you.

    • @robmulla
      @robmulla  1 year ago

      Thank you for watching. Share with a friend!

  • @tatulialphaidze90
    @tatulialphaidze90 1 year ago +1

    Thank you for this tutorial, definitely helped me out

  • @yosafatrogika3129
    @yosafatrogika3129 1 year ago +1

    so clear explanation, thanks for sharing!

    • @robmulla
      @robmulla  1 year ago

      Glad it was helpful!

  • @gabrielmoreno2554
    @gabrielmoreno2554 1 year ago +4

    Wow, this is exactly what I needed to learn to improve my COVID death predictor. Great job!

    • @robmulla
      @robmulla  1 year ago +1

      So glad you found this helpful. Thanks for watching!

  • @lamborghiniveneno8423
    @lamborghiniveneno8423 1 year ago +1

    Simply awesome tutorial😀

  • @Dongnanjie
    @Dongnanjie 3 months ago

    Thank you, Rob!

  • @lovettolaedo223
    @lovettolaedo223 7 months ago

    I enjoyed watching this as it has given me more insight into prediction.
    Kindly do a video on GDP growth forecasting using machine learning.
    Thank you.

  • @romanrodin5669
    @romanrodin5669 1 year ago +1

    Great video! Very clear and easy for understanding! Thanks a lot for clear explanation! I've got a few questions though regarding lagging data for better prediction) will jump into next video, it seems I get an answer there) thanks again!

    • @robmulla
      @robmulla  1 year ago

      Glad you liked it. Yes, the next video covers it in more detail!

  • @massoudkadivar8758
    @massoudkadivar8758 1 year ago

    Perfect job👌

  • @yourscutely
    @yourscutely 1 year ago +1

    Perfectly explained, thanks a lot

    • @robmulla
      @robmulla  1 year ago +1

      You are welcome! Glad you found it helpful. Check out parts 2 and 3 and share with a friend!

  • @prasadjayanti
    @prasadjayanti 27 days ago

    Very good explanation.

  • @super-eth8478
    @super-eth8478 1 year ago +1

    Dude your channel is a gold mine ..

    • @robmulla
      @robmulla  1 year ago

      Thanks so much for that feedback. Now share it with anyone you think might appreciate it too!

    • @super-eth8478
      @super-eth8478 1 year ago +1

      @robmulla Actually I have shared it with my friends. Cheers!

  • @adityagavali3158
    @adityagavali3158 10 months ago

    Thanks for this!

  • @gustavojuantorena
    @gustavojuantorena 1 year ago +1

    "And depending who you ask" 🤣Great video!

    • @robmulla
      @robmulla  1 year ago +1

      I’m glad you got the reference. I was hoping he would see and appreciate that part of the video.

  • @ramizajicek
    @ramizajicek 1 year ago +1

    Thank you for the great presentation

    • @robmulla
      @robmulla  1 year ago

      I appreciate you watching and commenting. Share with a friend!

  • @tomshaw7179
    @tomshaw7179 1 year ago +2

    Thanks for this video Rob. I am quite new to data science and this was really clear. Have you done a video on optimization maybe using light GBM?

  • @chrispumping
    @chrispumping 10 months ago

    Very informative and easy to understand tutorial... Thank you

    • @robmulla
      @robmulla  10 months ago

      You are welcome! Thanks for watching.

  • @raasheedpakwashi2961
    @raasheedpakwashi2961 1 year ago +1

    LEGEND...no other words needed

  • @nguyenduyta7136
    @nguyenduyta7136 1 year ago +1

    Best one I've ever seen ❤ Thanks so much.

    • @robmulla
      @robmulla  1 year ago +1

      So glad you like it. Thanks for the comment.

  • @liliyalopez8998
    @liliyalopez8998 1 year ago +4

    I just started studying ML and this tutorial is super helpful. I would like to see how you would use the model for forecasting future energy consumption though

    • @robmulla
      @robmulla  1 year ago +3

      Welcome to the wonderful world of ML Liliya! Yes, I did forget to cover that in detail but I may in a future video. It's just a simple extra step to create the future dates dataframe and run the predict and feature creation on it.
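The extra step mentioned in this reply could be sketched roughly as follows: build a dataframe of future timestamps, run the same feature creation on it, and call `predict`. Everything named here (`forecast_future`, `FEATURES`, the `ConstantModel` placeholder, the example end date) is hypothetical; in practice `model` would be the fitted regressor from the notebook.

```python
import pandas as pd

FEATURES = ["hour", "dayofweek", "month", "year"]  # assumed feature list


def create_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add calendar features derived from the DatetimeIndex."""
    df = df.copy()
    df["hour"] = df.index.hour
    df["dayofweek"] = df.index.dayofweek
    df["month"] = df.index.month
    df["year"] = df.index.year
    return df


def forecast_future(model, last_timestamp: pd.Timestamp, periods: int) -> pd.DataFrame:
    """Predict `periods` hourly steps after the last observed timestamp."""
    step = pd.Timedelta(hours=1)
    future_index = pd.date_range(last_timestamp + step, periods=periods, freq=step)
    future = create_features(pd.DataFrame(index=future_index))
    future["prediction"] = model.predict(future[FEATURES])
    return future


class ConstantModel:
    """Placeholder for a fitted regressor; any object with .predict(X) works."""

    def predict(self, X):
        return [30000.0] * len(X)


# Hypothetical last observed timestamp; replace with df.index.max() on real data.
future = forecast_future(ConstantModel(), pd.Timestamp("2018-08-03 00:00"), periods=24)
print(future.head())
```

This only works directly when every feature can be computed for future dates, which is true for calendar features but not for lags shorter than the forecast horizon.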

  • @blueradium4260
    @blueradium4260 1 year ago +1

    Brilliant video, thank you :)

    • @robmulla
      @robmulla  1 year ago +1

      Thanks for taking the time to watch.

  • @haleemahabulaimon8081
    @haleemahabulaimon8081 10 months ago

    I really appreciate it

  • @user-cl1eb2hh8o
    @user-cl1eb2hh8o 2 months ago +1

    Thank you!

  • @nandojau1
    @nandojau1 8 months ago

    nice!!!!

  • @ademhilmibozkurt7085
    @ademhilmibozkurt7085 1 year ago

    I love this video. Please make more. Thanks

    • @robmulla
      @robmulla  1 year ago

      Thanks! I appreciate the comment. Have you seen the part 2 that I have on this topic?

  • @lucasfescina
    @lucasfescina 1 year ago

    I love your videos

  • @ChrisHalden007
    @ChrisHalden007 1 year ago +1

    Great video. Thanks

    • @robmulla
      @robmulla  1 year ago

      Appreciate that 🙏

  • @wells111able
    @wells111able 9 months ago

    thanks a lot ,for a beginner

  • @kaaz4044
    @kaaz4044 9 months ago +2

    A question. I see the prediction was done on test data which are already available. This is good to see how accurate the model is but I am wondering how we can use this model (and xgboost in general) to forecast the upcoming years for which we do not have any data.

  • @MeghaKorade
    @MeghaKorade 1 year ago +3

    Hello Rob, great tutorial! I have a question: in eval_set you're using [(x_train, y_train), (x_test, y_test)], whereas in most data split practices I've seen, a validation set is separated from the training data (and is not part of either the training or testing set). Can you please check at timestamp 14:02?
    I'm trying to implement something similar on an interesting dataset and this is a great tutorial!!
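One way to get the separate validation set raised in this question is a chronological three-way split, sketched below. The helper name `time_split`, the cut-off dates, and the synthetic daily data are assumptions for illustration; the commented `fit` call shows where the validation fold would replace the test fold in `eval_set`.

```python
import pandas as pd


def time_split(df: pd.DataFrame, valid_start: str, test_start: str):
    """Split a time-indexed frame into train / validation / test by date."""
    valid_start = pd.Timestamp(valid_start)
    test_start = pd.Timestamp(test_start)
    train = df.loc[df.index < valid_start]
    valid = df.loc[(df.index >= valid_start) & (df.index < test_start)]
    test = df.loc[df.index >= test_start]
    return train, valid, test


# Synthetic daily data covering 2013 through 2015.
index = pd.date_range("2013-01-01", "2015-12-31", freq=pd.Timedelta(days=1))
df = pd.DataFrame({"PJME_MW": range(len(index))}, index=index)

train, valid, test = time_split(df, valid_start="2014-01-01", test_start="2015-01-01")
print(len(train), len(valid), len(test))

# With a scikit-learn style XGBoost regressor, monitoring and early stopping
# would then use the validation fold, leaving the test fold untouched:
# reg.fit(X_train, y_train, eval_set=[(X_train, y_train), (X_valid, y_valid)])
```

Splitting by date rather than at random keeps every validation and test row later than the training rows, which mirrors how the model would be used on future data.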

  • @andreamonicque8663
    @andreamonicque8663 8 months ago +1

    Perfect!!!!!!!

  • @revathyb1663
    @revathyb1663 9 months ago

    Great video. How are you taking the sequence information into account while training the XGB model? Also, what method do you suggest for dealing with multiple time series? Say, for example, I have energy consumption from multiple regions and would like to predict for each region.

  • @user-cf5pf7on7k
    @user-cf5pf7on7k 7 months ago

    Great video - you briefly mentioned stationarity in the beginning, but you didn't actually test for it. This data looks stationary to me, but if it wasn't would that cause a problem? Or is that only an issue with ARIMA models? Thanks!

  • @muhammadkashif7263
    @muhammadkashif7263 1 year ago +1

    Amazing season ❤

  • @Lnd2345
    @Lnd2345 1 year ago +1

    Great video, thanks.

    • @robmulla
      @robmulla  1 year ago +1

      Glad you liked it! Thanks for the feedback.

  • @a.a.elghawas
    @a.a.elghawas 11 months ago +1

    Cool video Rob!

    • @robmulla
      @robmulla  11 months ago

      Thanks for watching!

  • @Mvobrito
    @Mvobrito 1 year ago +4

    Great video!
    If the goal was prediction only, and not inference (meaning you don't care about what's driving the energy consumption), you can use the energy consumption of the previous days as a feature for the model.
    When predicting consumption at T, you can use T-1, T-2, .. T-x.
    And even a moving average as feature as well.

    • @robmulla
      @robmulla  1 year ago +1

      I totally agree! It all depends on how far in the future (forecasting horizon) you are attempting to predict.

  • @legenddairy8346
    @legenddairy8346 8 days ago

    Thanks!

  • @THE8SFN
    @THE8SFN 1 year ago +1

    great tutorial

  • @JyotishmanHazarika-hs3ku
    @JyotishmanHazarika-hs3ku 2 months ago

    GOATED

  • @datalyfe5386
    @datalyfe5386 1 year ago +1

    Just came across your channel, awesome content!

    • @robmulla
      @robmulla  1 year ago

      Welcome aboard! Glad you like it.

  • @mirror1023
    @mirror1023 11 months ago +1

    Amazing video

  • @santinonanini6107
    @santinonanini6107 7 months ago +1

    Should you not split the training data into train and validation sets, such that you can use validation set instead of test set during training ? (when you use "eval_set" parameter ?)

  • @gui250493
    @gui250493 1 year ago +1

    Well done!

  • @datasciencesolutions2361
    @datasciencesolutions2361 1 year ago +1

    Great job sincerely!

    • @robmulla
      @robmulla  1 year ago

      Thanks for the feedback!

  • @adityaghai220
    @adityaghai220 3 months ago

    amazing video

  • @AisyahAthifa
    @AisyahAthifa 1 year ago +1

    Nice tutorial 👍

  • @AQ-jh5fr
    @AQ-jh5fr 1 year ago +1

    Nice tutorial and when you said quick tutorial you sure meant it xD, I had to pause like a 100 times. but still thanks for the video

    • @robmulla
      @robmulla  1 year ago

      Glad you liked the video. I'd rather it be too fast than too slow :D - you can always slow down the playback speed if that helps.

  • @cm3462
    @cm3462 8 days ago

    Lovely

  • @dreamphoenix
    @dreamphoenix 1 year ago +1

    Thank you.

  • @magicdimension6073
    @magicdimension6073 1 year ago

    Have you tried SARIMA models for time series forecasting? I'm curious which performs better. Excellent content, Rob!

  • @mohamednedal
    @mohamednedal 7 months ago

    Hi Rob, Great tutorial! Could you please make a tutorial on how to use Shapley values to interpret LSTM models for timeseries forecasting?

  • @ErikaGPF
    @ErikaGPF 1 year ago +2

    Hi, thanks for the video! Pretty good! I have a question: wouldn't it improve your model to use the actual 'PJME_MW' as input? It's an honest question; I ask because I saw other time series forecasting examples that use the metric you want to predict as an input as well. Thank you!

    • @robmulla
      @robmulla  1 year ago

      Great question! If you use the actual value for a future time step you would be leaking information. Check out my part 2 video where I talk about the forecasting horizon. Hope that helps!

  • @marceelrf
    @marceelrf 1 year ago +1

    Great!!!

  • @wazzadec16
    @wazzadec16 1 year ago +5

    FYI for anybody doing this recently: the part that plots the training and test sets together and marks the split with a dotted line has to be modified.
    Before: '01-01-2015'
    After:
    ax.axvline(x=dt.datetime(2015, 1, 1))  # requires: import datetime as dt
    matplotlib now needs a datetime value there rather than a string. I guess that's because the index was converted with to_datetime?

    • @shrunkhalawankhede2611
      @shrunkhalawankhede2611 7 months ago

      from datetime import datetime
      ax.axvline(x=datetime(2015,1,1), color='black', ls='--')

  • @neclis7777
    @neclis7777 1 year ago +2

    Excellent video ! For weather, I suggest you look into HDD and CDD (heating degree days and cooling degree days) which focus on the amount of heating and cooling rather than the mean temperature.

    • @robmulla
      @robmulla  1 year ago +2

      Thanks for the tips! I'm not familiar with those but I will look into it. The one main issue I see when people are training forecasting models like this is using the ground truth weather for future dates- which are not available at the time of prediction. That's why I think it's best to use forecast values from the historic dates.
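For readers unfamiliar with the terms in this thread, heating and cooling degree days measure how far the daily mean temperature falls below or rises above a base temperature. A minimal sketch, assuming temperatures in °F and the commonly used 65 °F base (the base, the function name `degree_days`, and the sample values are assumptions, not from the video):

```python
import pandas as pd

BASE_TEMP_F = 65.0  # assumed base temperature; conventions vary by region


def degree_days(daily_mean_temp_f: pd.Series, base: float = BASE_TEMP_F) -> pd.DataFrame:
    """Compute heating (HDD) and cooling (CDD) degree days per day."""
    # HDD: degrees below the base (heating demand); zero on warm days.
    hdd = (base - daily_mean_temp_f).clip(lower=0)
    # CDD: degrees above the base (cooling demand); zero on cold days.
    cdd = (daily_mean_temp_f - base).clip(lower=0)
    return pd.DataFrame({"hdd": hdd, "cdd": cdd})


# Made-up daily mean temperatures for three days.
temps = pd.Series(
    [30.0, 65.0, 80.0],
    index=pd.date_range("2015-01-01", periods=3, freq=pd.Timedelta(days=1)),
    name="temp_f",
)
dd = degree_days(temps)
print(dd)
```

As the reply notes, such features are only fair to use if they come from weather forecasts that were available at prediction time, not from the observed weather.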

  • @vlplbl85
    @vlplbl85 1 year ago +1

    Great video. Don't you think adding time lags would increase performance the most? In my time series forecasts I find them very helpful, especially with seasonality.

    • @robmulla
      @robmulla  1 year ago +1

      Hey Vladimir - thanks for the feedback. You are correct, lag variables can be very helpful. You need to remember, however that your lag variables can not be shorter than your forecasting horizon. So if you add a 1 week lag variable, then your model would not be able to predict further than 1 week. 1 year lags can be very helpful though.
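The constraint in this reply (a lag cannot be shorter than the forecasting horizon) can be enforced in code. The sketch below is one possible way to do it; `add_safe_lags`, its parameters, and the synthetic data are hypothetical and assume a unique DatetimeIndex.

```python
import pandas as pd


def add_safe_lags(df: pd.DataFrame, target: str, lags: dict, horizon: pd.Timedelta) -> pd.DataFrame:
    """Add timestamp-based lag features, rejecting lags shorter than the horizon."""
    df = df.copy()
    for name, lag in lags.items():
        if lag < horizon:
            raise ValueError(f"{name}: lag {lag} is shorter than the horizon {horizon}")
        # Look up the target at (timestamp - lag); missing timestamps become NaN.
        lookup = df.index - lag
        df[name] = df[target].reindex(lookup).to_numpy()
    return df


# Ten days of synthetic hourly data.
index = pd.date_range("2015-01-01", periods=240, freq=pd.Timedelta(hours=1))
demo_input = pd.DataFrame({"PJME_MW": range(240)}, index=index)

demo = add_safe_lags(
    demo_input,
    target="PJME_MW",
    lags={"lag_7d": pd.Timedelta(days=7)},
    horizon=pd.Timedelta(days=7),
)
print(demo.iloc[166:170])
```

Looking values up by timestamp rather than by row position keeps the lag correct even if some hours are missing from the data.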

  • @abinsharafm.s5168
    @abinsharafm.s5168 1 year ago +1

    I follow you on twitch.. you should definitely do a video on how you setup your system for data science ( I mean you had Linux working with your ide and you were pulling data from the websites (api).. i found that very cool ! )

    • @robmulla
      @robmulla  1 year ago +2

      Oh. Great idea! I’ve thought about doing this but need to think more about how to best explain my setup.