Multiple Time Series Forecasting With Scikit-Learn

  • Published: Sep 27, 2024
  • You have a lot of time series and want to predict the next step (or steps). What should you do? Train a model for each series? Is there a way to fit one model for all the series together? Which is better?
    I have seen many data scientists approach this problem by creating a separate model for each product. Although this is one possible solution, it's not likely to be the best.
    Here I will demonstrate how to train a single model to forecast multiple time series at the same time. This technique usually creates powerful models that help teams win machine learning competitions and can be used in your project.
    And you don’t need deep learning models to do that!
    Timestamps
    0:00 Intro
    1:28 Melt the data, stack the series
    7:18 Split the data
    10:29 Set up a 1-step target
    13:57 Create 4 fundamental features (feature engineering)
    26:16 Choose an evaluation metric
    31:34 Establish a baseline
    35:18 Train the model
    37:34 Evaluate the model
    39:11 Extend the model to multi-step forecasting
    43:04 Forecast new data
    45:37 Next steps
    Code: github.com/led...
    // SOCIAL MEDIA
    LinkedIn: / mariofilho
    Kaggle: kaggle.com/mar...
    Twitter: / mariofilhoml
    Blog: forecastegy.com
    Some links above can be from partnerships where I get a commission if you buy a product, without any additional cost to you. Thanks for the support!
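
As a rough illustration of the workflow listed in the timestamps (melt and stack the series, set up a 1-step target, build features per product, split by time, establish a baseline, train one model), here is a minimal Python sketch. It is not the code from the video's repository. The synthetic data, the feature names other than `lag_sales_1`, the week-40 split, the MAE metric, and the random forest settings are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in for a wide weekly sales table: one row per product,
# one column per week (W0..W51). The values are made up for illustration.
rng = np.random.default_rng(0)
n_products, n_weeks = 20, 52
rates = rng.uniform(2, 30, size=(n_products, 1))  # one sales level per product
wide = pd.DataFrame(
    rng.poisson(lam=rates, size=(n_products, n_weeks)),
    columns=[f"W{i}" for i in range(n_weeks)],
)
wide.insert(0, "Product_Code", [f"P{i}" for i in range(n_products)])

# Melt: stack all series into one long table (product, week, sales).
stacked = wide.melt(id_vars="Product_Code", var_name="Week", value_name="Sales")
stacked["Week"] = stacked["Week"].str[1:].astype(int)
stacked = stacked.sort_values(["Product_Code", "Week"]).reset_index(drop=True)

# Target and features are computed within each product (groupby), so one
# product's history never leaks into another product's rows.
by_product = stacked.groupby("Product_Code")["Sales"]
stacked["target"] = by_product.shift(-1)       # 1-step target: next week's sales
stacked["lag_sales_1"] = by_product.shift(1)   # last week's sales
stacked["diff_sales_1"] = stacked["Sales"] - stacked["lag_sales_1"]
stacked["mean_sales_4"] = by_product.transform(lambda s: s.rolling(4).mean())

# Rows without a full feature window or without a target are dropped here;
# imputing them is an alternative.
data = stacked.dropna()

# Time-based split: train on earlier weeks, validate on later weeks.
features = ["Sales", "lag_sales_1", "diff_sales_1", "mean_sales_4"]
train = data[data["Week"] < 40]
valid = data[data["Week"] >= 40]

# Baseline: predict that next week's sales equal this week's sales.
baseline_mae = mean_absolute_error(valid["target"], valid["Sales"])

# One model fitted on the rows of all products together.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(train[features], train["target"])
model_mae = mean_absolute_error(valid["target"], model.predict(valid[features]))

print(f"baseline MAE: {baseline_mae:.3f}")
print(f"model MAE:    {model_mae:.3f}")
```

To forecast a single product with this kind of model, one would filter the feature rows for that product and pass only those rows to `model.predict`. Whether to also add the product code itself as a feature is a separate design choice that this sketch leaves out.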

Comments • 44

  • @nehan.2199 2 years ago +1

    This is very helpful, thank you! Where can I find the dataset to download?

    • @Forecastegy 2 years ago +1

      Great, here it is: archive.ics.uci.edu/ml/datasets/Sales_Transactions_Dataset_Weekly

  • @LifeKiT-i 11 months ago +3

    I just checked out this amazing video after your feature selection engineering video! I have no idea why this video isn't popular!!! Respect for the effort you spent on this!

  • @Luckasborges 3 years ago +2

    Learning ML and English together! Here we go! hehe
    Congrats on the new channel, Mario!

  • @alirezajabbari2537 2 years ago +2

    Thank you Mario!
    You saved me in my 4th year project
    ciao

  • @diegosccp09 2 years ago +1

    You are a legend! I'm using this for a master's assessment.

  • @towhidultonmoy3046 2 years ago

    Keep it up! You have a long way to go brother. Best wishes!

  • @ElChe-Ko 1 year ago

    Nice! It would be interesting to see what to do if the time series have different lengths.

  • @sarasatti1070 2 months ago

    Hi Mario, and thank you for a very clear and concise explanation. One question I have is, how would you handle it if several of the products are only selling intermittently such that there are many zeros in the series?

  • @Learner_123 2 years ago +4

    Thank you for making the topic simple. Since you combined all the product sales to train and validate your model, how can one use this model to predict sales for any single product only?

    • @zabmaz10 2 years ago

      I have the same question, but I guess one way is to convert the product code into dummy variables and use those as features in the random forest.

  • @vamsikrishnabhadragiri9742 3 years ago +4

    Why didn't you standardize the data? Since sales for different products fall in different ranges, doesn't that affect the model's performance?

  • @kaianchan7768 2 years ago

    Thanks for this tutorial. Will you provide some videos about many features? Thanks!

  • @VG-yw2mp 1 year ago +1

    Why don't we use product_code as one of the features while training?

  • @mamyrak1114 5 months ago

    Can I follow the same process if, instead of a week number, I have a date like yyyy-mm-dd? And how should I handle the year?

  • @jackcarter97 8 months ago

    How do I find the season effect features?

  • @faraza5161 2 years ago +1

    SimpleImputer fills the missing values with the mean of the entire column. Shouldn't that be done per product as well?
    Thanks for a wonderful lecture btw :-)

  • @StatiR_br 3 years ago +3

    Hi Mario! First of all, congratulations on the video! I have a question: in this context, we have several products (Product_Code) and only one fitted model. Given how the dataset is laid out, will (or could) the model use, for example, the last 'lag_sales_1' of one Product_Code to predict the sales of the next Product_Code? After all, the model won't know where one Product_Code ends and the next begins. Or am I getting confused? Thanks in advance!

    • @guilhermeparreira5448 2 years ago

      I agree with you. This modeling approach would only work if all the products had similar average sales (and even then, barely). I think the more correct approach would be to also include the product code as a covariate in the model.

  • @zulhas9 1 year ago

    Hi Mario, thanks for the wonderful presentation. One question: how can you use the "Sales" feature to predict sales? If it is a feature, you have to pass it as an argument when you call .predict. In reality, you would not have that information available.

  • @Gabriel-iw3hc 1 year ago

    How do I forecast the future with this method?
    E.g., forecast week 52?
    I think I would need to forecast the other series too, to build the other features.

  • @RodrigoLima-o5b 1 year ago

    Mario, good afternoon. Do you have any tips for using an LSTM for multi-step-ahead predictions in a MISO system?

  • @sancarlitos1125 2 years ago

    Excellent explanation! Thanks for sharing it! I was working on a similar forecast, and I was wondering: when the product number changes, let's say from 0 to 1, should the rolling window and the lag be modified? Otherwise we would be using the information of the previous product.
    Thank you very much!

  • @Mohammad-vr9dj 2 years ago

    Thanks for your useful video. If our dataset has two target columns, how can we write the code?

  • @Orlandobelli 2 years ago

    Good video. Can we model multiple time series with an ARIMA model?

  • @Dragnar21 2 years ago

    First of all, thank you for the video and the extraordinary explanation. I would like to know how you would structure your data if the series are not the same length.

  • @Mohammad-vr9dj 1 year ago

    Thanks for the useful video. Is it possible to model independent spatial sequences simultaneously? I have a dataset that consists of 1000 independent spatial sequences with dimension 2*7 (2 for x and y, and length 7 for the positions at each time). I implemented it with a simple RNN, LSTM and GRU. Can I do it with transformers (attention mechanism)? Could you point me to a practical example?

  • @lebesgue-integral 2 years ago

    Hi, Mario. Awesome video, it helped me a lot. One question: what could we do if the train set has uneven periodicity (2 days for one product, 7 days for another, 3 days for another and so on, or even worse, some products have only 1 or 2 observations), but my test set has even periodicity (every product has a periodicity of 7 days)?

  • @efremyohannes2334 3 years ago

    How do I model time series for unevenly distributed data using scikit-learn?

  • @aacharyadhruvi8301 2 years ago

    Where can I get Sales_Transactions_Dataset_Weekly.csv?

    • @Forecastegy 2 years ago

      Here: archive.ics.uci.edu/ml/datasets/Sales_Transactions_Dataset_Weekly

  • @anwarsaidan3959 3 months ago

    Thank you very much for this amazing video!
    Can we use cross-validation for hyperparameter tuning in the case of RandomForest with time series data?

  • @jackcarter97 8 months ago

    How do I find the season effect features?

  • @necuspam 2 months ago +1

    A more intriguing question is: how do you train a model based on thousands of time series, each determined by multiple parameters, and then simulate or forecast a single time series based on a new set of those parameters?

  • @stonesupermaster 1 year ago +2

    Hello Mario, I have a question: how does the model know that we're trying to predict multiple products at once? I've been trying to train a model to predict the sales of 2000 SKUs, and my main concern now is how to do it efficiently. I watched everything that you did, but I still have the same problem. Do you know where I can find an example of it? Thank you very much for your video.

    • @AskApt05 4 months ago

      Hi @stonesupermaster, I'm facing the same problem. Have you found a solution? It would be really helpful if you could share it. Thanks!

  • @VamosCoringar 2 years ago +1

    I didn't see that one coming, haha

  • @ozan4702 2 years ago

    Why should the difference be a feature? Given sales and lagged sales, the difference is already known.

  • @JoaoVitorBRgomes 2 years ago +1

    You're the man!

  • @Septumsempra8818 1 year ago

    Are we going to get a video on cross-validation and selecting the right model?
    Your time series videos have been a wealth of knowledge.

  • @pcdowling 1 year ago

    Thank you.

  • @vivianealveslima9358 3 years ago +1

    The code on GitHub is unavailable =S

    • @Forecastegy 3 years ago

      Oops! Fixed, this is the right link: github.com/ledmaster/english_tutorials/tree/main/multiple_time_series