Multiple Time Series Forecasting With Scikit-Learn

  • Published: Jul 6, 2021
  • You have a lot of time series and want to predict the next step (or steps). What should you do now? Train a model for each series? Is there a way to fit one model for all the series together? Which is better?
    I have seen many data scientists approach this problem by creating a separate model for each product. Although this is one possible solution, it's not likely to be the best.
    Here I will demonstrate how to train a single model to forecast multiple time series at the same time. This technique usually creates powerful models that help teams win machine learning competitions and can be used in your project.
    And you don’t need deep learning models to do that!
    Timestamps
    0:00 Intro
    1:28 Melt the data, stack the series
    7:18 Split the data
    10:29 Set up a 1-step target
    13:57 Create 4 fundamental features (feature engineering)
    26:16 Choose an evaluation metric
    31:34 Establish a baseline
    35:18 Train the model
    37:34 Evaluate the model
    39:11 Extend the model to multi-step forecasting
    43:04 Forecast new data
    45:37 Next steps
    Code: github.com/ledmaster/english_...
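
The steps listed in the timestamps above (melt, split, 1-step target, features, baseline, train, evaluate) can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the data is synthetic, and the `diff_sales_1` feature name, the split week, the MAE metric, and the model settings are all assumptions made for the example. Only `Product_Code`, `Sales`, and `lag_sales_1` echo names that appear on this page.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for a wide sales table: one row per product, one column per week.
rng = np.random.default_rng(0)
n_products, n_weeks = 5, 20
wide = pd.DataFrame(
    rng.poisson(lam=10, size=(n_products, n_weeks)),
    columns=[f"W{i}" for i in range(n_weeks)],
)
wide.insert(0, "Product_Code", [f"P{i}" for i in range(n_products)])

# Melt: stack all series into one long table (one row per product and week).
long = wide.melt(id_vars="Product_Code", var_name="Week", value_name="Sales")
long["Week"] = long["Week"].str[1:].astype(int)
long = long.sort_values(["Product_Code", "Week"]).reset_index(drop=True)

# Features and 1-step target, computed per product so values never cross series.
grouped = long.groupby("Product_Code")["Sales"]
long["lag_sales_1"] = grouped.shift(1)    # last week's sales
long["diff_sales_1"] = grouped.diff(1)    # change versus last week (assumed feature)
long["target"] = grouped.shift(-1)        # next week's sales

# Drop rows where a lag or the target does not exist.
data = long.dropna(subset=["lag_sales_1", "diff_sales_1", "target"])

# Time-based split: earlier weeks for training, later weeks for validation.
split_week = 15  # assumed cut-off
train = data[data["Week"] < split_week]
valid = data[data["Week"] >= split_week]

features = ["Sales", "lag_sales_1", "diff_sales_1"]
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(train[features], train["target"])
preds = model.predict(valid[features])


def mae(y_true, y_pred):
    """Mean absolute error (assumed metric for this sketch)."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


# Baseline: next week's sales equal this week's sales.
baseline_mae = mae(valid["target"], valid["Sales"])
model_mae = mae(valid["target"], preds)
print(f"baseline MAE: {baseline_mae:.3f}, model MAE: {model_mae:.3f}")
```

A single model sees every product's rows during training, which is what makes this one model for all series rather than one model per series.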
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    // SUPPORT THE CHANNEL 👇❤️
    Sign up for a Coursera course:
    imp.i384100.net/EaDmQe
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    // SOCIAL MEDIA
    LinkedIn: / mariofilho
    Kaggle: kaggle.com/mariofilho
    Twitter: / mariofilhoml
    Blog: forecastegy.com
    Some links above can be from partnerships where I get a commission if you buy a product, without any additional cost to you. Thanks for the support!

Comments • 42

  • @nehan.2199
    @nehan.2199 2 years ago +1

    This is very helpful thank you! Where can I find the dataset to download?

    • @Forecastegy
      @Forecastegy  2 years ago +1

      Great, here it is: archive.ics.uci.edu/ml/datasets/Sales_Transactions_Dataset_Weekly

  • @tom199520000
    @tom199520000 7 months ago +3

    I just watched this amazing video after your feature selection engineering video! I have no idea why this video isn't popular! Respect for the effort you put into this!

  • @towhidultonmoy3046
    @towhidultonmoy3046 2 years ago

    Keep it up! You have a long way to go brother. Best wishes!

  • @Luckasborges
    @Luckasborges 2 years ago +2

    Learning ML and English together! Here we go! hehe
    Congrats on the new channel, Mario!

  • @Septumsempra8818
    @Septumsempra8818 1 year ago

    Are we going to get a video on cross-validation and selecting the right model?
    Your time series videos have been a wealth of knowledge.

  • @sancarlitos1125
    @sancarlitos1125 2 years ago

    Excellent explanation! Thanks for sharing it! I was working on a similar forecast, and I was wondering: when the product number changes, let's say from 0 to 1, should the rolling window and the lag be modified? Otherwise we would be using information from the previous product.
    Thank you very much!

  • @ElChe-Ko
    @ElChe-Ko 1 year ago

    Nice! It would be interesting to see what to do if the time series have different lengths.

  • @Dragnar21
    @Dragnar21 2 years ago

    First of all, thank you for the video and the extraordinary explanation. I would like to know how you would structure your data if the series are not the same length?

  • @kaianchan7768
    @kaianchan7768 2 years ago

    Thanks for this tutorial. Will you provide some videos about many features? Thanks!

  • @pcdowling
    @pcdowling 8 months ago

    Thank you.

  • @diegosccp09
    @diegosccp09 2 years ago +1

    You are a legend. I'm using this for a master's assessment.

  • @igorkuivjogifernandes3012
    @igorkuivjogifernandes3012 2 years ago

    Hi, Mario. Awesome video, it helped me a lot. One question: what could we do if the train set has uneven periodicity (2 days for one product, 7 days for another, 3 days for another, and so on, or even worse, some products have only 1 or 2 observations), but my test set has even periodicity (every product has a periodicity of 7 days)?

  • @JoaoVitorBRgomes
    @JoaoVitorBRgomes 2 years ago +1

    You're the man!

  • @alirezajabbari2537
    @alirezajabbari2537 2 years ago +2

    Thank you Mario!
    You saved me in my 4th year project
    ciao

  • @anwarsaidan3959
    @anwarsaidan3959 12 days ago

    Thank you very much for this amazing video!
    Can we use cross-validation for hyperparameter tuning in the case of RandomForest with time series data?

  • @faraza5161
    @faraza5161 2 years ago +1

    The SimpleImputer will fill the missing values with the mean of the entire column. Shouldn't that be done per product as well?
    Thanks for a wonderful lecture btw :-)

  • @Mohammad-vr9dj
    @Mohammad-vr9dj 1 year ago

    Thanks for the useful video. Is it possible to model independent spatial sequences simultaneously? I have a dataset which consists of 1000 independent spatial sequences with dimension 2*7 (2 for x and y, and length 7 for the positions at each time). I implemented it with a simple RNN, LSTM and GRU. Can I do it with transformers (attention mechanism)? Could you point me to a practical example?

  • @Orlandobelli
    @Orlandobelli 2 years ago

    Good video. Can we model multiple time series with an ARIMA model?

  • @Mohammad-vr9dj
    @Mohammad-vr9dj 1 year ago

    Thanks for your useful video. If our dataset has two target columns, how can we write the code?

  • @vamsikrishnabhadragiri9742
    @vamsikrishnabhadragiri9742 2 years ago +4

    Why wasn't standardization applied to the data? Since sales for different products fall in different ranges, doesn't that affect the model's performance?

  • @zulhas9
    @zulhas9 1 year ago

    Hi Mario, thanks for the wonderful presentation. One question: how can you use the "Sales" feature to predict sales? With that feature in the model, you have to pass it as an argument when you call the .predict function. In reality, you would not have that information available.

  • @user-fh7gb2yf5z
    @user-fh7gb2yf5z 1 year ago

    Mario, good afternoon. Do you have any tips for using an LSTM for multi-step-ahead predictions in a MISO system?

  • @Learner_123
    @Learner_123 1 year ago +4

    Thank you for making the topic simple. Since you have combined all the product sales to train and validate your model, How can one use this model to predict sales for 'any single' product only?

    • @zabmaz10
      @zabmaz10 1 year ago

      I have the same question, but I guess one way is to convert the product code into dummy variables and use those as features in the random forest.

  • @StatiR_br
    @StatiR_br 2 years ago +3

    Hello Mario! First of all, congratulations on the video! I have a question: in this context, we have several products (Product_Code) and only one fitted model. With the dataset structured this way, will (or could) the model use, for example, the last 'lag_sales_1' of one Product_Code to predict the sales of the next Product_Code? The model won't know when a row belongs to one Product_Code and when it belongs to another. Or am I getting confused? Thanks in advance!

    • @guilhermeparreira5448
      @guilhermeparreira5448 1 year ago

      I agree with you. This modeling approach would only work if all products had similar average sales (and even then, barely). I think the most correct approach would be to also include the product code as a covariate in the model.
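
The concern in this thread can be illustrated with a small sketch. It is not taken from the video's code, and it makes no claim about what the video does: the toy numbers, the `lag_naive` column, and the `product_id` column are assumptions for the example. It shows how a plain shift over the stacked table leaks one product's last value into the next product's first row, how a grouped shift avoids that, and one simple way to expose the product code to a tree-based model.

```python
import pandas as pd

# Two stacked series with very different sales levels (toy numbers).
long = pd.DataFrame({
    "Product_Code": ["P1", "P1", "P1", "P2", "P2", "P2"],
    "Week": [0, 1, 2, 0, 1, 2],
    "Sales": [10, 12, 11, 100, 90, 95],
})

# Naive lag over the whole stacked table: the first P2 row receives P1's last value.
long["lag_naive"] = long["Sales"].shift(1)

# Grouped lag: each product is shifted on its own, so the first row of every
# product has no lag (NaN) instead of a value borrowed from another product.
long["lag_sales_1"] = long.groupby("Product_Code")["Sales"].shift(1)

# Product code as an integer feature (hypothetical column name). One-hot
# encoding is another option, as suggested in the reply above.
long["product_id"] = long["Product_Code"].astype("category").cat.codes

print(long)
```

With the grouped lag in place, the rows for the first week of each product can be dropped or imputed before training.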

  • @mamyrak1114
    @mamyrak1114 1 month ago

    Can I follow the same process if, instead of a week, I have a date like yyyy-mm-dd? And how should I handle the year?

  • @jackcarter97
    @jackcarter97 5 months ago

    How do I find the season effect features?

  • @Gabriel-iw3hc
    @Gabriel-iw3hc 1 year ago

    How do I forecast the future with this method?
    E.g., forecast week 52?
    I think I would also need to forecast the other series used as features.

  • @stonesupermaster
    @stonesupermaster 1 year ago +2

    Hello Mario, I have a question: how does the model know that we're trying to predict multiple products at once? I've been trying to train a model to predict the sales of 2000 SKUs, and my main concern now is how to do it efficiently. I watched everything that you did but I still have the same problem. Do you know where I can find an example of it? Thank you very much for your video.

    • @AskApt05
      @AskApt05 24 days ago

      Hi @stonesupermaster, I'm facing the same problem. Have you found a solution? It would be really helpful if you could share it. Thanks!

  • @VG-yw2mp
    @VG-yw2mp 1 year ago +1

    Why don't we use product_code as one of the features while training?

  • @ozan4702
    @ozan4702 2 years ago

    Why should the difference be a feature? Given sales and lag sales, the difference is already known.

  • @XiboquinhaMilGrau
    @XiboquinhaMilGrau 1 year ago +1

    I didn't see that one coming, haha

  • @efremyohannes2334
    @efremyohannes2334 2 years ago

    How do I model time series for unevenly distributed data using scikit-learn?

  • @aacharyadhruvi8301
    @aacharyadhruvi8301 2 years ago

    Where can I get Sales_Transactions_Dataset_Weekly.csv?

    • @Forecastegy
      @Forecastegy  2 years ago

      Here archive.ics.uci.edu/ml/datasets/Sales_Transactions_Dataset_Weekly

  • @vivianealveslima9358
    @vivianealveslima9358 2 years ago +1

    The code on GitHub is unavailable =S

    • @Forecastegy
      @Forecastegy  2 years ago

      Oops! Fixed, this is the right link: github.com/ledmaster/english_tutorials/tree/main/multiple_time_series