Multiple Time Series modeling using Apache Spark and Facebook Prophet

  • Published: 18 Sep 2024
  • #datascience #machinelearning #timeseries
    This video is part of the Time Series playlist here - • Time Series Modelling ...
    One major challenge with time series in the real world is dealing with multiple time series, whether it is retailers with millions of products, each with a different sales cycle, or manufacturers with hundreds of machines. In such cases we need systems and solutions that distribute time series model building across nodes to enable high parallelism. In this video we will see how to use Facebook Prophet to model each series and Apache Spark to distribute the work across multiple nodes.
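
The pattern described above can be sketched roughly as follows. This is a minimal sketch, not the notebook from the video: the column names `series_id`, `ds`, `y`, the 30-day horizon, and the helper names `make_forecaster` and `forecast_all_series` are assumptions made for illustration. It uses `applyInPandas` (Spark 3.0+) and the `prophet` package (formerly `fbprophet`); older code often uses a grouped-map `pandas_udf` for the same purpose.

```python
# Output schema for applyInPandas, written as a Spark DDL string.
RESULT_SCHEMA = (
    "series_id string, ds timestamp, "
    "yhat double, yhat_lower double, yhat_upper double"
)


def make_forecaster(horizon_days):
    """Build the function Spark calls once per series.

    The returned function takes exactly one argument (the pandas
    DataFrame of one group). Spark appears to decide from the parameter
    count whether to also pass the grouping key, so the horizon is
    captured in a closure instead of being an extra parameter.
    """

    def forecast_one_series(pdf):
        # Imported lazily so the import happens on the executor.
        from prophet import Prophet

        model = Prophet()
        model.fit(pdf[["ds", "y"]])
        future = model.make_future_dataframe(periods=horizon_days, freq="D")
        forecast = model.predict(future)
        # Tag every forecast row with the series it belongs to.
        forecast = forecast.assign(series_id=pdf["series_id"].iloc[0])
        return forecast[["series_id", "ds", "yhat", "yhat_lower", "yhat_upper"]]

    return forecast_one_series


def forecast_all_series(sdf, horizon_days=30):
    """sdf: Spark DataFrame with columns series_id, ds, y (assumed names)."""
    return sdf.groupBy("series_id").applyInPandas(
        make_forecaster(horizon_days), schema=RESULT_SCHEMA
    )
```

Each group is handed to one task as a pandas DataFrame, so every individual series still has to fit in the memory of a single executor.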

Comments • 25

  • @jamespaz4333 2 years ago

    I saw your previous video about Prophet; this version is insane. It ran in a matter of milliseconds! 😃

  • @user-mm6qr2jz8q 6 months ago

    You are a gem!

  • @cuttell2000 1 year ago

    Coming back a year later. Thank you.

  • @Karenshow 2 years ago

    Hello, this is a great tutorial for the case where you have multiple products and one model. Could you please do a tutorial on testing multiple models across multiple products?
    Thanks

  • @user-gd6xu2et1l 4 years ago +1

    Thank you very much for this video!

  • @giacomoferrari9408 3 years ago +2

    Thanks for the video, first of all.
    I would like to ask: if I am using such models for time series analysis of financial data, how do I constantly update the data and retrain the model to avoid data drift?

    • @AIEngineeringLife 3 years ago

      Retraining in this case means building new models, since incremental training might not work. So if we see drift, we take the old data, add the new data, and train again.
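
The retrain-from-scratch approach in this reply could look roughly like the sketch below. The helper names `merge_history` and `retrain`, and the `ds`/`y` column names, are assumptions for illustration. A fresh `Prophet()` object is created each time because a Prophet object can only be fit once.

```python
import pandas as pd


def merge_history(old, new):
    """Combine old and new observations of one series (columns ds, y)."""
    combined = pd.concat([old, new], ignore_index=True)
    combined["ds"] = pd.to_datetime(combined["ds"])
    # If a timestamp appears in both frames, keep the newer value.
    combined = combined.drop_duplicates(subset="ds", keep="last")
    return combined.sort_values("ds").reset_index(drop=True)


def retrain(old, new):
    """Fit a brand-new model on the full, updated history."""
    from prophet import Prophet  # imported lazily; only needed when fitting

    history = merge_history(old, new)
    model = Prophet()  # new object: an already fitted one cannot be refit
    model.fit(history)
    return model
```

In the distributed setting, the same `merge_history` step would run before the per-series fitting function, so that every series is refit on its complete history.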

  • @sumankumar5126 2 years ago

    Thank you so much for this video.

  • @user-gd6xu2et1l 3 years ago +1

    @AIEngineering, can you tell me how to run multiple time series using SARIMAX or XGBoost with PySpark? Can you also recommend any literature on multiple time series forecasting?

    • @AIEngineeringLife 3 years ago +1

      For XGBoost you would convert each time series to rows and train on those anyway. You can check my pyspark-xgboost video on ML modelling, which works the same way, since we are converting multiple time series into multiple observations to model.
      For SARIMAX you can follow the same approach as in this video, but instead of Facebook Prophet you need to use statsmodels or whichever package you use to model SARIMAX.
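
One way to read "converting multiple time series into multiple observations" is to build lag features per series and stack everything into a single training table. The sketch below shows only that conversion step in pandas; the function name `make_lag_rows`, the column names, and the choice of lags are assumptions, and the pyspark-xgboost video may do it differently.

```python
import pandas as pd


def make_lag_rows(df, lags=(1, 2, 7)):
    """Turn many series (series_id, ds, y) into one supervised table."""
    df = df.sort_values(["series_id", "ds"]).copy()
    grouped = df.groupby("series_id")["y"]
    for lag in lags:
        # Shift within each series so values never leak across series.
        df[f"lag_{lag}"] = grouped.shift(lag)
    lag_columns = [f"lag_{lag}" for lag in lags]
    # The first rows of each series have no full lag history; drop them.
    return df.dropna(subset=lag_columns).reset_index(drop=True)
```

The resulting rows, with the lag columns as features and `y` as the target, could then be fed to a single pooled regressor such as XGBoost.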

  • @vinodsawant7835 3 years ago +1

    @AIEngineering, Thanks a lot for the video, it will help me in my current project.
    Can we save these models and use them for prediction in spark itself? I will be trying it anyway, but your view on it would be much appreciated.

    • @AIEngineeringLife 3 years ago +1

      Yes, you can. You can change the function that creates the Prophet model to load the saved models instead and run inference with them.
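
A possible shape for saving and loading one model per series is sketched below, using Prophet's JSON serialization (`model_to_json` and `model_from_json` from `prophet.serialize`). The helper names, the file naming scheme, and the `model_dir` location are assumptions; on a cluster the directory would have to be storage that every executor can reach.

```python
import os


def model_path(model_dir, series_id):
    """Build a file path for one series (hypothetical naming scheme)."""
    # Replace characters that are awkward in file names.
    safe_id = "".join(
        c if c.isalnum() or c in "-_" else "_" for c in str(series_id)
    )
    return os.path.join(model_dir, f"prophet_{safe_id}.json")


def save_model(model, model_dir, series_id):
    """Serialize a fitted Prophet model to JSON, e.g. right after fit()."""
    from prophet.serialize import model_to_json

    os.makedirs(model_dir, exist_ok=True)
    path = model_path(model_dir, series_id)
    with open(path, "w") as fh:
        fh.write(model_to_json(model))
    return path


def load_model(model_dir, series_id):
    """Load a previously saved model instead of fitting a new one."""
    from prophet.serialize import model_from_json

    with open(model_path(model_dir, series_id)) as fh:
        return model_from_json(fh.read())
```

The per-series function would call `save_model` after fitting during training, and call `load_model` followed by `predict` at inference time.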

  • @vigneshwart2203 3 years ago +1

    Hi Srivatsan,
    I noticed one small issue in the pandas UDF code. We should sort the values by the date column inside the pandas UDF, because when we group by and apply the pandas UDF, the rows of each time series can arrive out of order instead of in sequence.

    • @AIEngineeringLife 3 years ago +1

      Vigneshwar, Facebook Prophet takes the date as a column and orders it internally before fitting the model, so in this case it will not be an issue. If we are using a model that requires sorted input, then yes, we would have to sort inside the function before feeding it. Are you seeing an issue where the data is not getting sorted?
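
For models that do depend on row order, the sort can be done at the top of the per-series function. A minimal sketch, assuming a date column named `ds` and a hypothetical helper `prepare_series`:

```python
import pandas as pd


def prepare_series(pdf):
    """Return one series sorted by date, with a clean 0..n-1 index."""
    out = pdf.copy()
    out["ds"] = pd.to_datetime(out["ds"])
    # Rows of a group are not guaranteed to arrive in date order.
    out = out.sort_values("ds").reset_index(drop=True)
    if out["ds"].duplicated().any():
        raise ValueError("duplicate timestamps in series")
    return out
```

Calling `prepare_series(pdf)` as the first line of the grouped function costs little and removes the dependence on how Spark happened to deliver the rows.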

    • @vigneshwart2203 3 years ago +1

      @AIEngineeringLife I haven't used FB Prophet, but other stats models require the data to be sorted. Is there any way to validate, or any documentation showing, that the data is sorted when fitting FB Prophet?

    • @AIEngineeringLife 3 years ago +1

      @vigneshwart2203 You can check this issue tracker response in the FB Prophet git repo - github.com/facebook/prophet/issues/1412

  • @marcoaureliodefariaborges3362 1 year ago

    How do we get the trained models out of the Spark tasks so we can predict later without retraining?

  • @aradhnasingh2157 4 years ago

    Hello Sir, I am working on a solar energy time series (5-minute granularity) where the night-time energy values are zeros. Values only appear during the daytime. I applied Facebook Prophet models to this data, but the results are not good. If I remove those values, Prophet still doesn't give satisfactory results. Do you recommend any good time series model for such data?

    • @AIEngineeringLife 4 years ago

      Can you tell me what result you got after removing the events? Did you create the future dataframe only for the times that were available when training the model?

    • @aradhnasingh2157 4 years ago

      @AIEngineeringLife Night-time values were removed for both test and train. In the future dataframe the time periods stretched out to some weird points (I defined the periods and frequency as given for the test set).

    • @AIEngineeringLife 4 years ago

      Is the data available in the open domain so that I can try it? Do you see any pattern in the data that varies with other factors, like temperature? If it is constant or like white noise, it might not be easy to model.
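
The question in this thread about building the future dataframe only for times seen in training can be illustrated as below. When night-time rows are dropped before fitting, the future dataframe needs the same filter, otherwise Prophet extrapolates into hours it never saw. The 06:00 to 18:00 window, the 5-minute frequency, and the helper names are assumptions for illustration, not values from the commenter's data.

```python
import pandas as pd


def daytime_only(frame, start_hour=6, end_hour=18):
    """Keep rows whose ds hour falls in [start_hour, end_hour)."""
    hours = frame["ds"].dt.hour
    mask = (hours >= start_hour) & (hours < end_hour)
    return frame[mask].reset_index(drop=True)


def forecast_daytime(history, periods, freq="5min"):
    """Fit on daytime rows and predict only for daytime timestamps."""
    from prophet import Prophet  # imported lazily; only needed when fitting

    model = Prophet()
    model.fit(daytime_only(history))
    future = model.make_future_dataframe(periods=periods, freq=freq)
    # Apply the same filter to the future frame as to the training data.
    return model.predict(daytime_only(future))
```

Without the second filter, `make_future_dataframe` produces a continuous run of timestamps that includes night-time, which is one plausible cause of the stretched-out periods described above.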

  • @Chgm2010 3 years ago +1

    Where can I find this notebook to download?

    • @AIEngineeringLife 3 years ago

      Here - github.com/srivatsan88/End-to-End-Time-Series

    • @Chgm2010 3 years ago

      @AIEngineeringLife thx!