sales forecasting with Prophet (data science deep-dive project part 1)

  • Published: Jun 19, 2024
  • #30daysofdata A full end-to-end machine learning project: data processing and cleaning, time series modeling with the Prophet model, and how I think about building out ML pipelines! I go into detail about my thought process and all of the code for the Prophet time series model in a shareable Jupyter notebook, and have links below on Fourier sums, time series modeling, types of time series, and the data downloads! Join me for the 30 days of data series and learn how to think like a Data Scientist and get the right resources to learn about building your own end-to-end data science projects!
    Other videos you'll like!!!
    exactly what I do as a Data Scientist | 2.5 years of projects + roles • exactly what I do as a...
    how to learn Data Science from scratch in 2023 • how to learn Data Scie...
    real talk about my Data Scientist jobs + salary for entry level data science • real talk about my Dat...
    day in the life of a Data Scientist in Chicago • day in the life of a D...
    day in the life of a Data Scientist at a tech start-up • day in the life of a D...
    Why I Became a DATA SCIENTIST as a Physics Major • Why I Became a DATA SC...
    Jupyter Notebook Follow-Along: github.com/priyalingutla/30-D...
    Kaggle Dataset Link: www.kaggle.com/competitions/s...
    Link for Timeseries From Scratch: towardsdatascience.com/time-s....
    Prophet Documentation: facebook.github.io/prophet/do..., facebook.github.io/prophet/do...
    Timestamps:
    00:00 hello 🔅
    01:34 timeseries forecasting 📚
    02:08 deep dive 💡
    -------------------------------------------------------------------------------------------------------------------------------------------
    Welcome to my channel - College Tips From the Almost Astrophysicist! I'm Priya and I'm here to help you get into college. I'm a University of Chicago grad with an Astrophysics degree who currently works as a Data Scientist, and I want to break down the college application process and tackle all of the misconceptions about college for you! Let me know in the comments section down below if you have any video requests, or just want to say hi! :)
    LinkedIn: / priya-l-520311145
    Instagram: / plingutla

Comments • 67

  • @andracoisbored 7 months ago +1

    Looking forward to all your videos!

  • @lilcameauxx 10 months ago +10

    Having started my degree in astrophysics and then deciding about halfway through I wanted to do data science, your channel has been a gold mine! I graduate next spring with my degree in Data Science. You have been a large part of my learning and I thank you!!

    • @TheAlmostAstrophysicist 10 months ago +1

      So glad I can be a part of your journey! Sounds so similar to mine haha

  • @evedickson2496 6 months ago +4

    Fantastic video.. exactly what I've been looking for.. Will be incorporating this into our forecasting workflow.. thank you 😊.. subscribed and will be watching the full 30 days 😊

  • @edgarromeroherrera2886 6 months ago +2

    Thank you so much for this amazing video, it's so useful. Not enough words to thank you

  • @AlanGaugler 2 months ago +2

    An excellent introduction to time-series forecasting and FB Prophet, very well explained and well-written code. I will be watching many more of your videos :)

  • @michaelwallendjack911 7 months ago +2

    As a newbie to FB prophet these 2 tutorials rock! Very easy to follow along and digest. Are you planning on releasing the 3rd part of the series any time soon? Excited to watch!

  • @vinayakjadhav5553 10 months ago +2

    First of all, thank you for sharing this data science information and taking us through a real-world data science course.

  • @aj-hz2yq 10 months ago +1

    Thank you for making these

  • @LACERDAJO 3 months ago

    Hello Priya! I am a new follower of yours here and a new fan as well! Congratulations! This explanation is beautiful!

  • @WhaleJetski 11 months ago +1

    Been waiting for this series from you! Thank you!!

    • @TheAlmostAstrophysicist 11 months ago +1

      Of course! More to come with the series, thanks for following along! 😀

  • @malcomharris6642 10 months ago +6

    I majored in Mathematical Economics in undergrad and graduated in fall of 2020. I'm currently going to grad school for Data Science in healthcare analytics. This channel really helps!!

    • @TheAlmostAstrophysicist 10 months ago +1

      That's awesome - good luck on your journey and glad the channel can be a part of it!

  • @user-st1ov8bm9j 12 days ago

    thank you for sharing. This is very informative!

  • @ArmPowerWorkouts 2 months ago +1

    Fantastic density of the content.

  • @mpfiesty 4 months ago

    This is great content, thank you.

  • @ianperkins8812 11 months ago +4

    For me, your timing is absolutely spot on - I am sitting for Microsoft DP-100 in three weeks and starting a machine learning class the week after that, so THANK YOU! I can't wait for the next installment :)

  • @danymerizalde1942 5 months ago

    It is an amazing video!

  • @statisticallylaura 11 months ago +2

    The timing on this is absolute gold, this is literally the type of project I'm building as a Django app for my work right now! It's an analytics dashboard to monitor sales activity by channel and since we're dealing with a lot of seasonality there, Prophet seems like a spot-on fit for incorporating forecasts. Thank you for doing this, excited for more of the series!

    • @TheAlmostAstrophysicist 10 months ago +1

      Ahhhh this makes me so happy! Incorporating DS to solve business problems for the win!

  • @DEDE-ix9lg 11 months ago +4

    this series will be FIRE 🔥🔥🔥

  • @madhavilingutla4031 11 months ago +1

    Good One!

  • @andyberrios5572 11 months ago +1

    Middle of doing my Stats 5301 hw.. can’t wait to finish up and get into this vid!

    • @TheAlmostAstrophysicist 11 months ago +1

      Means a lot that you're following along, thank you!! Hope this helps 😀

  • @jpiantoni-5861 4 months ago

    Woow, amazing class, thank you (from Brazil)

  • @sai251180 11 months ago +1

    Thank you for this productive video! Learnt a lot!!

  • @franciscotrejo8168 10 months ago +3

    This is great! Just started the video, cool to see another time series forecasting model. I have primarily used the Nixtla forecasting libraries like Neuralforecast and Statsforecast. Excited to see another approach! Keep up the good work!

    • @franciscotrejo8168 10 months ago +1

      Just finished the video - great work! I really enjoyed how you walked through all the aspects of the code and even re-ran some cells to really help explain what is going on. Excited to see the rest of this series, keep it up!

    • @TheAlmostAstrophysicist 10 months ago +1

      Thanks for watching! That’s awesome, I’ll have to check those libraries out! I’ve primarily used Prophet because it felt so easy to use and also explain to stakeholders haha. Appreciate you keeping up with the series!

  • @gralleg9634 10 months ago

    Thanks a lot from France 👌

  • @herculesgixxer 2 months ago

    You’re amazing

  • @miguelbohorquezgranados1207 11 months ago +1

    Quality content as always!

  • @user-ej1ip3iq1o 8 days ago

    Many thanks for the super great video!
    I would like to know why you have loaded holidays, but they are not (or cannot be) used by Prophet later?

  • @albertowusu-banie154 10 months ago +1

    @TheAlmostAstrophysicist - Thanks for this. Currently working on a forecasting model and this video came in right on time.
    Looking forward to the next videos.
    I also studied Physics, by the way 😄

    • @TheAlmostAstrophysicist 10 months ago

      That’s awesome! Glad the video can help, thanks for following along! also so fun that you did physics too!

  • @andrewchen2590 11 months ago +1

    Super excited to start this!

  • @anthonyshea6048 7 months ago

    Can you please do a video on predicting discrete yes or no events in a time series using only categorical data?? That would be immensely helpful. I’m approaching feature selection with mutual information classification, but I’d like to know how you’d pipeline it!

  • @simbarashemutyambizi1360 10 months ago

    Still new to DS, but will watch your videos. I don't really understand EDA, its purpose in the end, and how to use your findings from EDA in the following steps of the DS cycle. If you could make a video on it in this series, with a simple example case study, I would appreciate it.

  • @makalamabotja4773 10 months ago

    Hi Priya, I love the video series idea. I'm currently in sales and looking to propose a sales forecasting pipeline at work as an audition to transition to a full time position. I love the video and am still trying to get my head around the coding itself.
    Keep up the good work and I look forward to more in your series

    • @makalamabotja4773 10 months ago

      I have a question related to this video series and perhaps a request. As mentioned, I'm trying to make a forecasting proposal for my workplace and would like to cover all the bases that would be applicable from a data science perspective.
      I built an RFM and CLTV customer segmentation K-means model based on e-commerce data from Kaggle and wanted to use these clusters to make forecasting predictions based on leads received and classified into the identified clusters. I will be forecasting total sales for the month using regression and wanted to know if this is something you would be doing day to day as a data scientist in a sales environment, or am I missing a step?

  • @TheMiguel710 11 months ago +1

    I am starting out in DS (around a year into it) and I am really inspired by your content. Never used prophet but will make sure to run your notebook and accompany the series! Just curious, how long does it take you to make something like this notebook? I am struggling to execute faster and was wondering if you have any tips on that?
    Great content as always!

    • @TheAlmostAstrophysicist 11 months ago

      Awesome! To make the notebook, took about I'd say 20-30 minutes since I've worked with prophet before! The hardest part was honestly finding good open source data lol. And the whole notebook takes about 20ish minutes to run if you go through the whole hypertuning cross-validation for every category of products! I have that notebook pipeline for video 2 finished!

  • @krishnarao4840 10 months ago +1

    Useful information

  • @MQ2011de 1 month ago

    Use
    df_cv = cross_validation(m, initial='365 days', period='30 days', horizon='30 days', parallel='threads')
    instead of
    df_cv = cross_validation(m, initial='365 days', period='30 days', horizon='30 days', parallel='processes')
    if you have an old computer.

  • @donndonnn 2 months ago +4

    You stopped uploading??? Nooooo

    • @isaiahindigenousaboriginal5261 24 days ago

      I know but get hEr ( side note ) she told everyone to pleAse engage. Did everyone obliGe???
      When the youth find her it’s a wrap.
      Ok I will show everyone how exciting she and this channel is. Y’all have no idea how you’re about to love learning again! Let’s gooooOoOo!

  • @BryanCoronel0303 11 months ago +1

    this is nicely in-depth, thank you! in terms of scaling, would it be best to run this as a Python script instead of notebook and automate it using something like airflow?

    • @TheAlmostAstrophysicist 11 months ago

      Thanks for watching! Absolutely! So you'd want to fully automate it as a pipeline. My second video (coming out Saturday this week) is a second full pipeline notebook, and you'd want something like that pipeline either automated as a script OR you can use a service like Databricks/something similar to schedule regular notebook runs/jobs. :)

  • @Ana-to3hi 4 months ago

    Please make more content ❤

  • @niallwhelan2648 11 months ago +1

    Great series, thanks. Just on high volume you refer to largest sales by day, but is transaction volume not more important than total daily sales? High transaction volume will give better signal than low transaction volume.

    • @TheAlmostAstrophysicist 10 months ago

      Great question! Absolutely - I think what you define as transaction volume is what I'm referring to when I say total daily sales. i.e. the higher the transactions are daily/the higher the volume, the better signal we get.
      In the video, I use the "np.mean" function across the columns to see what the average daily sales (i.e. avg. transaction volume) is. In general, lower volume (usually under $1000) leads to higher errors since it's hard to get signal. So I use >=$1000 as a cut-off.
      Does this make sense? I think we mean the same thing haha
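
A minimal sketch of the volume cut-off described in the reply above. The category names and sales numbers are made up for illustration, and `statistics.mean` stands in for the `np.mean` call mentioned in the reply; only the $1000 threshold comes from the reply itself.

```python
from statistics import mean

# Hypothetical daily sales totals (in dollars) per product category.
daily_sales = {
    "GROCERY": [5200.0, 4800.0, 5100.0, 4950.0],
    "BEAUTY": [300.0, 250.0, 410.0, 280.0],
    "AUTOMOTIVE": [900.0, 1100.0, 1000.0, 1000.0],
}

MIN_AVG_DAILY_SALES = 1000.0  # cut-off mentioned in the reply above


def high_volume_categories(sales_by_category, cutoff=MIN_AVG_DAILY_SALES):
    """Return, sorted by name, the categories whose average daily sales are >= cutoff."""
    return sorted(
        name
        for name, values in sales_by_category.items()
        if values and mean(values) >= cutoff
    )


kept = high_volume_categories(daily_sales)
print(kept)
```

Only the categories that pass the filter would then get their own forecast; the low-volume ones are set aside because their error tends to be high.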

  • @zaccanasta27 21 days ago

    Would you consider this logic valuable also for LTV calculation, where instead of categories (such as automotive, babycare, beauty ...) we have cohort months (such as Jan-23, Feb-23 ...)?

  • @user-qe9hx1uj4l 3 months ago

    Is there a way to deal with having lots of 0s in the time series? I'm currently working on a procurement forecast model. Therefore there are lots of days where procurement doesn't happen, making the y value 0 for most days. This is really affecting the model performance.

  • @Alice8000 2 months ago

    Great video. Is it ok just to leave some troll comments/questions?

  • @user-gl7vp8ne8y 10 months ago

    Thank you for this series. When I downloaded the dataset from Kaggle it didn't download right.

    • @TheAlmostAstrophysicist 10 months ago

      Hmm that's weird. I download the "train.csv" from www.kaggle.com/c/favorita-grocery-sales-forecasting and I renamed it on my desktop to "store_data.csv" Maybe that's the issue if you can't read in the data?

  • @observer698 29 days ago

    Why 1-MAPE as the accuracy metric?
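
For context on the question above, here is a minimal sketch of what a "1 - MAPE" accuracy number is. MAPE (mean absolute percentage error) averages |actual - forecast| / |actual| over the evaluation window, so 1 - MAPE can be read loosely as "percent accurate". The function names and sample numbers are hypothetical and not taken from the notebook. The zero check reflects the fact that MAPE is undefined whenever an actual value is 0.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    if not actuals or len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must be non-empty and the same length")
    if any(a == 0 for a in actuals):
        raise ValueError("MAPE is undefined when an actual value is 0")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts))
    return total / len(actuals)


def accuracy(actuals, forecasts):
    """The '1 - MAPE' figure: higher is better, 1.0 is a perfect forecast."""
    return 1.0 - mape(actuals, forecasts)


# Hypothetical daily sales and forecasts.
actuals = [100.0, 200.0, 400.0]
forecasts = [110.0, 180.0, 400.0]

print(f"MAPE: {mape(actuals, forecasts):.3f}")
print(f"1 - MAPE: {accuracy(actuals, forecasts):.3f}")
```

Because each error is divided by the actual value, small actuals inflate the metric, which is consistent with low-volume series showing higher errors.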

  • @alazaraddis7237 11 months ago

    me struggling to change my major from IS to CS🤣🤣🤣🤣🤣

  • @Derek-yf6pj 1 month ago

    I just found your channel, loved this video and subscribed. But looks like you stopped making content. Please come back, I like the way you give a background on the items discussed, like the Fourier math etc. 🫶
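
The Fourier background mentioned in the description and comments relates to how Prophet represents seasonality: as a partial Fourier sum, meaning a weighted sum of sine and cosine terms at multiples of a base frequency. The sketch below illustrates that idea in plain Python. The function names, the weekly period, the order and the coefficients are all hypothetical illustration values, not Prophet's internal code or fitted parameters.

```python
import math


def fourier_terms(t, period, order):
    """Return [sin(2*pi*1*t/P), cos(2*pi*1*t/P), ..., sin(2*pi*N*t/P), cos(2*pi*N*t/P)]."""
    terms = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        terms.append(math.sin(angle))
        terms.append(math.cos(angle))
    return terms


def seasonal_value(t, period, coeffs):
    """Evaluate the partial Fourier sum at time t; coeffs holds one weight per sin/cos term."""
    order = len(coeffs) // 2
    return sum(c * x for c, x in zip(coeffs, fourier_terms(t, period, order)))


# Hypothetical weekly seasonality (period of 7 days) with order 2, so 4 coefficients.
weekly_coeffs = [0.5, -0.2, 0.1, 0.05]
for day in range(7):
    print(day, round(seasonal_value(day, 7.0, weekly_coeffs), 4))
```

A higher order lets the seasonal curve change more quickly within a period, at the cost of a greater risk of overfitting.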