Amazing, Matt! Love the way you always frame the solution in the business context.
That’s right Salil. Everything must be focused on the business problem. That guides the way we use data - everything from which sources we use to how we wrangle and add features.
This is quite an elegant solution! As a Python user, I've got to envy the toolsets available in the R ecosystem (e.g., skimr, modeltime). I wonder if there is a good Python equivalent out there.
There isn’t anything in Python yet that solves the panel data approach. sktime covers single-series models. Note - I created Modeltime based on the Time Series Competitions, implementing their winning strategies, so I’ve been doing a ton of research. Python is very far behind on this currently. Hopefully it will improve.
Since xgboost provides a function called 'slice' too, it makes sense to reference functions with their package names... dplyr::slice(1:10)
Absolutely - This is one of the reasons that I load tidyverse last. It's a common gotcha! Thanks for pointing this out.
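The masking described above can be reproduced without either package installed. This is a minimal sketch in base R: the user-defined `filter` helper is hypothetical and only stands in for whichever function ends up hiding the one you meant to call. The `pkg::fun` form works the same way for `dplyr::slice`.

```r
# Base-R illustration of name masking (no extra packages needed).
x <- c(1, 2, 3, 4, 5)

# Hypothetical helper that happens to reuse the name `filter`.
# From now on the bare name `filter` resolves to this function,
# not to stats::filter.
filter <- function(x, keep) x[keep]

# Bare call: uses the helper defined above.
kept <- filter(x, x > 2)

# Namespaced call: always reaches the stats version (a moving average here).
smoothed <- stats::filter(x, rep(1 / 3, 3))

print(kept)
print(as.numeric(smoothed))
```

Calling `conflicts()` in an interactive session lists the names that are currently masked on the search path.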
In the recipe you did some dummy encoding. I read in the lightgbm and catboost manuals that you should avoid this because, unlike with xgboost, it makes those models slower and can confuse them.
You may be right. I’ve heard this with catboost.
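To make the difference concrete, here is a small base-R sketch with made-up data. It only contrasts the two encodings; whether a given library then treats the single code column as categorical depends on how the column is declared to that library, so check the lightgbm or catboost documentation for the exact argument.

```r
# Made-up example data: one categorical column and a numeric target.
df <- data.frame(
  store = factor(c("A", "B", "C", "A", "B")),
  sales = c(10, 12, 9, 11, 13)
)

# Dummy (one-hot) encoding: one 0/1 column per factor level.
one_hot <- model.matrix(~ store - 1, data = df)

# Alternative: keep a single column of integer level codes.
store_code <- as.integer(df$store)

print(colnames(one_hot))
print(store_code)
```

With many levels, the one-hot matrix grows by one column per level, while the coded version stays a single column.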
Is there anywhere I can access the full notebook of this code? Amazing job.
Yes, join my learning labs pro program. You get access to 76 labs with more coming. And you get support too. university.business-science.io/p/learning-labs-pro
awesome tutorial...
Thank you!!
Hey Matt, does this approach help to tackle newly introduced products, those with very limited history?
It can help. Global modeling will take into account other time series. It’s also helpful to flag promotions during a product launch.
Where can I find the links to the other videos in the Tidymodels series?
Learning Labs Pro is what I recommend. We have 60+ labs including 5 on tidymodels. Plus new labs released every 2 weeks. university.business-science.io/p/learning-labs-pro
Hello. Is there any way to populate the lag columns with predicted values in the future data frame?
Thanks for the tutorials, brothers. I want to use LightGBM regression to predict cobalt recovery, and I want to do it in R. How do I go about it?
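One common way to do this is recursive forecasting: predict one step, write that prediction into the lag column for the next step, and repeat. The sketch below uses base R with a made-up series and a simple one-lag linear model as a stand-in; it is not the code from the video, and all names are illustrative.

```r
# Made-up series and a one-lag training frame.
y <- c(10, 12, 13, 15, 16, 18, 19, 21)
train <- data.frame(y = y[-1], lag1 = y[-length(y)])

# Stand-in model: linear regression on the previous value.
fit <- lm(y ~ lag1, data = train)

horizon <- 3
preds <- numeric(horizon)
last_value <- y[length(y)]  # last observed value seeds the first lag

for (h in seq_len(horizon)) {
  # The lag for step h is the observed value (h = 1) or the
  # prediction from the previous step (h > 1).
  preds[h] <- predict(fit, newdata = data.frame(lag1 = last_value))
  last_value <- preds[h]
}

print(preds)
```

Errors compound over the horizon with this approach, since each prediction feeds the next one.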
Learn R. You’ll love it. mailchi.mp/business-science/rtrack-master-class-signup-3
I think this approach does not ensure that the parts add up to the whole, does it?
It does not.
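A small base-R sketch with made-up numbers shows the gap, along with the simplest fix (bottom-up aggregation, where the total is defined as the sum of the item forecasts). This is an illustration only, not something covered in the video.

```r
# Made-up item-level forecasts for three items over two periods.
item_fc <- data.frame(
  period = c(1, 1, 1, 2, 2, 2),
  item   = c("a", "b", "c", "a", "b", "c"),
  fc     = c(5, 3, 2, 6, 4, 1)
)

# A separately produced forecast of the total (also made up).
total_fc <- c(11, 10)

# Bottom-up: define the total as the sum of the parts per period.
bottom_up <- tapply(item_fc$fc, item_fc$period, sum)

# Difference between the independent total and the summed parts.
gap <- total_fc - as.numeric(bottom_up)

print(bottom_up)
print(gap)
```

A non-zero gap means the independent total and the item forecasts disagree; using the bottom-up total removes the disagreement by construction.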
RStudio + R (latest versions) are crashing while in train_model(), step: fit(train). (Win7)
I need more info than this. I recommend posting a Stack Overflow question with a minimal reproducible example.
@BusinessScience The example is your code. I added nothing new to it; I simply wrote it down from the video. Same packages, same data, no modification. Since RStudio crashes, there is no real error trace available either. The only possible difference is the R environment and OS.
Yes, but your data may be different, and your environment as well. Since I cannot reproduce it, I can only speculate that something is amiss with the environment. I’d try installing the latest versions of all software, make sure you have Rtools since you are on Windows, and see if there are any errors during the software updates.
Several students have had issues with CatBoost, so that could also be the cause. The solution was to install the previous version: catboost.ai/docs/concepts/r-installation.html
I also note that it requires 64-bit R, which may be incompatible with your OS.
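A quick way to check these points from the R console, using base R only (the variable name `is_64bit` is just illustrative):

```r
# Pointer size is 8 bytes on a 64-bit build of R, 4 on a 32-bit build.
is_64bit <- .Machine$sizeof.pointer == 8

cat("R version:", R.version.string, "\n")
cat("Architecture:", R.version$arch, "\n")
cat("64-bit build:", is_64bit, "\n")

# Lists the OS, locale, and loaded package versions;
# useful to paste into a bug report.
print(sessionInfo())
```

Including this output in a minimal reproducible example makes environment differences much easier to spot.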