Hey guys, I hope you enjoyed the video! If you did, please subscribe to the channel!
Join our Data Science Discord Here: discord.com/invite/F7dxbvHUhg
If you want to watch a full course on Machine Learning check out Datacamp: datacamp.pxf.io/XYD7Qg
Want to solve Python data interview questions: stratascratch.com/?via=ryan
I'm also open to freelance data projects. Hit me up at ryannolandata@gmail.com
*Both Datacamp and Stratascratch are affiliate links.
I can't see the right part of the code, so some terms aren't readable. Please provide the GitHub link, or else make your recording screen larger.
Btw, great videos!
Thanks for the feedback. I plan on doing a bulk upload of video files to GitHub in the near future!
Want to grab the code? I have an article here: ryannolandata.com/optuna-hyperparameter-tuning/
I found out about Optuna while working on a Kaggle competition. This video will help me a lot in Kaggle competitions. Thanks a lot Ryan 👍💯
No problem, that’s where I found it also
You mean you just copy-pasted someone else's code into the code tab, right? Stop the BS.
@@richardgibson1872 ? I have two screens and prep the code for each video
@@RyanAndMattDataScience Hi, I'm actually facing a "no space left on device" issue while tuning a Prophet model, like below:
OSError: [Errno 28] No space left on device: '/tmp/tmp_m3laolf/prophet_model7a78vylx'
Some internal tmp folder keeps filling up. Can you help a little if possible?
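One possible workaround (not from the video, just a sketch): Prophet writes its intermediate model files through Python's temp-file machinery, so you can point that at a disk with more free space before running the tuning loop. The path below is a placeholder.

```python
import os
import tempfile

# Redirect Python's temp directory BEFORE running the tuning loop, so
# Prophet's intermediate model files land on a partition with room,
# instead of filling up a small /tmp.
# "/path/with/space" is a placeholder -- use a real directory.
os.environ["TMPDIR"] = "/path/with/space"
tempfile.tempdir = "/path/with/space"
```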
I enjoyed your video, but it looks as if you're misunderstanding the point of the `random_state` parameter. It allows for reproducibility: if you put in a value 'x' for `random_state`, you'll always get the same output. That's useful for testing.
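A minimal illustration of what that means in practice (my own sketch, not code from the video):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Same random_state -> identical split on every run, so results are
# reproducible and experiments stay comparable across reruns.
X_train_a, _, _, _ = train_test_split(X, y, random_state=42)
X_train_b, _, _, _ = train_test_split(X, y, random_state=42)
assert (X_train_a == X_train_b).all()
```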
I gotta give Optuna a try. I usually just use GridSearchCV or RandomizedSearchCV for hyperparameter tuning.
That’s what I used in the past until I discovered optuna
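For anyone reading along, a minimal Optuna setup looks roughly like this (a sketch with a toy dataset and illustrative ranges, not the video's exact code):

```python
import optuna
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

def objective(trial):
    # Optuna samples each hyperparameter per trial instead of
    # exhaustively walking a fixed grid like GridSearchCV does.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 500),
        "max_depth": trial.suggest_int("max_depth", 3, 30),
    }
    model = RandomForestRegressor(**params, random_state=42)
    return cross_val_score(model, X, y, cv=5, scoring="r2").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```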
one of a kind
Thanks
Sir, I don't know why you didn't share the code after making the learning project. Please share it, sir.
Q1) At the end of the video you didn't get the expected results. So what does the production code really look like? Do you do any kind of hyperparameter tuning, or do you rely on your knowledge of the different parameters and your intuition to tune them better?
Could you please share your knowledge with respect to production-level code, sir?
Thank you for honestly sharing the results! Could it be that train_test_split accidentally created a split with an unbalanced target? Another reason for the worse OOS result I can think of is optimizing for the mean CV score without taking its variance into account. The third one I suspect is a missing sensitivity study: we found a peak on the train set, but maybe in its vicinity there were only valleys or even cliffs (we averaged across data splits but not across neighboring parameter values). And the last option is the simple absence of early stopping: the final model may simply be overfit. Going to recreate your example and find out :)
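A quick way to check the first hypothesis (assuming y_train/y_test are the targets from the video's split):

```python
import numpy as np

# Compare target distributions between the two halves of the split.
# A large gap in mean/std would support the "unbalanced split" theory.
for name, y_part in [("train", y_train), ("test", y_test)]:
    print(f"{name}: n={len(y_part)}, mean={np.mean(y_part):.2f}, "
          f"std={np.std(y_part):.2f}")
```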
Let me know, I'm definitely curious. I didn't really look into it too much, as I wanted to show how to use Optuna more than how to build a great model.
@@RyanAndMattDataScience My conclusion is that what we observed is due to the underlying estimator's (RandomForest's) intrinsic random behavior plus the small dataset size (under 300 records). Apparently cv=5 is not enough to compensate for that. I wrote a function to fit a forest (without a random seed) n times on your exact train/test split, and it seems that only after n=50 do the average scores stop jumping back and forth by more than 1 percent. So, while everything I said above might still hold (I only checked the train/test data distributions; there is a skew in the histograms, but not a terrible one), solutions for this demo could be:
1) As a quick fix for the demo: use a fixed random seed for the estimator inside objective() and for the final model too, not only for the reference model. You did a good job specifying a random seed for train_test_split and the first model (I always forget that), but you still missed providing seeds to the cv (an implicit None was used) and to the model inside the objective function, so unfortunately there was no full reproducibility.
2) More realistic for production: use, say, cv=ShuffleSplit(test_size=0.5, n_splits=50). It takes longer, but you run a much lower risk of worse pointwise OOS estimates like in the video :-) There's a sketch of both fixes after the notes below.
A few more notes.
1) Tuning n_estimators for forests makes little sense, as higher values will always give better (or at least not worse) scores. Adding n_estimators to the tuning set can make sense if you penalize pure ML scores by runtime, though.
2) Choosing RandomSampler for the first demo is odd, since the default TPE sampler is the one that's actually intelligent; otherwise, what's the reason for using anything beyond sklearn's RandomizedSearchCV in the first place? I would expect Random vs. TPE to always be worse on average.
3) If you're not going to modify the found best params, instantiating the model as final_model = RandomForestRegressor(**study.best_params) is a cleaner approach, and you won't forget any params.
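Putting these points together, a rough sketch of what I mean (my own code, not Ryan's; X_train/y_train are assumed to come from the video's split, and the ranges are illustrative):

```python
import optuna
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import ShuffleSplit, cross_val_score

SEED = 42  # one seed everywhere: sampler, CV splitter, and every estimator

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "max_depth": trial.suggest_int("max_depth", 3, 50),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 10),
        "min_samples_leaf": trial.suggest_int("min_samples_leaf", 1, 10),
    }
    model = RandomForestRegressor(**params, random_state=SEED)
    # Seeded ShuffleSplit with many repeats smooths out the noise that a
    # single cv=5 leaves in on a ~300-row dataset.
    cv = ShuffleSplit(test_size=0.5, n_splits=50, random_state=SEED)
    return cross_val_score(model, X_train, y_train, cv=cv, scoring="r2").mean()

study = optuna.create_study(
    direction="maximize",
    sampler=optuna.samplers.TPESampler(seed=SEED),
)
study.optimize(objective, n_trials=100)

# Unpack best_params directly so no tuned parameter is forgotten.
final_model = RandomForestRegressor(**study.best_params, random_state=SEED)
final_model.fit(X_train, y_train)
```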
@@RyanAndMattDataScience It's funny. I repeated the optimization with your seeds and sampler but used cv=ShuffleSplit(test_size=0.5, n_splits=50). It took 30 minutes, but the params {'n_estimators': 940, 'max_depth': 45, 'min_samples_split': 3, 'min_samples_leaf': 1} still won! So the histogram skew between train and test remains the most probable explanation for now, without excluding the other two (cliffs / no smoothness in the objective function near the best params, and no early stopping).
It might be that random forest models are so well studied that some hyperparameters are just known to work well most of the time. I'm sure someone has studied that in depth.
Where can I get the code files?
I plan on doing a dump on GitHub of all my files in the future so stay tuned!
Well presented, but the performance is poorer than your baseline, correct?
Correct
Great video! Could you explain more about the hyperparameter importance here and what kind of insights you can learn from it?
Thanks! I'm not sure if I covered that in my other hyperparameter tuning video.
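For anyone who wants the actual numbers, Optuna can compute them from a finished study (a minimal sketch, assuming `study` is an already-optimized optuna.Study):

```python
import optuna

# Importance scores estimate how much each hyperparameter explained the
# variation in the objective value across the study's trials.
importances = optuna.importance.get_param_importances(study)
for name, score in importances.items():
    print(f"{name}: {score:.3f}")

# Or as an interactive chart (requires plotly):
# optuna.visualization.plot_param_importances(study).show()
```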
Good work. Why is Optuna better than GridSearchCV or RandomizedSearchCV?
It tries to home in on the exact best answer. Use whatever works best for you, but in Kaggle comps a lot of people use Optuna.
@@RyanAndMattDataScience If I want to publish my ML project to the Play Store or a website, which one would be best?
Great content, thanks!
Thank you
Hey man, been following you for a while. Big fan!
Thank you! I have a new project video coming out soon, but I'm behind on next week's videos.
XGBoost is working better
Nice
you should zoom more
I do in newer videos
@@RyanAndMattDataScience It was just irony :D, it would be better if you could zoom out 👍
Thanks a lot!
No problem
Zero performance
😂😂