Wow, I stumbled across your lectures and specifically had to look them up again, since I think you do a fantastic job of explaining things, and now I find out that you are behind mlxtend. I LOVE the sequential feature selector you created. It has literally been a game changer in some of my projects. Big fan here!
The insight into what models are currently popular in the industry is really helpful! I've been using logistic regression for almost everything because I like seeing the probabilities, but it sounds like I should be looking into Gradient Boosting and XGBoost. I've been reading your Python Machine Learning book in preparation for graduate school this fall, and it's helped me a lot. Thank you!
Logistic regression is a great baseline model. However, yeah, it has the limitation that it is restricted to a linear decision boundary, so I would definitely also consider Random Forests as another baseline (they require basically zero tuning) and then gradient boosting to improve predictive performance. Btw, you can also get probabilities from tree-based ensemble methods. You may want to calibrate them, though: scikit-learn.org/stable/modules/calibration.html
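In case it helps anyone reading along, here is a minimal sketch of that calibration step in scikit-learn. The synthetic dataset, the hyperparameters, and the choice of isotonic regression are illustrative assumptions, not anything prescribed in the comment above:

```python
# Minimal sketch: calibrating random-forest probabilities with
# scikit-learn's CalibratedClassifierCV. Dataset and hyperparameters
# are illustrative placeholders.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV

X, y = make_classification(n_samples=1000, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

# Uncalibrated baseline: a random forest with essentially no tuning
rf = RandomForestClassifier(n_estimators=100, random_state=1)
rf.fit(X_train, y_train)
uncalibrated_proba = rf.predict_proba(X_test)[:, 1]

# Wrap the forest in a calibrator; "isotonic" fits a monotonic mapping
# on held-out folds ("sigmoid" would use Platt scaling instead)
calibrated_rf = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=100, random_state=1),
    method="isotonic",
    cv=5,
)
calibrated_rf.fit(X_train, y_train)
calibrated_proba = calibrated_rf.predict_proba(X_test)[:, 1]
```

Note that isotonic calibration can overfit on small datasets; method="sigmoid" is the usual fallback there.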
@SebastianRaschka Thank you so much!