Your "messy/crowdy" slides (your words, not mine) work extremely well - to break down and explain all the building blocks (even the simpler ones which give the brain some pause) in this pace is a very effective (Feynman style) way to teach - super!
Thank you, very well explained. I got how the nodes will be split for categorical features since the choices are discrete, but for a continuous feature with a range [1,10], how did we arrive at 5.5 in your example? From the plot it is very easy to understand visually, but how does the algorithm arrive at 5.5? Did the algorithm split x1 at all possible values in the range [1,10] and calculate the entropy, or does it start with a random guess and go from there (maybe the median as a starting point)? Also, how does the algorithm choose the step size for evaluating the continuous feature, e.g., did it calculate the entropy with a step size of 0.5? Why not 0.05? I don't remember specifying a step size in sklearn. Also, if I have 100 features, it will evaluate the entropy for all 100 features at various ranges/discrete values for the parent node, repeat the process for the child nodes, and keep going until a stopping criterion is met, correct? Sounds very computationally intensive, but in reality it's very fast even for very large datasets.
Good questions! To divide up the continuous feature, there is a lot of code involved. I think it's mainly like you said: finding the median of the sorted feature and going from there (github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_splitter.pyx). Look for the BestSplitter class. Unfortunately, I don't know the interval size. "Also, if I have 100 features, it will evaluate the entropy for all 100 features at various ranges/discrete values for the parent node, repeat the process for the child nodes, and keep going until a stopping criterion is met, correct?" --> Yes, that's right! Even with binary splits, this is already very expensive.
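In case it helps, here is a minimal Python sketch of the standard CART-style threshold search. This is not scikit-learn's actual Cython implementation, just my own illustration (the function names are mine): the feature is sorted, and the only candidate thresholds are the midpoints between consecutive distinct values. That is where a 5.5 can come from when 5 and 6 are adjacent values, and it also answers the step-size question: there is no step size, because no threshold strictly between two adjacent sorted values can produce a different split.

```python
import numpy as np

def entropy(y):
    """Shannon entropy of the class labels in y."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_threshold(x, y):
    """Exhaustive best binary split on one continuous feature.

    Candidate thresholds are the midpoints between consecutive
    distinct sorted values, so no step size is ever needed.
    """
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    parent = entropy(y_sorted)
    n = len(y_sorted)
    best_gain, best_t = 0.0, None
    for i in range(1, n):
        if x_sorted[i] == x_sorted[i - 1]:
            continue  # identical values cannot be separated by any threshold
        t = (x_sorted[i] + x_sorted[i - 1]) / 2.0  # midpoint candidate
        left, right = y_sorted[:i], y_sorted[i:]
        # information gain = parent entropy - weighted child entropies
        gain = (parent
                - (len(left) / n) * entropy(left)
                - (len(right) / n) * entropy(right))
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain

# Example: feature values 1..10, class flips between 5 and 6
x = np.arange(1, 11, dtype=float)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
print(best_threshold(x, y))  # best threshold 5.5, information gain 1.0
```

This also suggests why the search stays fast: after an O(n log n) sort there are at most n-1 candidate midpoints per feature, so scanning all thresholds for one feature is linear, repeated once per feature per node.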
@SebastianRaschka Thank you for taking the time to answer. You mention a homework assignment in the lecture. Is that available somewhere, or is it only for students of UW-Madison?
Your "messy/crowdy" slides (your words, not mine) work extremely well - to break down and explain all the building blocks (even the simpler ones which give the brain some pause) in this pace is a very effective (Feynman style) way to teach - super!
Thank you, very well explained. I got how the nodes will be split for a categorical features since the choices are discrete but for a continuous feature, with a range [1,10] how did we arrive at 5.5 in your example? From the plot it is very easy to understand visually but how does the algorithm arrive at no 5.5? Did the algorithm split x1 at all possible values in the range [1,10], calculate the entropy or perhaps does it start with a random guess and go from there (may be median as a starting point)? also how does the algorithm choose step size for the continuous feature to evaluate, e..g did it calculate the entropy with a step size of 0.5? why not 0.05? I don't remember specifying the step size in sklearn for evaluation. Also, if I have 100 features, it will evaluate the entropy for all 100 features at various ranges/discrete values for the parent node, repeat the process for the child nodes and keeps going on until stopping criteria is met, correct? Sounds very computationally intensive but in reality it's very fast even for very large datasets.
Good questions! To divide up the continuous feature, there is a lot of code involved. I think it's mainly like you said: finding the median of the sorted feature and go from there (github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_splitter.pyx). Look for the BestSplitter class. Unfortunately, I don't know the interval size. " Also, if I have 100 features, it will evaluate the entropy for all 100 features at various ranges/discrete values for the parent node, repeat the process for the child nodes and keeps going on until stopping criteria is met, correct" --> yes, that's right! Even with binary splits, this is already very expensive.
@@SebastianRaschka Thank you for taking time to answer. You mention homework assignment in the lecture, is that available somewhere or is that only for students of UW Madison?
@SandeepPawar1 It's not available yet. I will share it on GitHub (probably next week, after the midterm exam).