6.1 Intro to Decision Trees (L06: Decision Trees)

  • Published: Dec 23, 2024

Comments • 5

  • @mariia.morello • 1 year ago

    Hi Sebastian, thanks for the video. It would also be great if you could have shown the features you were working with, or just what the data looked like, after introducing the problem you were trying to solve.

  • @sinanstat • 4 years ago

    21:33 This is a great example. But why not cluster analysis, logistic regression, or Q-factor analysis (it's not famous anyway)?
    One advantage I see here is that we don't have to worry about the assumptions, because it is nonparametric. The other advantage might be that it's super fast.
    My exact question is: what is the selling point of decision trees or random forests?

    • @SebastianRaschka • 4 years ago +10

      We didn't do cluster analysis because we had class label information, so it was a supervised problem. Logistic regression was one of the models we tried, though. The logistic regression coefficients (and also doing logistic regression with sequential feature selection) led to the same conclusion as the decision tree (this slide shows the logistic regression + feature selection results: speakerdeck.com/rasbt/building-hypothesis-driven-virtual-screening-pipelines-for-millions-of-molecules-at-odsc-west-2017?slide=44). The decision tree was easy to explain to non-machine-learning people (my collaborators), which was nice.
      -> selling point of decision trees or random forests?
      Decision tree: it's easy to understand the model behavior for short trees. Random forest: good performance out of the box without hyperparameter tuning.
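
      To make those two selling points concrete, here is a minimal scikit-learn sketch (not from the lecture; the dataset and parameter choices are illustrative assumptions): a short, depth-limited tree whose rules can be printed and handed to non-ML collaborators, and a random forest trained with default hyperparameters.

      ```python
      # Minimal sketch: interpretable short tree vs. out-of-the-box random forest.
      # Dataset (breast cancer) and max_depth=3 are illustrative choices only.
      from sklearn.datasets import load_breast_cancer
      from sklearn.model_selection import train_test_split
      from sklearn.tree import DecisionTreeClassifier, export_text
      from sklearn.ensemble import RandomForestClassifier

      X, y = load_breast_cancer(return_X_y=True, as_frame=True)
      X_train, X_test, y_train, y_test = train_test_split(
          X, y, test_size=0.3, stratify=y, random_state=123)

      # Short tree: the printed if/else rules are easy to explain.
      tree = DecisionTreeClassifier(max_depth=3, random_state=123)
      tree.fit(X_train, y_train)
      print(export_text(tree, feature_names=list(X.columns)))
      print("Tree accuracy:", tree.score(X_test, y_test))

      # Random forest with default settings: usually competitive without tuning.
      forest = RandomForestClassifier(random_state=123)
      forest.fit(X_train, y_train)
      print("Forest accuracy:", forest.score(X_test, y_test))
      ```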

  • @hikmetsezen155 • 4 years ago

    Hi Sebastian, in the published graph of your decision tree for the chemical molecule, some of the leaf nodes have nonzero Gini values. Does that mean the decision tree is truncated? Thanks!

    • @SebastianRaschka • 4 years ago +2

      Yes, I think this was a truncated tree. However, you can also have nonzero Gini values even if you don't truncate. This happens if you have examples that have exactly the same feature values but different target values.
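
      A small sketch (my own toy example, not the published figure) showing both ways a leaf can end up with nonzero Gini impurity: limiting the tree depth ("truncating" it), and duplicate feature vectors with different labels, which no split can separate even in a fully grown tree.

      ```python
      # Illustrative only: random data and a 4-row toy dataset, not the molecule data.
      import numpy as np
      from sklearn.tree import DecisionTreeClassifier

      # Case 1: depth-limited (truncated) tree -> leaves are usually still impure.
      rng = np.random.RandomState(0)
      X = rng.rand(200, 2)
      y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
      truncated = DecisionTreeClassifier(max_depth=1).fit(X, y)
      is_leaf = truncated.tree_.children_left == -1
      print("Truncated tree leaf impurities:", truncated.tree_.impurity[is_leaf])

      # Case 2: identical feature values with different targets -> an impure leaf
      # even without any depth limit, because no further split is possible.
      X_dup = np.array([[0.0], [0.0], [1.0], [1.0]])
      y_dup = np.array([0, 1, 0, 0])
      full = DecisionTreeClassifier().fit(X_dup, y_dup)
      is_leaf = full.tree_.children_left == -1
      print("Fully grown tree leaf impurities:", full.tree_.impurity[is_leaf])
      ```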