The Most Important Integral in Data Science

  • Published: 30 Sep 2024

Comments • 13

  • @hannahnelson4569 3 months ago +1

    No, this does not make sense. You do not want a high FPR'. It is impossible to have a generally high FPR', because the integral of FPR'(t) from 0 to 1 is the total change in FPR(t), which is exactly one in magnitude. The concept of a generally higher FPR'(t) does not make sense.
    The other bit about the high TPR(t) is fine. But the FPR' part does not track.
    In addition, relating FPR' to the two models' AUC seemed to be just blatantly wrong. It is possible for two models with exactly the same FPR(t) to have different AUC values if their TPR(t) are different. So the differential equation described cannot be true.
    Perhaps I am missing something, but this explanation of AUC seems to get the calculus wrong. I am dubious of any conclusions drawn from it.
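The two points in this comment can be checked numerically. The sketch below is not taken from the video. It assumes a hypothetical binormal model (negative scores drawn from N(0, 1), positive scores from N(mu, 1)), so the threshold t runs over the real line rather than over [0, 1]. Under that assumption, -FPR'(t) is the density of the negative scores, it integrates to one for every model, and AUC is the integral of TPR(t) · (-FPR'(t)). The names `rates`, `auc`, and `trapezoid` are made up for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def rates(mu, t):
    """Assumed binormal model: negatives ~ N(0, 1), positives ~ N(mu, 1)."""
    fpr = 1.0 - norm_cdf(t)        # share of negatives scored above t
    tpr = 1.0 - norm_cdf(t - mu)   # share of positives scored above t
    return fpr, tpr

def trapezoid(ys, xs):
    """Plain trapezoidal rule."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

# Threshold grid wide enough to cover both score distributions.
ts = [-10.0 + 20.0 * i / 20000 for i in range(20001)]

# -FPR'(t) equals the negative-class density, so its integral is the
# total change in FPR: one, no matter which model we look at.
total_change = trapezoid([norm_pdf(t) for t in ts], ts)

def auc(mu):
    """AUC as the integral of TPR(t) * (-FPR'(t)) over t."""
    return trapezoid([rates(mu, t)[1] * norm_pdf(t) for t in ts], ts)

# Both models share FPR(t) (same negatives) but differ in TPR(t).
auc_weak = auc(0.5)
auc_strong = auc(2.0)
print(total_change, auc_weak, auc_strong)
```

In this toy setting the two models have identical FPR(t) and identical FPR'(t), yet their AUC values differ, which is the situation the comment describes.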

  • @tantzer6113 4 months ago +1

    I am wondering about binary classification when there are multiple variables/parameters.

    • @falkstankat6511 4 months ago

      TPR and FPR are independent of the number of input variables. They are metrics calculated from the model's output (the predicted probability or score).
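A minimal sketch of that point, assuming a hypothetical three-feature linear scorer (the weights, data, and the helper `tpr_fpr` are invented for illustration): once each example has been reduced to a single score, TPR and FPR only look at the scores and the labels.

```python
def tpr_fpr(scores, labels, threshold):
    """TPR and FPR at one threshold, using only scores and 0/1 labels."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg

def score(x, w=(0.8, -0.5, 0.3)):
    """Hypothetical linear model: any number of features, one output score."""
    return sum(wi * xi for wi, xi in zip(w, x))

# Toy data with three features per example (made up).
X = [(1.0, 0.2, 0.5), (0.1, 0.9, 0.3), (0.7, 0.1, 0.9), (0.2, 0.8, 0.1)]
y = [1, 0, 1, 0]

scores = [score(x) for x in X]
print(tpr_fpr(scores, y, 0.0))
print(tpr_fpr(scores, y, 0.8))
```

The same `tpr_fpr` function would work unchanged for a model with one feature or a thousand.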

  • @user-km4pi4pc5b 4 months ago

    Hello, I simplified the integral further and found the result delta R^2, which shows that the value of the integral does not depend on the curve. What is the implication of this?

  • @ccuuttww 4 months ago +2

    However, most integrals in ML can only be estimated by sampling.

    • @buumschakalaka4425 4 months ago

      Can you give an example from the ML context where an integral has to be estimated by sampling? I'd like to check that out and learn about it 💪

    • @sailingintosunshine 4 months ago +2

      For example, any time you want to compute an expected value (which mathematically is an integral) that doesn't have a convenient, practical, or feasible closed-form solution. You can instead draw samples of the random variable from its probability distribution, evaluate the term inside the expectation at each sample, and average the results. This is the Monte Carlo method: it approximates the integral by rewriting it as an expectation.
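A minimal sketch of the Monte Carlo idea described in this reply, using a toy case chosen so the answer is known: for X ~ N(0, 1), E[X^2] = 1. The helper name `monte_carlo_expectation`, the seed, and the sample size are arbitrary choices for illustration.

```python
import random

def monte_carlo_expectation(f, sampler, n):
    """Approximate E[f(X)] by averaging f over n draws from sampler()."""
    return sum(f(sampler()) for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so the run is repeatable

# E[X^2] for X ~ N(0, 1) is the integral of x^2 * pdf(x) dx, which equals 1.
estimate = monte_carlo_expectation(
    f=lambda x: x * x,
    sampler=lambda: rng.gauss(0.0, 1.0),
    n=200_000,
)
print(estimate)  # close to 1, with sampling noise
```

The error of such an estimate shrinks roughly like 1/sqrt(n), so more samples buy accuracy slowly.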

    • @sailingintosunshine 4 months ago +2

      A more specific example to look at is the variational autoencoder.

    • @ccuuttww 4 months ago +1

      It is mostly the marginal likelihood. If you want to study that kind of topic, you can start with Gibbs sampling or Gaussian processes.

  • @djangoworldwide7925 4 months ago

    I would argue that the integral of the normal distribution is more fundamental for data science.

  • @CC21200 4 months ago

    I'm sure ROCs have their uses, but I think their usefulness in medicine is usually overrated.

    • @iSJ9y217 4 months ago

      May I ask what makes you think so? We work with it in medical applications too. The only downside I have noticed is that ROC-AUC is not a good metric for cross-validation when training classifiers on imbalanced data sets, because in that case ROC-AUC gives an overly optimistic estimate on the training data. Apart from this, AUC looks fine.
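One common way to illustrate why ROC-AUC can look optimistic on imbalanced data is sketched below. It is a toy example with made-up scores, not the commenter's own setup, and it does not model the training-versus-validation gap the comment mentions. It assumes the negative class is simply replicated 50 times: ROC-AUC is unchanged, while precision at a fixed threshold collapses. The helper names `roc_auc` and `precision` are hypothetical.

```python
def roc_auc(pos, neg):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def precision(pos, neg, threshold):
    """Precision when everything scored >= threshold is called positive."""
    tp = sum(1 for p in pos if p >= threshold)
    fp = sum(1 for n in neg if n >= threshold)
    return tp / (tp + fp)

# Made-up scores for a small, roughly balanced data set.
pos = [0.9, 0.8, 0.7, 0.4]
neg = [0.6, 0.5, 0.3, 0.2, 0.1]

# Same score distribution, but with 50 times as many negatives.
neg_imbalanced = neg * 50

auc_balanced = roc_auc(pos, neg)
auc_imbalanced = roc_auc(pos, neg_imbalanced)
prec_balanced = precision(pos, neg, 0.45)
prec_imbalanced = precision(pos, neg_imbalanced, 0.45)
print(auc_balanced, auc_imbalanced, prec_balanced, prec_imbalanced)
```

Here the AUC is the same in both cases, while precision falls from 3/5 to 3/103, so a metric that depends on class balance tells a different story from ROC-AUC.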

  • @atifdai313 3 months ago

    Great work... easy way of teaching.