Shapley Values for Machine Learning

  • Published: 10 Jan 2025

Comments •

  • @adataodyssey
    @adataodyssey  11 months ago

    *NOTE*: You will now get the XAI course for free if you sign up (not the SHAP course)
    SHAP course: adataodyssey.com/courses/shap-with-python/
    XAI course: adataodyssey.com/courses/xai-with-python/
    Newsletter signup: mailchi.mp/40909011987b/signup

  • @OfficialSnowsix
    @OfficialSnowsix 1 year ago +6

    These are some of the best data science tutorials I've seen on RUclips. Don't give up, keep making them. I know you'll make it big =)

    • @adataodyssey
      @adataodyssey  1 year ago +2

      Thank you so much! I really appreciate the support :)

  • @lleger
    @lleger 4 months ago

    Starting an internship in September where the goal is to explain a lot of models with SHAP; this channel is a gold mine.

    • @adataodyssey
      @adataodyssey  4 months ago

      Hi Louis, I am glad I could help. Good luck with your internship!

  • @ericafontana4020
    @ericafontana4020 1 year ago +2

    Nice explanation! :)

  • @NeverHadMakingsOfAVarsityAthle

    At 5:27 you mention the formula for calculating val_x(S). Don't we also need to subtract E_X[f(X)] from that?

    • @adataodyssey
      @adataodyssey  1 year ago +1

      You can, but they will cancel out when you subtract val(S) from val(S ∪ {i})

    • @NeverHadMakingsOfAVarsityAthle
      @NeverHadMakingsOfAVarsityAthle 1 year ago

      Aaaaah, of course, that makes sense! Thanks, you helped me a lot, not only with this comment but with the entire video series :)

    • @adataodyssey
      @adataodyssey  1 year ago +1

      @@NeverHadMakingsOfAVarsityAthle No problem, Matthias! I'm glad I could help :)
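
The cancellation described in this thread can be written out explicitly. This is a hedged sketch with notation approximated from the video: val_x(S) is the value of coalition S for instance x, and E_X[f(X)] is the mean prediction.

```latex
\begin{aligned}
\operatorname{val}_x(S) &= \mathbb{E}\left[f(X) \mid X_S = x_S\right] - \mathbb{E}_X\left[f(X)\right]\\[4pt]
\operatorname{val}_x(S \cup \{i\}) - \operatorname{val}_x(S)
  &= \left(\mathbb{E}\left[f(X) \mid X_{S \cup \{i\}} = x_{S \cup \{i\}}\right] - \mathbb{E}_X\left[f(X)\right]\right)\\
  &\quad - \left(\mathbb{E}\left[f(X) \mid X_S = x_S\right] - \mathbb{E}_X\left[f(X)\right]\right)\\
  &= \mathbb{E}\left[f(X) \mid X_{S \cup \{i\}} = x_{S \cup \{i\}}\right] - \mathbb{E}\left[f(X) \mid X_S = x_S\right]
\end{aligned}
```

The baseline term E_X[f(X)] appears in both coalition values, so the marginal contribution of feature i is the same whether or not it is subtracted.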

  • @felixgrisoni3807
    @felixgrisoni3807 5 months ago

    Hi, your video is very well summarized, but there is an error in the formula: 1 must be excluded in val(S ∪ {1})

    • @adataodyssey
      @adataodyssey  5 months ago

      I don't see the mistake. At what point in the video do I introduce val(S ∪ {1})?
      {1} refers to a coalition containing feature 1, i.e. x1, not the feature's value

  • @cauchyschwarz3295
    @cauchyschwarz3295 4 months ago

    What I don't understand is that, in the game video, the value of a coalition is calculated by re-running the game with each coalition. Here it would seem to me that finding the value of a feature coalition would mean re-training the model for each coalition. That doesn't just seem expensive, it seems downright prohibitive for complex or large models. You started with a fixed regression model where the weights are already determined, but for e.g. a neural network model, leaving out features could change the weights significantly, right?

    • @adataodyssey
      @adataodyssey  4 months ago

      @@cauchyschwarz3295 This is a bit confusing! But no, you will only have one model (the one you want to explain). You marginalise the prediction function (i.e. the model) over the features that are not in set S. You do not have to retrain a new model excluding the features that are not in S.

    • @PoisinInAPrettyPill
      @PoisinInAPrettyPill 3 months ago

      I think there's a subtlety here in the application. In the theoretical model the goal is to calculate a fair payout for an individual feature based on how much it contributes to building a good model. Good models are measured by how well they predict things. So we would want to think about how much the feature contributes to a reduction in model error, and you would want to train a model with and without a given feature to figure this out. (But instead we use other methods for determining which features are good that are easier to compute.)
      In the use case in this video (and what is the norm in ML), Shapley values are used to explain how a feature contributed to the model's prediction, regardless of how good the model actually is. This is helpful because Shapley values still have desirable properties like additivity in this use case. If you trained a new model without this feature, it wouldn't answer the question of how much the feature contributed to the prediction value in the old model.
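
A minimal sketch of the "marginalise, don't retrain" point from this thread: one fixed model f is averaged over background data for the features outside S, with the features in S pinned to the instance's values. All names here (`coalition_value`, the toy model, the background data) are illustrative, not any library's API.

```python
import numpy as np

def coalition_value(f, x, S, background):
    """Estimate val(S) for instance x using the SAME fitted model f:
    features in S are fixed to x's values, features outside S are
    filled in from background samples, and predictions are averaged.
    No retraining takes place; only the model's inputs are varied."""
    X = background.copy()
    X[:, S] = x[S]            # pin coalition features to the instance
    return f(X).mean()        # Monte Carlo estimate of E[f(X) | X_S = x_S]

# Toy fixed model: f(x) = 2*x0 + 3*x1 (its weights never change below)
rng = np.random.default_rng(0)
background = rng.normal(size=(1000, 2))
f = lambda X: 2 * X[:, 0] + 3 * X[:, 1]
x = np.array([1.0, 1.0])

v_empty = coalition_value(f, x, [], background)  # no features in the coalition
v_x0 = coalition_value(f, x, [0], background)    # x0 joins the coalition
```

For this toy linear model, adding x0 to the empty coalition shifts the value by roughly 2 × (x0 − mean background x0), all using the same fitted weights.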

  • @AvijeetTulsiani
    @AvijeetTulsiani 1 year ago

    Can you share a link to the previous video which explains the Shapley formula?

    • @adataodyssey
      @adataodyssey  1 year ago

      Sure, Avijeet! You can find all the videos in this playlist: SHAP
      ruclips.net/p/PLqDyyww9y-1SJgMw92x90qPYpHgahDLIK

  • @chrisleenatra
    @chrisleenatra 1 year ago

    Nice video.
    But I have a question: what you showed in the video is how to "exclude" a categorical column (degree).
    What about a continuous (numeric) column, like age?
    What value would we use?

    • @adataodyssey
      @adataodyssey  1 year ago

      Thanks! If the continuous variable is in the coalition, then we use the actual value for that instance (i.e. the person's actual age). If the continuous variable is not in the coalition, then we integrate over the values of the variable with respect to the probability of those values.
      However, in practice, we will not know the probability distribution of a variable. So we will have to randomly sample different values for the variable from our dataset. We do this a bunch of times so we end up approximating the distribution.
      I hope that makes sense? There is a lot of statistical theory that underlies this explanation!

    • @chrisleenatra
      @chrisleenatra 1 year ago

      @@adataodyssey Ahh I see, got it.
      But out of curiosity, can you give me a reference for that statistical theory?

    • @adataodyssey
      @adataodyssey  1 year ago +1

      @@chrisleenatra Unfortunately, I don't have any specific references. I'm drawing on knowledge from back in my undergrad. If you want to understand it, take a look at "stochastic calculus"

  • @PENUification
    @PENUification 6 months ago

    First you say that it's efficient. One minute later you say that it's very computationally expensive?

    • @adataodyssey
      @adataodyssey  6 months ago +1

      Yes, that's correct. The efficiency property of Shapley values doesn't have anything to do with computational efficiency. It just means that if you add the Shapley values of an instance and the mean prediction across all instances, you will get the prediction for that instance.
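
The efficiency property described in this reply can be checked numerically. For a linear model with (roughly) independent features the Shapley values have a known closed form, φ_i = w_i · (x_i − mean(x_i)), so we can verify that they sum to the prediction minus the mean prediction. The weights and data below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))          # illustrative feature matrix
w = np.array([2.0, -1.0, 0.5])          # illustrative linear model weights

def f(X):
    return X @ w                        # the fixed model being explained

x = X[0]                                # instance to explain
mean_pred = f(X).mean()                 # mean prediction across all instances
phi = w * (x - X.mean(axis=0))          # Shapley values, linear-model closed form

# Efficiency: Shapley values plus the mean prediction recover the prediction
recovered = phi.sum() + mean_pred
prediction = float(x @ w)
```

This is the additivity the reply describes: each instance's prediction decomposes exactly into the mean prediction plus that instance's Shapley values, regardless of how expensive Shapley values are to compute in general.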