SHAP for Binary and Multiclass Target Variables | Code and Explanations for Classification Problems

  • Published: 27 Oct 2024

Comments • 28

  • @adataodyssey
    @adataodyssey  8 months ago

    *NOTE*: You will now get the XAI course for free if you sign up (not the SHAP course)
    SHAP course: adataodyssey.com/courses/shap-with-python/
    XAI course: adataodyssey.com/courses/xai-with-python/
    Newsletter signup: mailchi.mp/40909011987b/signup

    • @otabeknajimov9697
      @otabeknajimov9697 7 months ago

      It's showing that this coupon has expired, and I haven't signed up for either yet.

    • @adataodyssey
      @adataodyssey  7 months ago +1

      @@otabeknajimov9697 Hi Otabek, unfortunately the SHAP course is no longer free. But if you sign up to my newsletter you will get a coupon for a different XAI course.

  • @ifenchen8788
    @ifenchen8788 9 months ago

    Great video! You explain the intermediate calculation process in SHAP very clearly!

    • @adataodyssey
      @adataodyssey  9 months ago

      Thanks! I'm glad you enjoyed it

  • @sahil5124
    @sahil5124 9 months ago

    So good, keep bringing more explainable ai content

    • @adataodyssey
      @adataodyssey  9 months ago

      Thanks Sahil! I'm planning to publish videos more regularly starting in Feb

  • @johannesfrank4153
    @johannesfrank4153 2 months ago

    Thank you for your explanation. I have a question about the aggregation of SHAP values, specifically about selecting the "correct" SHAP plot. We want to show the SHAP plot for which the corresponding prediction probability is the highest. Then, we overwrite the SHAP values for this instance. This results in one SHAP value for each instance and feature. However, you didn't mention the base values for the classes. Aren't these essential for calculating f(x)? How do we choose and save the correct base value?

    • @adataodyssey
      @adataodyssey  2 months ago

      The base values are available from the shap object:
      shap_values_cat.base_values
      You will see there are 3 unique values - one for every class. These are the average predicted log odds across all instances for that class.
      You can find more information on the parts of the shap values object in this article:
      towardsdatascience.com/shap-for-categorical-features-7c63e6a554ea?sk=2eca9ff9d28d1c8bfde82f6784bdba19
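To make the reply concrete, here is a minimal NumPy sketch (toy numbers, not the video's actual model) of how a class's base value and SHAP values add up to f(x), and how you could keep the base value of the highest-scoring class along with its SHAP values:

```python
import numpy as np

# Toy numbers for 3 classes and 4 features (hypothetical values,
# not taken from the video's model).
base_values = np.array([-0.2, 0.1, 0.3])           # one base value per class
shap_values = np.array([[ 0.5, -0.1,  0.2,  0.1],  # class 0
                        [ 0.0,  0.3, -0.2,  0.1],  # class 1
                        [-0.5, -0.2,  0.0, -0.2]]) # class 2

# f(x) for each class = that class's base value + sum of its SHAP values
f_x = base_values + shap_values.sum(axis=1)

# Keep the SHAP values AND the base value of the winning class, so the
# waterfall plot still adds up to f(x) for that class.
best_class = int(np.argmax(f_x))
best_base = base_values[best_class]
```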

  • @solomonamankwahobiriyeboah2665

    Great tutorial! Very useful! Good explanations!

  • @TheCsePower
    @TheCsePower 1 year ago

    I can hear you're from South Africa! How do you interpret SHAP values for categorical features? When it's Male/Female it's easy, but what if we have 15 categories? Really love the quality content on this channel!

    • @adataodyssey
      @adataodyssey  1 year ago

      Yes, I'm originally from Cape Town!
      You have two options. Either use Catboost or sum the individual SHAP values of each one-hot encoding. I wrote these articles on the topic a while ago (no-paywall links):
      towardsdatascience.com/shap-for-categorical-features-7c63e6a554ea?sk=2eca9ff9d28d1c8bfde82f6784bdba19
      towardsdatascience.com/shap-for-categorical-features-with-catboost-8315e14dac1?sk=ef720159150a19b111d8740ab0bbac6d
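A minimal sketch of the second option (summing the SHAP values of each one-hot column), using made-up numbers rather than the articles' datasets:

```python
import numpy as np

# Hypothetical SHAP values: 2 instances x 5 columns, where the last
# three columns one-hot encode a single 3-level categorical feature.
shap_vals = np.array([[0.10, -0.30,  0.05, 0.02, -0.01],
                      [0.20,  0.15, -0.04, 0.00,  0.03]])

numeric_cols = [0, 1]
onehot_cols = [2, 3, 4]   # columns belonging to the categorical feature

# One SHAP value for the whole categorical feature = the sum of the
# SHAP values of its one-hot columns.
cat_shap = shap_vals[:, onehot_cols].sum(axis=1)
combined = np.column_stack([shap_vals[:, numeric_cols], cat_shap])
```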

  • @mohammadshahadathossain1544
    @mohammadshahadathossain1544 2 months ago

    I have signed up to the newsletter, but cannot get free access to the XAI course

    • @adataodyssey
      @adataodyssey  2 months ago

      Hi Mohammad, you should have received an email with a coupon for 100% off.

  • @danielsanchez-gomez566
    @danielsanchez-gomez566 4 months ago

    Excellent video. I have a concern:
    I'm not quite sure about the interpretation of negative values in softmax. Isn't softmax supposed to return values between 0 and 1?

    • @adataodyssey
      @adataodyssey  4 months ago +1

      I see how the wording is confusing! They are kind of like the softmax version of log odds. You need to apply softmax to those values to get probabilities.
      This article might help:
      medium.com/towards-data-science/shap-for-binary-and-multiclass-target-variables-ff2f43de0cf4?sk=f23afbb01aa2f552d5df8c7ac6efbde0
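As a quick illustration of that last point (toy numbers, plain NumPy): the raw per-class outputs can be negative, and only after applying softmax do they become probabilities between 0 and 1:

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical raw per-class outputs (the "softmax log odds") --
# note they can be negative.
raw = np.array([-0.6, 1.2, 0.3])
probs = softmax(raw)         # now all in (0, 1) and summing to 1
```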

  • @Daniel-Erbesfeld
    @Daniel-Erbesfeld 6 days ago

    Could you please link this notebook in the comments or the video description?

    • @adataodyssey
      @adataodyssey  6 days ago

      Here you go: github.com/conorosully/SHAP-tutorial/blob/main/src/shap_tutorial.ipynb

  • @UidamLee
    @UidamLee 1 year ago

    Great video, thanks. I have one question: for the waterfall plot at 4:25, I understood that you can use the record's probability to calculate f(x).
    But what if I want to interpret the bar plot (the average of absolute SHAP values)? How should I interpret the SHAP value? (e.g. "1 unit of field X increases the probability of Y by about n%")

    • @adataodyssey
      @adataodyssey  1 year ago +1

      Good question! You can interpret each bar as "the feature changes the log odds of a positive prediction by X on average when compared to the average log odds", where X is the height of the bar.
      Keep in mind that SHAP values are not parameters, i.e. we cannot use them to understand how a prediction will change when we increase the feature value by 1 unit. They simply tell us the contribution of a feature to a prediction in the context of the other feature values.
      I hope that makes sense. The wording can be a bit tricky! If it's still unclear, see time 2:00 to 3:30 in this video:
      ruclips.net/video/MQ6fFDwjuco/видео.html&ab_channel=ADataOdyssey
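The bar heights the reply describes are just the mean absolute SHAP value per feature. A minimal sketch with made-up numbers:

```python
import numpy as np

# Hypothetical SHAP values: 3 instances x 2 features
shap_vals = np.array([[ 0.4, -0.1],
                      [-0.2,  0.3],
                      [ 0.1, -0.5]])

# Height of each bar in the bar plot: mean |SHAP| per feature
bar_heights = np.abs(shap_vals).mean(axis=0)
```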

    • @UidamLee
      @UidamLee 1 year ago

      @@adataodyssey thanks for the explanation :) I get it now. BTW my waterfall plot always shows f(x) = 1 or 0 in the logistic model. Is it automatically converted from log odds to probability? In that case, can I interpret it as the average probability (%p) increase?
      and I have one more question, how can I determine the direction (+/-) of the mean absolute shap values? I see they indicate magnitudes, but wanna know if there are ways I can find out the signs. Because from beeswarm plot, the relationship of certain variable looks somewhat positive but if I actually calculate the average of shap values, it is below 0 or something. So I wonder if there are some ways to get it. Again, thanks for your great video :)

    • @adataodyssey
      @adataodyssey  1 year ago

      1) I think with logistic regression, SHAP will default to using the linear explainer. I have personally never worked with this, so I am not sure how it affects the interpretations. As you are dealing with a linear model, I think the SHAP values will be related to the parameters of the model.
      2) I recommend signing up for the course ;) It goes into detail on how you can explore the SHAP values and create your own custom plots, similar to what you want to do here! Otherwise, please see the article below. It explains the SHAP values object in more detail:
      towardsdatascience.com/shap-for-categorical-features-7c63e6a554ea?sk=2eca9ff9d28d1c8bfde82f6784bdba19
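For the sign question specifically, a minimal sketch (toy numbers): compute the signed mean alongside the mean absolute value. Note that a feature can look mostly positive in a beeswarm while its signed mean is negative, because a few large negative values can outweigh many small positive ones:

```python
import numpy as np

# Hypothetical SHAP values: 4 instances x 2 features
shap_vals = np.array([[ 0.4, -0.1],
                      [-0.2,  0.3],
                      [ 0.1, -0.5],
                      [ 0.3,  0.2]])

magnitude = np.abs(shap_vals).mean(axis=0)   # bar-plot heights (unsigned)
signed_mean = shap_vals.mean(axis=0)         # keeps the direction
direction = np.sign(signed_mean)             # +1 / -1 per feature
```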

  • @anki8136
    @anki8136 1 year ago

    Hi buddy,
    I learnt SHAP from you, but I am facing an issue. Whenever I try to plot a graph I get a common error:
    Error: "The beeswarm/waterfall plot requires an explanation object as the shap values argument"
    Can you please help me, buddy?
    Thanks

    • @adataodyssey
      @adataodyssey  1 year ago

      It is not possible to debug your code from that comment. Can you paste your code and the actual error message?

    • @anki8136
      @anki8136 1 year ago +1

      The actual error is given below:
      Error: "The beeswarm/waterfall plot requires an explanation object as the shap values argument."
      Whenever I try to plot something like waterfall or beeswarm I get this error.
      I can't post my code.

    • @adataodyssey
      @adataodyssey  1 year ago +1

      @@anki8136 Not sure I can be much help then. It sounds like you are not passing in a valid explanation object. For example, make sure "shap_values" is the Explanation object returned by calling the explainer:
      # Get shap values
      explainer = shap.Explainer(model)
      shap_values = explainer(X)
      You can check this by printing out the values:
      print(shap_values.values)
      This should have dimensions equal to (#instances, #features) in your X feature matrix.
      Try to run the code in this tutorial if you are still having problems:
      towardsdatascience.com/introduction-to-shap-with-python-d27edc23c454?sk=01c06f166e742e2084d581e40bf0b96e

    • @anki8136
      @anki8136 1 year ago

      @@adataodyssey thanks