*NOTE*: You will now get the XAI course for free if you sign up (not the SHAP course)
SHAP course: adataodyssey.com/courses/shap-with-python/
XAI course: adataodyssey.com/courses/xai-with-python/
Newsletter signup: mailchi.mp/40909011987b/signup
It's showing "this coupon has expired", but I haven't signed up for either yet.
@@otabeknajimov9697 Hi Otabek, unfortunately the SHAP course is no longer free. But if you sign up to my newsletter you will get a coupon for a different XAI course.
Great video! You explain the intermediate calculation process in SHAP very clearly!
Thanks! I'm glad you enjoyed it
So good, keep bringing more explainable ai content
Thanks Sahil! Planning to start publishing videos more regularly starting in Feb
Thank you for your explanation. I have a question about the aggregation of SHAP values, specifically about selecting the "correct" SHAP plot. We want to show the SHAP plot for which the corresponding prediction probability is the highest. Then, we overwrite the SHAP values for this instance. This results in one SHAP value for each instance and feature. However, you didn't mention the base values for the classes. Aren't these essential for calculating f(x)? How do we choose and save the correct base value?
The base values are available from the shap object:
shap_values_cat.base_values
You will see there are 3 unique values - one for every class. These are the average predicted log odds across all instances for that class.
You can find more information on the parts of the shap values object in this article:
towardsdatascience.com/shap-for-categorical-features-7c63e6a554ea?sk=2eca9ff9d28d1c8bfde82f6784bdba19
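For intuition, here is a minimal sketch with made-up numbers that follows the layout of a multiclass Explanation object (instances × features × classes). It shows how you could select and save the base value for the predicted class of each instance; none of these values come from the video's model:

```python
import numpy as np

# Made-up SHAP output: 4 instances, 2 features, 3 classes
rng = np.random.default_rng(0)
values = rng.normal(size=(4, 2, 3))               # like shap_values.values
base_values = np.tile([0.1, -0.2, 0.4], (4, 1))   # like shap_values.base_values

# Predicted class = class with the highest f(x) = base value + sum of SHAP values
fx = base_values + values.sum(axis=1)             # shape (4, 3)
pred_class = fx.argmax(axis=1)

# Keep the SHAP values *and* the base value for the predicted class only
idx = np.arange(len(fx))
chosen_values = values[idx, :, pred_class]        # shape (4, 2)
chosen_base = base_values[idx, pred_class]        # shape (4,)
```

The key point is that the saved base value must come from the same class as the SHAP values you keep, otherwise f(x) will not reconstruct correctly.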
Great tutorial! Very useful! Good explanations!
No problem Solomon!
I can hear you're from South Africa! How do you interpret SHAP values for categorical features? When it's Male/Female it's easy, but what if we have 15 categories? Really love the quality content on this channel!
Yes, I'm originally from Cape Town!
You have two options. Either use CatBoost, or sum the individual SHAP values of the one-hot encodings. I wrote these articles on the topic a while ago (no-paywall links):
towardsdatascience.com/shap-for-categorical-features-7c63e6a554ea?sk=2eca9ff9d28d1c8bfde82f6784bdba19
towardsdatascience.com/shap-for-categorical-features-with-catboost-8315e14dac1?sk=ef720159150a19b111d8740ab0bbac6d
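To illustrate the second option, here is a minimal sketch with made-up SHAP values and hypothetical feature names ("city_A" etc. are not from the video):

```python
import numpy as np

# Made-up SHAP values: 3 instances, 5 columns, where the last 3 columns
# are one-hot encodings of a single hypothetical "city" feature
feature_names = ["age", "income", "city_A", "city_B", "city_C"]
shap_vals = np.array([
    [ 0.10, -0.20,  0.05, -0.01,  0.00],
    [-0.30,  0.15,  0.00,  0.08, -0.02],
    [ 0.05,  0.40, -0.03,  0.00,  0.07],
])

# Sum the one-hot columns into a single SHAP value for the categorical feature
onehot_idx = [i for i, n in enumerate(feature_names) if n.startswith("city_")]
city_shap = shap_vals[:, onehot_idx].sum(axis=1)

# New matrix: one column per original feature (age, income, city)
combined = np.column_stack([shap_vals[:, :2], city_shap])
print(combined.shape)  # (3, 3)
```

This works because SHAP values are additive, so the summed column is the total contribution of the underlying categorical feature.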
I have signed up to the newsletter, but cannot get free access to the XAI course.
Hi Mohammad, you should have received an email with a coupon for 100% off.
Excellent video. I have a concern:
I'm not quite sure about the interpretation of negative values in softmax. Isn't softmax supposed to return values between 0 and 1?
I see how the wording is confusing! They are kind of like the softmax version of log odds. You need to apply softmax to those values to get probabilities.
This article might help:
medium.com/towards-data-science/shap-for-binary-and-multiclass-target-variables-ff2f43de0cf4?sk=f23afbb01aa2f552d5df8c7ac6efbde0
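As a small illustration with made-up f(x) values (one per class, not from the video's model):

```python
import numpy as np

def softmax(z):
    """Map raw scores to probabilities that sum to 1."""
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

# Made-up f(x) values for one instance across 3 classes,
# i.e. base value + sum of SHAP values per class
fx = np.array([-0.5, 1.2, 0.1])

probs = softmax(fx)  # each between 0 and 1, summing to 1
```

Note the raw f(x) values can be negative; only after applying softmax do you get values between 0 and 1.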
could you please link this notebook in the comments or the video description?
Here you go: github.com/conorosully/SHAP-tutorial/blob/main/src/shap_tutorial.ipynb
Great video, thanks! I have one question: at the 4:25 waterfall plot, I understood that you can use the record's probability to calculate f(x).
But what if I want to interpret the bar plot (the average of the absolute SHAP values)? How should I interpret the SHAP value? (e.g. "1 unit of field X increases the probability of Y by about n%")
Good question! You can interpret each bar as "the feature changes the log odds of a positive prediction by X on average when compared to the average log odds", where X is the height of the bar.
Keep in mind that SHAP values are not parameters, i.e. we cannot use them to understand how a prediction will change when we increase the feature value by 1 unit. They simply tell us the contribution of a feature to a prediction in the context of the other feature values.
I hope that makes sense. The wording can be a bit tricky! If it is still unclear, see 2:00 to 3:30 in this video:
ruclips.net/video/MQ6fFDwjuco/видео.html&ab_channel=ADataOdyssey
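For intuition on how a binary model's log-odds f(x) relates to the predicted probability, here is a minimal sketch with made-up numbers (not from the video's model):

```python
import numpy as np

def sigmoid(z):
    """Convert log odds to a probability."""
    return 1 / (1 + np.exp(-z))

# Made-up values for one instance of a binary model
base_value = -1.0                          # average log odds, E[f(X)]
shap_values = np.array([0.8, -0.3, 1.1])   # one SHAP value per feature

fx = base_value + shap_values.sum()  # the log odds shown on the waterfall plot
prob = sigmoid(fx)                   # the model's predicted probability
```

Because the sigmoid is non-linear, a fixed change in log odds does not map to a fixed change in probability, which is another reason the "1 unit of X gives n%" reading does not hold.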
@@adataodyssey thanks for the explanation :) I get it now. BTW, my waterfall plot always shows f(x) = 1 or 0 in the logistic model. Is it automatically converted from log odds to probability? In that case, can I interpret it as the average probability (%p) increase?
And I have one more question: how can I determine the direction (+/-) of the mean absolute SHAP values? I see they indicate magnitudes, but I want to know if there is a way to find the signs. From the beeswarm plot the relationship for a certain variable looks somewhat positive, but when I actually calculate the average of the SHAP values it is below 0. So I wonder if there is a way to get this. Again, thanks for your great video :)
1) I think with logistic regression, SHAP will default to using the linear explainer. I have personally never worked with this, so I am not sure how it affects the interpretations. Since you are dealing with a linear model, I think the SHAP values will be related to the parameters of the model.
2) I recommend signing up for the course ;) It goes into detail on how you can explore the SHAP values and create your own custom plots. Similar to what you want to do here! Otherwise please see the article below. It explains the SHAP values object in more detail:
towardsdatascience.com/shap-for-categorical-features-7c63e6a554ea?sk=2eca9ff9d28d1c8bfde82f6784bdba19
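On question 2, here is my minimal sketch (made-up numbers, not from the course) comparing the mean |SHAP| shown in the bar plot with the signed mean, assuming shap_values.values has shape (#instances, #features):

```python
import numpy as np

# Made-up contents of shap_values.values: 4 instances, 3 features
vals = np.array([
    [ 0.5, -0.2,  0.1],
    [-0.3,  0.4, -0.1],
    [ 0.2, -0.5,  0.3],
    [-0.4,  0.1, -0.2],
])

# Mean |SHAP| (what the bar plot shows) vs the signed mean
mean_abs = np.abs(vals).mean(axis=0)
mean_signed = vals.mean(axis=0)

# Positive and negative contributions cancel in the signed mean, so a
# feature can look important in the bar plot yet average to (almost) zero
print(mean_abs[0], mean_signed[0])  # 0.35 0.0
```

This cancellation is why a feature can look "positive" in the beeswarm plot while its plain average SHAP value is below 0: the signed mean mixes the two directions together.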
Hi buddy
I learnt SHAP from you but I am facing an issue. Whenever I try to plot a graph I get a common error:
Error: "The beeswarm/waterfall plot requires an explanation object as the shap values argument"
Can you please help me, buddy?
Thanks
It is not possible to debug your code from that comment. Can you paste your code and the actual error message?
The actual error is given below:
Error: "The beeswarm/waterfall plot requires an explanation object as the shap values argument"
Whenever I try to plot something like a waterfall or beeswarm I get this error. I can't post my code.
@@anki8136 Not sure I can be much help then. It sounds like you are not passing in a valid Explanation object. For example, check that "shap_values" is the Explanation object returned by calling the explainer:
# Get SHAP values
explainer = shap.Explainer(model)
shap_values = explainer(X)
You can check this by printing out the values:
print(shap_values.values)
This should have dimensions equal to (#instances, #features) in your X feature matrix.
Try to run the code in this tutorial if you are still having problems:
towardsdatascience.com/introduction-to-shap-with-python-d27edc23c454?sk=01c06f166e742e2084d581e40bf0b96e
@@adataodyssey thanks