I find that your video has a perfect mix of being easily understood/interpretable and containing enough nuance/depth for those who are interested in going below surface-level knowledge. Thank you!
Great video! I love the explanation style of walking through notebook code with plots etc.
Thanks a ton! Appreciate the compliments. Mind sharing this video with people you think would like this content? That would really help too. :)
You are the best explainer, you don't stay in the extremes of being too superficial or too mathematical.
Thanks so much! Hope you enjoy the other videos on this channel
I love all your videos! They deserve all the algorithm pushes in the world.
Thanks so much for the kind words!
Great video! Would be excited to see a video on LIME in the future. Another algorithm for interpretability.
What a fantastic explanation! Thanks for your work.
Hi! Thank you! More fun stuff coming soon!
I love your videos! thanks for explaining DS concepts so clearly :)!
Anytime. My pleasure :)
I discovered SHAP not too long ago. I freaking love it, one of the best things out there
Agreed. Love it
THIS IS BRILLIANT AND SOOOO HELPFUL, was brought here by coding tech, immediate sub
Awesome! Welcome! :)
Good job explaining SHAP
Is the SHAP value equivalent to the partial R2 in a linear regression?
Good video. Could you please create a video on how to use Shapley values for multiclass classification, interpreting the model at the global level as well as at the instance level?
Sure. Perhaps a future video. Thanks for watching!!
5:14 OMG! I've been reading those Shapley value papers and I was always wondering what it means for "the expected output of the model" when all feature values are NaN. Thank you!
Anytime :) the concepts can be fun and tricky
In practice what is the difference between SHAP and ALE?
Hey, I just have one question. In the sample_feature_importance method you pass idx, and when you call that method you pass 0. What will happen if we pass another value? What is idx doing?
What a great video! Thank You.
Thank you for this awesome video! I have a question - do you know if any of these techniques could be used for deep neural networks that involve embedding layers?
Awesome explanation. One Request : Can you please do a video on 'SHAP on NLP BERT model' ?
Amazing explanation of Shapley values! To figure out which features are the most important, how do Shapley values compare with Permutation Importance? Which one is more accurate?
I just stumbled across your channel and it’s really amazing how detailed your videos are, it’s rare to find someone who will go in more depth on YouTube. I just subscribed, and also does your channel have a discord group? I enjoy discussing these topics on discord, plus it’s easier to share channels and support them via discord when the YouTuber whose video I’m sharing also has a discord server of their own.
Thank you so much! I am planning to create a discord real soon just for this. Good call! Trying to get my channel out there as much as possible, so any attention helps :) Thanks for the support
Nice explanation
Such a great video💛💛💛
Thanks so much for watching !
I have a problem when I try to use force_plot for multiple samples: "NotImplementedError: matplotlib = True is not yet supported for force plots with multiple samples!". Can you help me?
Helpful video🌹
I cannot use SHAP and LDA at the same time, an index error occurs. Can someone who knows help me?
Great knowledge !
In the evaluate function, why is the sklearn_pandas DataFrameMapper only calling transform on the data and not fit_transform? I've never thought about this before, but if you're not fitting the scaler to the data, wouldn't just using transform do nothing? Or is it fitting and transforming, but essentially throwing away the fit so you couldn't use it again to transform another set of data with the same parameters?
Also, why are you adding a column of ones with sm.add_constant?
Thanks! For reference:
def evaluate(X, y, mapper=None, reg=None, transform=False):
    if transform:
        X = mapper.transform(X)
        X = sm.add_constant(X, has_constant='add')
    y_pred = reg.predict(X)
    return mean_absolute_error(y, y_pred)
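On the transform vs. fit_transform question, here is a minimal sketch of the usual pattern, not the video's code. `TinyScaler` and this `add_constant` are hypothetical stand-ins written for illustration (they are not the sklearn_pandas or statsmodels APIs). The idea is that fit learns the mean and spread from the training data once, and transform then reuses those stored values on new data, so calling transform alone still does something as long as the scaler was fitted earlier. The column of ones is there because statsmodels' OLS does not add an intercept on its own.

```python
from statistics import mean, pstdev


class TinyScaler:
    """Hypothetical stand-in for a scaler; not the sklearn_pandas API."""

    def fit(self, column):
        # Learn the parameters from the training data only.
        self.mean_ = mean(column)
        self.std_ = pstdev(column) or 1.0  # guard against zero spread
        return self

    def transform(self, column):
        # Reuse the stored parameters; nothing is re-estimated here.
        return [(x - self.mean_) / self.std_ for x in column]


def add_constant(rows):
    # Prepend a column of ones so the regression can learn an intercept.
    return [[1.0] + list(row) for row in rows]


train = [1.0, 2.0, 3.0]  # made-up training column
test = [4.0]             # made-up unseen value

scaler = TinyScaler().fit(train)      # fit once, on training data
scaled_test = scaler.transform(test)  # transform only, on new data
design = add_constant([[v] for v in scaled_test])
print(design)
```

Calling fit_transform on the test set instead would re-estimate the mean and spread from the test data, so the model would see inputs on a different scale than the one it was trained on.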
The reason why the SHAP values don't exactly align with the PDP is the interaction effect between features that is factored in when calculating a feature's contribution. I think the method for SHAP value calculation shown here is an oversimplified (to the extent of being something else entirely) version. Maybe you can make a follow-up video to dive deeper.
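To make the interaction point concrete, here is a small sketch that computes exact Shapley values by enumerating feature orderings for a two-feature toy model. Everything in it is an assumption for illustration: the `model` function, the `instance`, and the single `baseline` point that stands in for "the expected output of the model" (a simplification; real SHAP explainers average over a background dataset). It is not how the shap library is implemented.

```python
from itertools import permutations


def model(x1, x2):
    # Toy model with an interaction term (x1 * x2).
    return x1 + x2 + x1 * x2


baseline = (0.0, 0.0)  # assumed reference point
instance = (1.0, 2.0)  # the row being explained


def value(coalition):
    # Features in the coalition take the instance value, the rest the baseline.
    args = [instance[i] if i in coalition else baseline[i] for i in range(2)]
    return model(*args)


def shapley(n_features=2):
    phi = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        seen = set()
        for i in order:
            before = value(seen)
            seen.add(i)
            # Marginal contribution of feature i in this ordering.
            phi[i] += (value(seen) - before) / len(orderings)
    return phi


phi = shapley()
print(phi)  # [2.0, 3.0]
```

Without the interaction term the contributions would be 1 and 2. The interaction adds 2 to the prediction and is split evenly between the two features, which is why a per-feature contribution need not sit on a one-feature-at-a-time curve like a PDP.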
Great video! Would it be possible to do a video on the Shapley-Owen decomposition for regression models?
I suspect that non-linear models would be harder to interpret in a similar fashion, because I can imagine that for certain parameters, their values' relationships may reverse or change completely.
If mean absolute error is used instead of RMSE as the cost function in linear regression, how do you differentiate the mean absolute error function to find its minimum? The video should have touched on the other methods used to minimise the mean absolute error.
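One common answer, sketched here under assumptions: the absolute value has no derivative at zero, so instead of a gradient you can use a subgradient, which for each residual is just its sign. The toy data, step-size schedule, and iteration count below are made up for illustration. Other routes are linear programming or median (quantile) regression, and for a constant-only model the MAE minimiser is the median.

```python
def sign(v):
    # Subgradient of |v|: -1, 0, or +1 (0 is a valid choice at v == 0).
    return (v > 0) - (v < 0)


def mae(w, b, xs, ys):
    return sum(abs(y - (w * x + b)) for x, y in zip(xs, ys)) / len(xs)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]  # toy data on the line y = 2x + 1

w, b = 0.0, 0.0
initial_mae = mae(w, b, xs, ys)
best_mae = initial_mae

for t in range(200):
    lr = 0.1 / (t + 1) ** 0.5  # decaying step size, an arbitrary choice
    residuals = [y - (w * x + b) for x, y in zip(xs, ys)]
    # Subgradients of the MAE with respect to w and b.
    grad_w = -sum(sign(r) * x for r, x in zip(residuals, xs)) / len(xs)
    grad_b = -sum(sign(r) for r in residuals) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b
    # Subgradient steps are not monotone, so keep track of the best value seen.
    best_mae = min(best_mae, mae(w, b, xs, ys))

print(initial_mae, best_mae)
```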
Great!
Hi, can I ask for your help? I am trying to plot ri index for my lstm model. It's constantly throwing me an error index 2 is out of bounds, and I am unsure how to deal with the situation.
Hey man, did you ever find a solution?
Just to be clear, each partial dependence plot is y = m*x + c, where c = 22.7675 and m varies by feature (e.g. 2.9649 for RAD and 0.3042 for AGE). These values were obtained from reg.summary(). Is my understanding correct?
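A quick way to check this by hand, as a sketch under assumptions: for a linear model, the partial dependence of one feature is the average prediction with that feature pinned to a grid value. The slope is the feature's coefficient, and the intercept equals the constant only if the other features average to zero (as they would after standardization). The coefficients below are the ones quoted in the comment; the three data rows are made up and chosen so that AGE has zero mean.

```python
CONST, M_RAD, M_AGE = 22.7675, 2.9649, 0.3042  # values quoted in the comment


def predict(rad, age):
    # Two-feature linear model, for illustration only.
    return CONST + M_RAD * rad + M_AGE * age


# Made-up standardized rows (RAD, AGE); AGE averages to zero.
rows = [(-1.0, 0.5), (0.0, -1.0), (1.0, 0.5)]


def pdp_rad(grid_value):
    # Pin RAD to grid_value, keep each row's own AGE, average the predictions.
    return sum(predict(grid_value, age) for _, age in rows) / len(rows)


intercept = pdp_rad(0.0)
slope = pdp_rad(1.0) - pdp_rad(0.0)
print(intercept, slope)
```

If the other features were not centred, the line would keep the same slope but shift up or down by their coefficients times their means.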
Nicely explained 👍
Thanks a ton!
Great video
Thank you! :)
so so so underrated
Thank you for the compliment! Please share around if ya can :)
@CodeEmporium Already done.
New here and I love your channel! I have upcoming interviews on ML with time series, though I have only learned traditional methods for time series. Any resources that would be helpful?
Sorry I'm getting to this a lil late. Thank you! I have a couple of videos on Time series analysis via a regression approach that might be worth looking into :)
This only covers the easy parts. It seems like you don’t even know the core concepts; you just explain all the obvious parts. Wasted my time.