A Data Odyssey
  • Videos: 41
  • Views: 288,396
Applying Permutation Channel Importance (PCI) to a Remote Sensing Model | Python Tutorial
🚀 Course 🚀
Free: adataodyssey.com/permutation-channel-importance
Paid: adataodyssey.com/courses/xai-for-cv/
We dive into Permutation Channel Importance (PCI) and show you how to apply it using Python. We'll work with the Landsat Irish Coastal Segmentation (LICS) Dataset, a resource for advancing deep learning methods in coastal water body segmentation. This dataset includes 100 multispectral test images, each with a binary segmentation mask that classifies pixels as either land or ocean. This is a great start if you are interested in applying Explainable AI methods to remote sensing machine learning models.
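For anyone following along before watching, the core PCI loop is small enough to sketch here. This is a minimal illustration, not the notebook from the video: the model, metric and image/mask arrays are hypothetical placeholders, and the pattern is simply score, permute one channel across the test set, re-score, and repeat.

```python
import numpy as np

def permutation_channel_importance(model, images, masks, metric, n_repeats=5, seed=0):
    """Score drop after permuting each spectral channel across the test set.

    images: (N, H, W, C) multispectral test images
    masks:  (N, H, W) binary land/ocean masks
    metric: callable(masks, predictions) -> float, e.g. accuracy or IoU
    """
    rng = np.random.default_rng(seed)
    baseline = metric(masks, model.predict(images))  # unpermuted score
    importance = np.zeros(images.shape[-1])

    for c in range(images.shape[-1]):
        drops = []
        for _ in range(n_repeats):
            permuted = images.copy()
            order = rng.permutation(len(images))
            permuted[..., c] = images[order][..., c]  # shuffle channel c across images
            drops.append(baseline - metric(masks, model.predict(permuted)))
        importance[c] = np.mean(drops)  # large drop => important channel

    return baseline, importance
```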
🚀 Useful playlists 🚀
XAI for CV: ruclips.net/p/PLqDyyww9y-1QA4-o4tTAF_iD5cKCC1qEA&si=o...
Views: 143

Videos

Explaining Computer Vision Models with PCI
193 views • 21 days ago
🚀 Course 🚀 Free: adataodyssey.com/permutation-channel-importance Paid: adataodyssey.com/courses/xai-for-cv/ In this video, we dive into Permutation Channel Importance (PCI) - a powerful explainability technique that helps identify which color channels (like RGB in regular images) contribute to a machine learning model's predictions. PCI is a simple yet effective approach within Explainable AI (...
Explaining Anomalies with Isolation Forest and SHAP | Python Tutorial
1.2K views • 1 month ago
In this video, we dive deep into the world of anomaly detection with a focus on the Isolation Forest algorithm. Isolation Forest is a powerful machine learning model for identifying outliers in high-dimensional data, but understanding why an anomaly is detected can be a challenge. That's where SHAP (SHapley Additive exPlanations) comes in. We'll explore how to use both KernelSHAP and TreeSHAP t...
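As a rough, self-contained sketch of that workflow (toy data and made-up parameters; recent versions of the shap package accept an IsolationForest in TreeExplainer, with KernelSHAP on decision_function as the slower, model-agnostic fallback):

```python
import numpy as np
import shap
from sklearn.ensemble import IsolationForest

# Toy data: a tight cluster plus a few obvious outliers
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(500, 4)),
               rng.normal(8, 1, size=(10, 4))])

iforest = IsolationForest(n_estimators=200, random_state=0).fit(X)
scores = iforest.decision_function(X)   # lower = more anomalous
anomaly_idx = np.argsort(scores)[:5]    # five most anomalous points

# TreeSHAP on the isolation forest itself
explainer = shap.TreeExplainer(iforest)
shap_values = explainer.shap_values(X[anomaly_idx])

# Feature contributions for the most anomalous point
print(dict(zip(range(X.shape[1]), shap_values[0].round(3))))
```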
SHAP with CatBoostClassifier for Categorical Features | Python Tutorial
989 views • 2 months ago
Combining CatBoost and SHAP can provide powerful insight into your machine learning models, especially when you are working with categorical features. With other modelling packages, we need to first transform categorical features using one-hot encodings. The problem is that each binary variable will have its own SHAP value. This makes it difficult to see the overall contribution of the origina...
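A minimal sketch of the pattern described here, with a hypothetical two-feature dataset: CatBoost consumes the raw categorical column via cat_features, so TreeSHAP returns a single SHAP value for it rather than one per dummy column.

```python
import pandas as pd
import shap
from catboost import CatBoostClassifier, Pool

# Hypothetical dataset with one categorical feature kept as raw strings
df = pd.DataFrame({
    "age": [25, 40, 31, 58, 47, 36],
    "city": ["Cork", "Dublin", "Cork", "Galway", "Dublin", "Galway"],
    "y": [0, 1, 0, 1, 1, 0],
})
X, y = df[["age", "city"]], df["y"]
cat_features = ["city"]

model = CatBoostClassifier(iterations=50, verbose=0)
model.fit(X, y, cat_features=cat_features)

# TreeSHAP works directly on the CatBoost model; the categorical feature
# gets a single SHAP value per prediction instead of one per dummy column
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(Pool(X, y, cat_features=cat_features))
shap.summary_plot(shap_values, X)
```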
Applying LIME with Python | Local & Global Interpretations
981 views • 5 months ago
LIME is a popular local explainable AI (XAI) method. It can be used to understand the individual predictions made by a black-box model. We will be applying the method using Python. We will see that although LIME is a local method, we can still aggregate LIME weights to get global interpretations of a machine learning model. We do this using feature trends, absolute mean weights and a beeswarm p...
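A hedged sketch of both uses, assuming the lime package and a scikit-learn classifier (dataset and instance counts are arbitrary): one local explanation, then absolute weights averaged over many instances as a crude global view.

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
X, y, names = data.data, data.target, list(data.feature_names)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(X, feature_names=names, mode="classification")

# Local explanation for a single instance
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=10)
print(exp.as_list())

# Crude global view: aggregate absolute weights over many instances
weights = np.zeros(len(names))
for i in range(50):
    exp = explainer.explain_instance(X[i], model.predict_proba, num_features=len(names))
    for feat_idx, w in exp.as_map()[1]:  # label 1 is explained by default
        weights[feat_idx] += abs(w)
top = np.argsort(weights)[::-1][:5]
print([(names[j], round(weights[j] / 50, 4)) for j in top])
```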
An introduction to LIME for local interpretations | Intuition and Algorithm |
1.3K views • 5 months ago
LIME is a popular explainable AI (XAI) method. It is known as a local model agnostic method. This means it can be used to explain the individual predictions of any machine learning model. It does this by building simple surrogate models around the black-box model’s prediction for an individual instance. We will: - Explain the algorithm used by LIME to get local interpretations. - Discuss in det...
Friedman's H-statistic Python Tutorial | Artemis Package
797 views • 5 months ago
Friedman’s h-statistic, also known as the H-stat or H-index, is a metric used to analyse interactions in a machine learning model. We apply the explainable AI (XAI) method using the artemis Python package. We also explain how to interpret the interaction heatmaps and bar plots. This includes the overall, pairwise, normalised and unnormalised h-stat. These can be used to understand the percentage...
Friedman's H-statistic for Analysing Interactions | Maths and Intuition
2.4K views • 5 months ago
We dive deep into Friedman's H-statistic also known as the H-stat or H-index. This is a popular explainable AI (XAI) method. It is a powerful metric for analyzing interactions between features in your machine learning model. We will: - Build intuition for the method by comparing it to PDPs and ICE Plots. - Explain the mathematics behind the pairwise, overall and unnormalised formulas. - Discuss...
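For reference, the pairwise statistic discussed here (following Friedman and Popescu, 2008) can be written as follows, where the PD terms are centred partial dependence functions evaluated at the observed data points:

```
H^2_{jk} = \frac{\sum_{i=1}^{n}\left[\mathrm{PD}_{jk}(x_j^{(i)}, x_k^{(i)}) - \mathrm{PD}_j(x_j^{(i)}) - \mathrm{PD}_k(x_k^{(i)})\right]^2}{\sum_{i=1}^{n}\mathrm{PD}_{jk}(x_j^{(i)}, x_k^{(i)})^2}
```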
Accumulated Local Effect Plots (ALEs) | Explanation & Python Code
1.1K views • 6 months ago
Highly correlated features can wreak havoc on your machine-learning model interpretations. To overcome this, we could rely on good feature selection. But there are still cases when a feature, although highly correlated, will provide some unique information leading to a more accurate model. So we need a method that can provide clear interpretations, even with multicollinearity. Thankfully we can...
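To make the idea concrete, here is a simplified from-scratch sketch of a first-order ALE curve for one numeric feature. It skips the bin-count weighting normally used when centring the final curve, so treat it as illustrative rather than a reference implementation.

```python
import numpy as np
import pandas as pd

def ale_1d(model, X: pd.DataFrame, feature: str, bins: int = 20):
    """Rough first-order ALE: accumulate local prediction changes per quantile bin."""
    x = X[feature].to_numpy()
    # Quantile bin edges so each interval holds roughly the same amount of data
    edges = np.unique(np.quantile(x, np.linspace(0, 1, bins + 1)))
    effects = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (x >= lo) & (x <= hi) if lo == edges[0] else (x > lo) & (x <= hi)
        if not in_bin.any():
            effects.append(0.0)
            continue
        X_lo, X_hi = X[in_bin].copy(), X[in_bin].copy()
        X_lo[feature], X_hi[feature] = lo, hi
        # Local effect: average prediction change across this interval
        effects.append(np.mean(model.predict(X_hi) - model.predict(X_lo)))
    ale = np.cumsum(effects)
    return edges[1:], ale - ale.mean()  # centre so the average effect is zero
```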
PDPs and ICE Plots | Python Code | scikit-learn Package
807 views • 6 months ago
Partial Dependence Plots (PDPs) and Individual Conditional Expectation (ICE) plots are both popular explainable AI (XAI) methods. They can visualise the relationships used by a machine learning model to make predictions. In this video, we will see how to apply the methods using Python. We will use the scikit-learn package and the PartialDependenceDisplay & partial_dependence functions. We will...
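A short sketch of the scikit-learn calls mentioned here, on a hypothetical housing model. kind="both" overlays ICE curves on the averaged PDP line; note that older scikit-learn versions name the grid key "values" rather than "grid_values".

```python
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay, partial_dependence

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# PDP line with ICE curves overlaid (one curve per sampled instance)
PartialDependenceDisplay.from_estimator(
    model, X, features=["MedInc", "AveOccup"], kind="both", subsample=50
)
plt.show()

# Raw PDP values if you prefer to plot them yourself
pd_result = partial_dependence(model, X, features=["MedInc"], kind="average")
print(pd_result["grid_values"][0][:5], pd_result["average"][0][:5])
```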
Partial Dependence (PDPs) and Individual Conditional Expectation (ICE) Plots | Intuition and Math
1.9K views • 6 months ago
Both Partial Dependence (PDPs) and Individual Conditional Expectation (ICE) Plots are used to understand and explain machine learning models. PDPs can tell us if a relationship between a model feature and target variable is linear, non-linear or if there is no relationship. Similarly, ICE plots are used to visualise interactions. Now, at first glance, these plots may look complicated. But you w...
Permutation Feature Importance from Scratch | Explanation & Python Code
1.3K views • 7 months ago
Feature importance scores are a collection of methods all used to answer one question: which machine learning model features have contributed the most to predictions in general? Amongst all these methods, permutation feature importance is the most popular. This is due to its intuitive calculation and because it can be applied to any machine learning model. Understanding PFI is also an importan...
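A minimal from-scratch version of the calculation, assuming a fitted regression model and a score where higher is better:

```python
import numpy as np
from sklearn.metrics import r2_score

def permutation_importance_scratch(model, X, y, metric=r2_score, n_repeats=5, seed=0):
    """Permutation feature importance: drop in score after shuffling one column."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    baseline = metric(y, model.predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # break feature-target link
            drops.append(baseline - metric(y, model.predict(X_perm)))
        importances[j] = np.mean(drops)
    return importances
```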
Model Agnostic Methods for XAI | Global v.s. Local | Permutation v.s. Surrogate Models
632 views • 7 months ago
Model agnostic methods can be used with any model. In Explainable AI (XAI), this means we can use them to interpret models without looking at their inner workings. This gives us a powerful way to interpret and explain complex black-box machine learning models. We will elaborate on this definition. We will also discuss the taxonomy of model agnostic methods for interpretability. They can be classi...
8 Plots for Explaining Linear Regression | Residuals, Weight, Effect & SHAP
1.1K views • 7 months ago
For data scientists, a regression summary might be all that's needed to understand a linear model. However, when explaining these models to a non-technical audience, it’s crucial to employ more digestible visual explanations. These 8 methods not only make linear regression more accessible but also enrich your analytical storytelling, making your findings resonate with any audience. We understan...
Feature Selection using Hierarchical Clustering | Python Tutorial
2.7K views • 7 months ago
In this comprehensive Python tutorial, we delve into feature selection for machine learning with hierarchical clustering. We guide you through the essentials of partitioning features into cohesive groups to minimize redundancy in model training. This technique is particularly important as your dataset expands, offering a structured alternative to manual grouping. What you'll learn: - The import...
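A hedged sketch of the approach, assuming numeric features: cluster on 1 − |Spearman correlation| and keep one representative per cluster. The threshold and the "most correlated with its own cluster" rule are illustrative choices, not the only options.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def cluster_features(X: pd.DataFrame, threshold: float = 0.3):
    """Group correlated features and keep one representative per cluster."""
    corr = X.corr(method="spearman").abs()
    dist = 1 - corr                                  # high correlation => small distance
    condensed = squareform(dist.values, checks=False)
    Z = linkage(condensed, method="average")
    labels = fcluster(Z, t=threshold, criterion="distance")

    selected = []
    for cluster_id in np.unique(labels):
        members = corr.columns[labels == cluster_id]
        # Keep the member most correlated with the rest of its cluster
        selected.append(corr.loc[members, members].sum().idxmax())
    return labels, selected
```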
8 Characteristics of a Good Machine Learning Feature | Predictive, Variety, Interpretability, Ethics
389 views • 7 months ago
Interpretable Feature Engineering | How to Build Intuitive Machine Learning Features
660 views • 8 months ago
Modelling Non-linear Relationships with Regression
734 views • 8 months ago
Explaining Machine Learning to a Non-technical Audience
886 views • 8 months ago
Get more out of Explainable AI (XAI): 10 Tips
1.1K views • 8 months ago
The 6 Benefits of Explainable AI (XAI) | Improve accuracy, decrease harm and tell better stories
1.5K views • 9 months ago
Introduction to Explainable AI (XAI) | Interpretable models, agnostic methods, counterfactuals
6K views • 9 months ago
Data Science vs Science | Differences & Bridging the Gap
312 views • 1 year ago
About the Channel and my Background | ML, XAI and Remote Sensing
1.4K views • 1 year ago
SHAP for Binary and Multiclass Target Variables | Code and Explanations for Classification Problems
13K views • 1 year ago
Introduction to Algorithm Fairness | Causes, Measuring & Preventing Unfairness in Machine Learning
2.3K views • 1 year ago
SHAP Violin and Heatmap Plots | Interpretations and New Insights
6K views • 1 year ago
Correcting Unfairness in Machine Learning | Pre-processing, In-processing, Post-processing
1.2K views • 1 year ago
Definitions of Fairness in Machine Learning | Equal Opportunity, Equalized Odds & Disparate Impact
4K views • 1 year ago
Exploratory Fairness Analysis | Quantifying Unfairness in Data
1.1K views • 1 year ago

Comments

  • @enthusiast2089
    @enthusiast2089 2 days ago

    Hi, will there be any video on XAI for a fine-tuned LLM on this channel?

  • @xboxlox
    @xboxlox 7 days ago

    bro you should be in hollywood

    • @adataodyssey
      @adataodyssey 6 days ago

      Haha will stick to RUclips for now :)

  • @mustafayldz2200
    @mustafayldz2200 7 days ago

    I cannot use SHAP and LDA together; I am experiencing an index error.

    • @mustafayldz2200
      @mustafayldz2200 7 days ago

      could someone help me

    • @adataodyssey
      @adataodyssey 6 days ago

      Hi Mustafa, it's not possible to solve your problem based on the information you provided. What package are you using? Are there any other examples of where SHAP has been implemented for that package?

  • @JosebaGonzálezdeAlaiza
    @JosebaGonzálezdeAlaiza 7 days ago

    The difference would be: explainable or unexplainable model 😅

  • @emirhan2884
    @emirhan2884 10 days ago

    thank you for the amazing video! so what do you think are the main differences for global interpretations of LIME and SHAP methods? also this is the first time I'm seeing that LIME is used as a global "interpreter". why do you think that LIME is mostly used for local points whereas it can be aggregated just like SHAP? thanks!

    • @adataodyssey
      @adataodyssey 9 days ago

      Great questions! SHAP is used to estimate Shapley values. So it inherits properties that are useful for understanding and comparing models. See these videos on the theory behind Shapley values for ML: ruclips.net/video/UJeu29wq7d0/видео.html ruclips.net/video/b9qqbFudVhI/видео.html I think LIME is not used in this way for two reasons: 1) it is slower than SHAP (especially TreeSHAP) to get individual feature weights. This can make it time-consuming to aggregate and analyse many instances. 2) There are no built-in functions to create aggregated plots for LIME like you see in the SHAP package. In general, LIME has fallen out of favour due to the solid theory and speed of SHAP values. So perhaps the creators have decided not to develop the method further.

  • @eliiiii98
    @eliiiii98 12 days ago

    Thank you for the video. Don’t you think it might be better to simply ignore a certain channel and re-run the convolutional network with fewer input channels? This approach could give a more accurate assessment of overall performance compared to changing a channel to complete noise. Respectfully, I find it hard to accept this as a reliable method for evaluating a channel's importance.

    • @adataodyssey
      @adataodyssey 11 days ago

      Hi Eli, yes I agree that there are better approaches if you want to analyse the "potential" of different channels. You could use metrics like correlation to avoid training a bunch of models (each model would take me about 10 to 16 hours to train). However, these would not tell you which channels the trained model is using to do segmentation. "Importance" is about what is important to a model and not to the problem in general. The method you described could not be used to explain a trained model as you would be evaluating different models. Lastly, PCI can be thought of as an extension of permutation feature importance (PFI). This is a commonly accepted method for explaining models built on tabular data. In the same way, correlation or models built on individual features would not provide the same information as PFI scores.

    • @eliiiii98
      @eliiiii98 11 days ago

      @@adataodyssey Oh i see. Thanks!

  • @kandiahchandrakumaran8521
    @kandiahchandrakumaran8521 12 days ago

    Excellent videos. Well done. I have 2 questions. (1) Is SHAP unsupervised learning? and (2) Can it be used for time-to-event (survival) analysis, where there are censored data? Many thanks.

    • @adataodyssey
      @adataodyssey 12 days ago

      Thanks! (1) No, SHAP is not a model, so it is not a supervised or unsupervised learning algorithm. It is a method used to explain a model. (2) I'm not familiar with this use case. But SHAP can be used whenever you have input variables, a function and an output, and you want to explain the contribution of each of the input variables to the output. In the context of predictive machine learning, you can use SHAP to explain the contributions of each model feature to a prediction.

    • @kandiahchandrakumaran8521
      @kandiahchandrakumaran8521 12 days ago

      @@adataodyssey Thank you for your prompt reply and advice. Best wishes.

  • @majamuster2470
    @majamuster2470 16 days ago

    Hey great explanation! I have a question: Say I have time series of how many items I sold over 3 years for different items. The items can be sold in multiple stores across the world. My task is to detect an anomaly on the item level (not on the aggregate level). Do I run this isolation forest on each individual time series and add the store (as a one-hot encoded variable) to the feature matrix? Running it individually for each item seems to lose potential information that can be extracted when looking at global patterns across different items. What would you advise in this case? It seems to be a hierarchical time series anomaly detection problem.

    • @adataodyssey
      @adataodyssey 12 days ago

      Hi, good question. You may want to include additional features that capture this global information. For example, the average value for one of the features in the month or year before the reading. This will allow you to understand if the current feature value is high/low relative to the average. However, in this case, IsolationForest (IF) may not be the best model. On further investigation, IF cannot model interactions. You would either need to explicitly add a feature that captures the interaction, like the ratio of the current reading to the average, or use a model for anomaly detection that does capture interactions, like an autoencoder.

  • @maxpain6666
    @maxpain6666 17 days ago

    thanks

  • @loganstepp2213
    @loganstepp2213 18 days ago

    Do it with the 2020 election

  • @emirhan2884
    @emirhan2884 19 days ago

    thank you for the great explanation! though I would've expected the sum of shapley values to be smaller than 1 for a classification problem. am I missing something?

    • @adataodyssey
      @adataodyssey 18 days ago

      For classification problems, shapley values are interpreted in terms of log odds. See this video for more details: ruclips.net/video/2xlgOu22YgE/видео.html

  • @smartwork7098
    @smartwork7098 19 days ago

    Thank you so much!

  • @TheSonicPegasus
    @TheSonicPegasus 20 days ago

    Incredible! Thank you! Are there any other resources or papers on this topic? And I'm wondering how it relates to superpixels

    • @adataodyssey
      @adataodyssey 20 days ago

      You can read my conference paper on the topic: arxiv.org/abs/2405.11500 Permutation with superpixels is more aimed at explaining important regions/groups of pixels, like how they are used in SHAP or occlusion.

  • @adataodyssey
    @adataodyssey 20 days ago

    🚀 Course 🚀 Free: adataodyssey.com/permutation-channel-importance Paid: adataodyssey.com/courses/xai-for-cv/

  • @marykatestimmler9874
    @marykatestimmler9874 26 days ago

    Thanks!

    • @adataodyssey
      @adataodyssey 26 days ago

      Wow, my first super thanks. Much appreciated Mary

  • @danishkhan1084
    @danishkhan1084 26 days ago

    Very informative video! I’m looking forward to the next video on PCI application in satellite imagery classification.

  • @adataodyssey
    @adataodyssey 27 days ago

    🚀 Course 🚀 Free: adataodyssey.com/permutation-channel-importance Paid: adataodyssey.com/courses/xai-for-cv/

  • @imatlab
    @imatlab 27 days ago

    Hi, I cannot understand negative partial dependence values. What does a negative partial dependence value, such as a change from -0.7 to -0.5 with an increase in input, indicate for a parameter? Does this mean the parameter has a negative relationship with the output, or does it imply that the predictions are generally lower than average? See this for example: i.sstatic.net/Jmbs5s2C.png

  • @FP-mg5qk
    @FP-mg5qk 1 month ago

    Hey! Thank you for the video. Just a note: XGBoost now automatically deals with categorical features like Catboost. You just need to pass enable_categorical=True when creating the XGBClassifier!

    • @adataodyssey
      @adataodyssey 1 month ago

      That's great! I guess this is the end of CatBoost then 😅

  • @josephbolton8092
    @josephbolton8092 1 month ago

    clear as day! Thank you:)

  • @josephbolton8092
    @josephbolton8092 1 month ago

    This was so great

  • @Daniel-RedTsar
    @Daniel-RedTsar 1 month ago

    could you please link this notebook in the comments or the video description?

    • @adataodyssey
      @adataodyssey 1 month ago

      Here you go: github.com/conorosully/SHAP-tutorial/blob/main/src/shap_tutorial.ipynb

  • @sukhsehajkaur1731
    @sukhsehajkaur1731 1 month ago

    Hi. Could you please clarify if the model that is given as input to SHAP is built for each possible combination of n features? Say n=3, will the model experiment using several combinations of 3 features and then calculate the output value for each case?

    • @adataodyssey
      @adataodyssey 1 month ago

      Hi Sukhsehaj, good question. I discuss the application to ML in this video: ruclips.net/video/b9qqbFudVhI/видео.html To summarise, you train one model with all features. To get the contribution of different feature sets, you "marginalise" over the features that are not in the set.

  • @damodarperumalla9552
    @damodarperumalla9552 1 month ago

    Thanks for the detailed video. It is really helpful. Can I get the code base which was used in your demo?

    • @adataodyssey
      @adataodyssey 1 month ago

      Sure! You can find it here: github.com/conorosully/SHAP-tutorial/blob/main/src/additional_resources/IsolationForest.ipynb

  • @vic8982
    @vic8982 1 month ago

    Thanks for being the sole XAI youtuber! as an active learner this is very nice<3

    • @adataodyssey
      @adataodyssey 1 month ago

      No problem Vic! Will be out with some new content soon :)

  • @PabloSanchez-ih2ko
    @PabloSanchez-ih2ko 1 month ago

    Thank you so much for making a video about this topic!

  • @SkowKyubu
    @SkowKyubu 1 month ago

    Thanks a lot for the video. You talked about a robust method to counter multicollinearity but I can't get the name you said. AI ease ?

    • @adataodyssey
      @adataodyssey 1 month ago

      No problem! They are called Accumulated Local Effects (ALE) Plots. I have a video on the topic: ruclips.net/video/5KCA1FMy6U4/видео.html

    • @SkowKyubu
      @SkowKyubu 1 month ago

      Thanks 😁 ​@@adataodyssey

  • @togyjose176
    @togyjose176 1 month ago

    Hello - this is wonderful content ! May I reference this video in a LinkedIn post giving you credit? If so, can I have your LinkedIn handle?

    • @adataodyssey
      @adataodyssey 1 month ago

      Hi Togy, thanks! Yes, that's no problem at all. My linkedin page is: www.linkedin.com/in/conor-o-sullivan-14b267142/

  • @yaelefrat7677
    @yaelefrat7677 1 month ago

    Amazing video! Thank you very much! I do have one question regarding the coalition values of the 3-player game: if the value of the prize (first, second, or third) can only be obtained when there is a coalition of 3, then what is the meaning of a coalition of 2 players: C_12, C_13, C_23?

    • @adataodyssey
      @adataodyssey 1 month ago

      These are the coalition values (i.e. prize money) if only the 2 players were in a team. To get these, I said "we need to go back in time a few more times". This is so that each alternative team could play the game and win prize money. Obviously, this is not possible. In reality, we must estimate these values. So, the actual prize money of $10000 must be split across the 3 players. However, to find a fair way to divide the money we need to know how much players would have won in alternative teams of 2 or less members.

    • @yaelefrat7677
      @yaelefrat7677 27 days ago

      @@adataodyssey Thank you very much for the detailed answer. The marginal coalition values are calculated in advance, right?

    • @adataodyssey
      @adataodyssey 26 days ago

      @@yaelefrat7677 They will need to be determined in advance. For some applications they can be observed or estimated but not calculated. For other applications, like machine learning, they are calculated.

  • @yashnarang3014
    @yashnarang3014 1 month ago

    Great video!!!

  • @adataodyssey
    @adataodyssey 1 month ago

    🚀 SHAP Course 🚀 SHAP course: adataodyssey.com/courses/shap-with-python/ The first 20 people to use the coupon code "CATSHAP24" will get 100% off!

  • @yael123gut
    @yael123gut 2 months ago

    It was so clearly and well explained, thank you!

  • @mayuribhandari2224
    @mayuribhandari2224 2 months ago

    I have subscribed to the newsletter but am not getting access to the XAI course

    • @adataodyssey
      @adataodyssey 2 months ago

      You should receive a coupon code in your email. Let me know if you don't get it!

  • @Jihaoui
    @Jihaoui 2 months ago

    Very good explanation

  • @zahrabounik3390
    @zahrabounik3390 2 months ago

    WOW! Such an amazing explanation on SHAP! I really enjoyed. Thank you.

    • @adataodyssey
      @adataodyssey 2 months ago

      No problem Zahra! I'm glad you found it useful

  • @myselfandpesit
    @myselfandpesit 2 months ago

    Clear explanation. Thanks!

  • @panhtran8384
    @panhtran8384 2 months ago

    Thank you!

  • @n0rthern_lights
    @n0rthern_lights 2 months ago

    Thank you! You clearly have a talent for explaining complex topics. At my university, it feels like the XAI course is covering almost exclusively topics from your channel! I appreciate your work and hope you find the time and enjoy making these videos! Best, Tim

    • @adataodyssey
      @adataodyssey 2 months ago

      Thanks Tim! It's good to hear I am covering relevant topics :)

  • @abc_cba
    @abc_cba 2 months ago

    You always bring topics that nobody has heard of or that aren't popular. That's just why I love your content. Best wishes from India.

    • @adataodyssey
      @adataodyssey 2 months ago

      Thank you! The hope is that someone out there will find it useful :)

    • @abc_cba
      @abc_cba 2 months ago

      @@adataodyssey you're a superhero without a cape.

    • @Daniel-RedTsar
      @Daniel-RedTsar 1 month ago

      @@adataodyssey This is the topic of my Master's so you're really helping lol

    • @adataodyssey
      @adataodyssey 1 month ago

      @@Daniel-RedTsar I'm glad I could help. Good luck with the masters!

  • @QindeelZahra
    @QindeelZahra 2 months ago

    Thanks for the video, really useful. I am facing an issue in understanding how you calculated the value for Player (P2) in your companion article. A little explanation of the weights would help there, as I am struggling to get the value $3750. Thanks

    • @adataodyssey
      @adataodyssey 2 months ago

      Embrace the struggle! You will learn more through trying to solve it :)

  • @escolhifazeremcasa
    @escolhifazeremcasa 2 months ago

    Is there any way to deal with limitation 2: Feature Dependencies ?

    • @adataodyssey
      @adataodyssey 2 months ago

      Often, even if you have highly correlated features, SHAP will still work. It is just important to keep in mind that it may have problems if you do have highly correlated features. In this case, you just need to confirm the results from SHAP using a method that is robust to them, like ALEs or simple data exploration methods.

  • @civilengineeringonlinecour7143
    @civilengineeringonlinecour7143 2 months ago

    Awesome lecture

    • @adataodyssey
      @adataodyssey 2 months ago

      @@civilengineeringonlinecour7143 thanks glad it could help!

    • @adataodyssey
      @adataodyssey 2 months ago

      @@civilengineeringonlinecour7143 I’m glad you found it useful!

  • @adeauy2294
    @adeauy2294 2 months ago

    Nice video! The plots will be different for a Keras model, right? I followed your code but it seems that it won't work for a neural network model though.

    • @adataodyssey
      @adataodyssey 2 months ago

      @@adeauy2294 The plots should be the same if you train a NN on tabular data. However, I’ve had a lot of trouble trying to get the package to work with PyTorch. I’m not sure about Keras but I expect you are having similar problems.

  • @brenoingwersen784
    @brenoingwersen784 2 months ago

    For categorical features @3:35 wouldn't it make sense to just create a full pipeline in which all raw features are preprocessed (scaled, encoded, etc.), run through the model to generate predictions, and afterwards calculate the SHAP values? This way you have the categorical feature contribution in an interpretable way...

    • @adataodyssey
      @adataodyssey 2 months ago

      @@brenoingwersen784 The problem is if you have a categorical feature with many categories (say 10), you will have 10 dummy features after encoding. This means you will have 10 SHAP values for the categorical feature making it difficult to understand the overall effect of that feature. You can solve this by adding the SHAP values for each dummy feature or using catboost.
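As a small illustration of the first option (summing the dummy SHAP values back into one column), with hypothetical column names:

```python
import pandas as pd

def collapse_dummy_shap(shap_values, encoded_columns, prefix="city_"):
    """Sum the SHAP values of all dummy columns that came from one categorical feature.

    shap_values: (n_samples, n_encoded_features) array from the explainer
    encoded_columns: names of the one-hot encoded columns, e.g. city_Cork, city_Dublin
    """
    shap_df = pd.DataFrame(shap_values, columns=encoded_columns)
    dummy_cols = [c for c in encoded_columns if c.startswith(prefix)]
    collapsed = shap_df[dummy_cols].sum(axis=1)    # one value per prediction
    remaining = shap_df.drop(columns=dummy_cols)
    remaining[prefix.rstrip("_")] = collapsed      # single column for the original category
    return remaining
```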

  • @cauchyschwarz3295
    @cauchyschwarz3295 2 months ago

    What I don’t understand is that, in the game video, the value of a coalition is calculated by re-running the game with each coalition. Here it would seem to me that finding the value of a feature coalition would mean re-training the model for each coalition. That doesn’t just seem expensive, it seems downright prohibitive for complex or large models. You started with a fixed regression model where the weights are already determined, but for, e.g., a neural network model, leaving out features could change the weights significantly, right?

    • @adataodyssey
      @adataodyssey 2 months ago

      @@cauchyschwarz3295 This is a bit confusing! But no, you will only have one model (the one you want to explain). You marginalise the prediction function (i.e. the model) over the features that are not in set S. You do not have to retrain a new model by excluding the features that are not in S.

    • @PoisinInAPrettyPill
      @PoisinInAPrettyPill 1 month ago

      I think there's a subtlety here in the application. In the theoretical model the goal is to calculate a fair payout for an individual feature based on how much it contributes to building a good model. Good models are measured by how well they predict things. So, we would want to think about how much the feature contributes to a reduction in model error, and you would want to train a model with and without a given feature to figure this out. (But instead we use other methods for determining which features are good that are easier to compute.) In the use case in this video (and what is the norm in ML), they are using Shapley values to explain how a feature contributed to the model prediction, regardless of how good the model actually is. This is helpful because Shapley values still have desirable properties like additivity in this use case. If you trained a new model without this feature, it wouldn't answer the question of how much the feature contributed to the prediction value in the old model.