The XAISuite framework and the implications of explanatory system dissonance
- URL: http://arxiv.org/abs/2304.08499v1
- Date: Sat, 15 Apr 2023 04:40:03 GMT
- Title: The XAISuite framework and the implications of explanatory system dissonance
- Authors: Shreyan Mitra and Leilani Gilpin
- Abstract summary: This paper compares two explanatory systems, SHAP and LIME, based on the correlation of their respective importance scores.
The magnitude of an importance score is not significant for explanation consistency.
The similarity between SHAP and LIME importance scores cannot predict model accuracy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Explanatory systems make machine learning models more transparent. However,
they are often inconsistent. In order to quantify and isolate possible
scenarios leading to this discrepancy, this paper compares two explanatory
systems, SHAP and LIME, based on the correlation of their respective importance
scores using 14 machine learning models (7 regression and 7 classification) and
4 tabular datasets (2 regression and 2 classification). We make two novel
findings. Firstly, the magnitude of importance is not significant in
explanation consistency. The correlations between SHAP and LIME importance
scores for the most important features may or may not be more variable than the
correlation between SHAP and LIME importance scores averaged across all
features. Secondly, the similarity between SHAP and LIME importance scores
cannot predict model accuracy. In the process of our research, we construct an
open-source library, XAISuite, that unifies the process of training and
explaining models. Finally, this paper contributes a generalized framework to
better explain machine learning models and optimize their performance.
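For intuition, here is a minimal sketch of the comparison described above: compute SHAP and LIME importance scores for one instance of one model and correlate them. It calls the public shap and lime packages directly rather than the XAISuite library, and the dataset, model, and sample sizes are placeholder assumptions rather than the paper's setup.

    import numpy as np
    import shap
    from lime.lime_tabular import LimeTabularExplainer
    from scipy.stats import pearsonr
    from sklearn.datasets import load_diabetes
    from sklearn.ensemble import RandomForestRegressor

    X, y = load_diabetes(return_X_y=True)
    model = RandomForestRegressor(random_state=0).fit(X, y)

    # SHAP importance scores for the first instance.
    shap_explainer = shap.KernelExplainer(model.predict, X[:50])
    shap_scores = shap_explainer.shap_values(X[0])      # shape: (n_features,)

    # LIME importance scores for the same instance.
    lime_explainer = LimeTabularExplainer(X, mode="regression")
    lime_exp = lime_explainer.explain_instance(X[0], model.predict,
                                               num_features=X.shape[1])
    lime_scores = np.zeros(X.shape[1])
    for feature_idx, weight in lime_exp.as_map()[1]:    # 1 is lime's regression label
        lime_scores[feature_idx] = weight

    # Explanation consistency as the correlation of the two score vectors.
    r, _ = pearsonr(shap_scores, lime_scores)
    print(f"SHAP-LIME importance correlation: {r:.3f}")

Averaging such correlations over many instances, models, and datasets is, roughly, the kind of quantity the paper studies.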
Related papers
- Rashomon effect in Educational Research: Why More is Better Than One for Measuring the Importance of the Variables? [0.0]
The study uses the Rashomon set of simple-yet-accurate models trained using decision tree, random forest, LightGBM, and XGBoost algorithms.
We found that the Rashomon set improves the predictive accuracy by 2-6%.
Key demographic variables imd_band and highest_education were identified as vital, but their importance varied across courses.
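As a minimal sketch of the Rashomon-set idea (assuming a generic scikit-learn setup, not the study's data or model zoo), one can retain every candidate whose test accuracy falls within a small tolerance of the best, then compare variable importances across the retained models:

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    candidates = [DecisionTreeClassifier(max_depth=d, random_state=0).fit(X_tr, y_tr)
                  for d in (2, 3, 4, 5)]
    candidates += [RandomForestClassifier(n_estimators=n, random_state=0).fit(X_tr, y_tr)
                   for n in (50, 100)]

    scores = [m.score(X_te, y_te) for m in candidates]
    epsilon = 0.02   # tolerance around the best accuracy
    rashomon_set = [m for m, s in zip(candidates, scores) if s >= max(scores) - epsilon]

    # Equally accurate models can rank the same variable very differently.
    for m in rashomon_set:
        print(type(m).__name__, m.feature_importances_[:3].round(3))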
arXiv Detail & Related papers (2024-12-02T14:05:36Z)
- Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time, with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
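In commonly used linear-MDP notation (an illustration of the distinction, not the paper's exact definitions, with phi a feature map and theta linear weights), the two settings can be written as:

    \[
    \text{Model I:}\ \ Q_c(s,a) \approx \phi_c(s,a)^{\top}\,\theta
    \qquad\qquad
    \text{Model II:}\ \ Q_c(s,a) \approx \phi(s,a)^{\top}\,\theta_c
    \]

The subscript c marks the component that varies with the context: the representation in Model I, the linear weights in Model II.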
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- How robust are pre-trained models to distribution shift? [82.08946007821184]
We show how spurious correlations affect the performance of popular self-supervised learning (SSL) and autoencoder-based (AE) models.
We develop a novel evaluation scheme with the linear head trained on out-of-distribution (OOD) data, to isolate the performance of the pre-trained models from a potential bias of the linear head used for evaluation.
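A hedged sketch of a linear-probe evaluation in this spirit, with the encoder and datasets as placeholders rather than the paper's actual scheme:

    from sklearn.linear_model import LogisticRegression

    def linear_probe_accuracy(encode, X_head, y_head, X_test, y_test):
        """Fit a linear head on frozen features, report accuracy on a test set."""
        head = LogisticRegression(max_iter=1000)
        head.fit(encode(X_head), y_head)
        return head.score(encode(X_test), y_test)

    # Fitting the head on out-of-distribution (OOD) data isolates the quality of
    # the pre-trained encoder from any bias the head picks up in-distribution:
    #   acc_ood = linear_probe_accuracy(encoder, X_ood, y_ood, X_test, y_test)
    #   acc_id  = linear_probe_accuracy(encoder, X_id,  y_id,  X_test, y_test)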
arXiv Detail & Related papers (2022-06-17T16:18:28Z)
- Using Explainable Boosting Machine to Compare Idiographic and Nomothetic Approaches for Ecological Momentary Assessment Data [2.0824228840987447]
This paper explores the use of non-linear interpretable machine learning (ML) models in classification problems.
Various ensembles of trees are compared to linear models using imbalanced synthetic and real-world datasets.
In one of the two real-world datasets, the knowledge distillation method achieves improved AUC scores.
arXiv Detail & Related papers (2022-04-04T17:56:37Z)
- Deep Learning Models for Knowledge Tracing: Review and Empirical Evaluation [2.423547527175807]
We review and evaluate a body of deep learning knowledge tracing (DLKT) models with openly available and widely-used data sets.
The evaluated DLKT models were reimplemented to assess the replicability of previously reported results.
arXiv Detail & Related papers (2021-12-30T14:19:27Z)
- Sparse MoEs meet Efficient Ensembles [49.313497379189315]
We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixtures of experts (sparse MoEs).
We present Efficient Ensemble of Experts (E³), a scalable and simple ensemble of sparse MoEs that takes the best of both classes of models while using up to 45% fewer FLOPs than a deep ensemble.
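For intuition, a toy top-k sparse-routing forward pass in NumPy is sketched below; it shows only the sparse MoE half of the combination (E³'s ensembling is omitted), and all shapes and names are illustrative:

    import numpy as np

    def sparse_moe_forward(x, gate_w, expert_ws, k=2):
        """Route input x to the top-k experts selected by a softmax gate."""
        logits = x @ gate_w                       # (n_experts,) gating scores
        top_k = np.argsort(logits)[-k:]           # indices of the k best experts
        weights = np.exp(logits[top_k])
        weights /= weights.sum()                  # renormalized softmax over top-k
        # Only the selected experts run, which is where the FLOP savings come from.
        return sum(w * np.tanh(x @ expert_ws[i]) for w, i in zip(weights, top_k))

    rng = np.random.default_rng(0)
    x = rng.normal(size=8)
    gate_w = rng.normal(size=(8, 4))              # 4 experts
    expert_ws = rng.normal(size=(4, 8, 16))       # each expert maps 8 -> 16
    print(sparse_moe_forward(x, gate_w, expert_ws).shape)   # (16,)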
arXiv Detail & Related papers (2021-10-07T11:58:35Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
- Ensemble Learning-Based Approach for Improving Generalization Capability of Machine Reading Comprehension Systems [0.7614628596146599]
Machine Reading Comprehension (MRC) is an active field in natural language processing, with many successful models developed in recent years.
Despite their high in-distribution accuracy, these models suffer from two issues: high training cost and low out-of-distribution accuracy.
In this paper, we investigate the effect of an ensemble learning approach on improving the generalization of MRC systems without retraining a big model.
arXiv Detail & Related papers (2021-07-01T11:11:17Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- Towards a More Reliable Interpretation of Machine Learning Outputs for Safety-Critical Systems using Feature Importance Fusion [0.0]
We introduce a novel fusion metric and compare it to the state-of-the-art.
Our approach is tested on synthetic data, where the ground truth is known.
Results show that our feature importance ensemble framework produces 15% less feature importance error overall compared to existing methods.
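As an illustration of importance fusion in this spirit (using a simple normalized mean, since the paper's actual fusion metric is not reproduced here):

    import numpy as np

    def fuse_importances(score_vectors):
        """Normalize each method's scores to sum to 1, then average feature-wise."""
        stacked = np.array([np.abs(v) / np.abs(v).sum() for v in score_vectors])
        return stacked.mean(axis=0)

    # e.g. scores from permutation importance, SHAP, and gain-based importance:
    fused = fuse_importances([np.array([0.5, 0.3, 0.2]),
                              np.array([0.6, 0.1, 0.3]),
                              np.array([0.4, 0.4, 0.2])])
    print(fused)   # fused importance per feature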
arXiv Detail & Related papers (2020-09-11T15:51:52Z)