TbExplain: A Text-based Explanation Method for Scene Classification Models with the Statistical Prediction Correction
- URL: http://arxiv.org/abs/2307.10003v2
- Date: Mon, 8 Jul 2024 09:40:03 GMT
- Title: TbExplain: A Text-based Explanation Method for Scene Classification Models with the Statistical Prediction Correction
- Authors: Amirhossein Aminimehr, Pouya Khani, Amirali Molaei, Amirmohammad Kazemeini, Erik Cambria
- Abstract summary: We propose a framework called TbExplain that employs XAI techniques and a pre-trained object detector to present text-based explanations of scene classification models.
TbExplain incorporates a novel method to correct predictions and textually explain them based on the statistics of objects in the input image when the initial prediction is unreliable.
- Score: 23.78984414404192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The field of Explainable Artificial Intelligence (XAI) aims to improve the interpretability of black-box machine learning models. Building a heatmap based on the importance value of input features is a popular method for explaining the underlying functions of such models in producing their predictions. Heatmaps are almost understandable to humans, yet they are not without flaws. Non-expert users, for example, may not fully understand the logic of heatmaps (the logic in which relevant pixels to the model's prediction are highlighted with different intensities or colors). Additionally, objects and regions of the input image that are relevant to the model prediction are frequently not entirely differentiated by heatmaps. In this paper, we propose a framework called TbExplain that employs XAI techniques and a pre-trained object detector to present text-based explanations of scene classification models. Moreover, TbExplain incorporates a novel method to correct predictions and textually explain them based on the statistics of objects in the input image when the initial prediction is unreliable. To assess the trustworthiness and validity of the text-based explanations, we conducted a qualitative experiment, and the findings indicated that these explanations are sufficiently reliable. Furthermore, our quantitative and qualitative experiments on TbExplain with scene classification datasets reveal an improvement in classification accuracy over ResNet variants.
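The pipeline outlined in the abstract (a heatmap from an XAI technique, objects from a pre-trained detector, a text explanation, and a statistics-based correction when the prediction is unreliable) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names `object_relevance` and `explain`, the mean-relevance scoring, the 0.5 confidence threshold, and the `class_object_freq` table of per-class object frequencies are all assumptions introduced here, since the abstract does not specify these details.

```python
from collections import Counter


def object_relevance(heatmap, detections):
    """Sum, per object label, the mean heatmap value inside each detected box.

    heatmap: 2D list of relevance values (rows of columns).
    detections: list of (label, (x0, y0, x1, y1)) with exclusive upper bounds.
    """
    scores = Counter()
    for label, (x0, y0, x1, y1) in detections:
        cells = [heatmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
        if cells:
            scores[label] += sum(cells) / len(cells)
    return scores


def explain(prediction, confidence, heatmap, detections,
            class_object_freq, threshold=0.5, top_k=2):
    """Return (final_class, explanation_text).

    Hypothetical rule: if confidence is below the threshold, replace the
    prediction with the class whose object statistics best match the
    detected objects; otherwise name the most relevant objects.
    """
    labels = [label for label, _ in detections]
    if confidence < threshold:
        # Assumed correction step: score each class by the summed frequency
        # of the detected objects in that class.
        def support(cls):
            return sum(class_object_freq[cls].get(l, 0.0) for l in labels)

        corrected = max(class_object_freq, key=support)
        objects = ", ".join(sorted(set(labels)))
        return corrected, (
            f"Prediction corrected to '{corrected}' because the detected "
            f"objects ({objects}) are statistically typical of this scene."
        )
    top = [label for label, _ in object_relevance(heatmap, detections).most_common(top_k)]
    return prediction, f"Classified as '{prediction}' mainly because of: {', '.join(top)}."


if __name__ == "__main__":
    # Toy 4x4 heatmap: strong relevance top-left, weak relevance bottom-right.
    heatmap = [
        [0.9, 0.9, 0.0, 0.0],
        [0.9, 0.9, 0.0, 0.0],
        [0.0, 0.0, 0.3, 0.3],
        [0.0, 0.0, 0.3, 0.3],
    ]
    detections = [("bed", (0, 0, 2, 2)), ("lamp", (2, 2, 4, 4))]
    freq = {
        "bedroom": {"bed": 0.8, "lamp": 0.5},
        "kitchen": {"oven": 0.9, "lamp": 0.2},
    }
    print(explain("bedroom", 0.9, heatmap, detections, freq))
    print(explain("kitchen", 0.3, heatmap, detections, freq))
```

In this toy setup a confident prediction is explained by the objects that overlap the most relevant heatmap regions, while a low-confidence prediction is replaced using only the object statistics. A real system would take the heatmap from an attribution method and the boxes from an actual detector.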
Related papers
- Interpretable Machine Learning for Weather and Climate Prediction: A Survey [24.028385794099435]
We review current interpretable machine learning approaches applied to meteorological predictions.
Design inherently interpretable models from scratch using architectures like tree ensembles and explainable neural networks.
We discuss research challenges around achieving deeper mechanistic interpretations aligned with physical principles.
arXiv Detail & Related papers (2024-03-24T14:23:35Z) - Faithful and Robust Local Interpretability for Textual Predictions [6.492879435794228]
We propose FRED (Faithful and Robust Explainer for textual Documents), a novel method for interpreting predictions over text.
FRED offers three key insights to explain a model prediction: (1) it identifies the minimal set of words in a document whose removal has the strongest influence on the prediction, (2) it assigns an importance score to each token, reflecting its influence on the model's output, and (3) it provides counterfactual explanations.
arXiv Detail & Related papers (2023-10-30T20:27:36Z) - Rationalizing Predictions by Adversarial Information Calibration [65.19407304154177]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial technique to calibrate the information extracted by the two models such that the difference between them is an indicator of the missed or over-selected features.
arXiv Detail & Related papers (2023-01-15T03:13:09Z) - Explanation Method for Anomaly Detection on Mixed Numerical and Categorical Spaces [0.9543943371833464]
We present EADMNC (Explainable Anomaly Detection on Mixed Numerical and Categorical spaces).
It adds explainability to the predictions obtained with the original model.
We report experimental results on extensive real-world data, particularly in the domain of network intrusion detection.
arXiv Detail & Related papers (2022-09-09T08:20:13Z) - Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models with few examples show strong prediction bias across labels.
Although few-shot fine-tuning can mitigate the prediction bias, our analysis shows models gain performance improvement by capturing non-task-related features.
These observations alert that pursuing model performance with fewer examples may incur pathological prediction behavior.
arXiv Detail & Related papers (2022-04-17T15:55:18Z) - Hessian-based toolbox for reliable and interpretable machine learning in physics [58.720142291102135]
We present a toolbox for interpretability and reliability, agnostic of the model architecture.
It provides a notion of the influence of the input data on the prediction at a given test point, an estimation of the uncertainty of the model predictions, and an extrapolation score for the model predictions.
Our work opens the road to the systematic use of interpretability and reliability methods in ML applied to physics and, more generally, science.
arXiv Detail & Related papers (2021-08-04T16:32:59Z) - Robust Semantic Interpretability: Revisiting Concept Activation Vectors [0.0]
Interpretability methods for image classification attempt to expose whether the model is systematically biased or attending to the same cues as a human would.
Our proposed Robust Concept Activation Vectors (RCAV) quantifies the effects of semantic concepts on individual model predictions and on model behavior as a whole.
arXiv Detail & Related papers (2021-04-06T20:14:59Z) - Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z) - Do Input Gradients Highlight Discriminative Features? [42.47346844105727]
Interpretability methods seek to explain instance-specific model predictions.
We introduce an evaluation framework to study this hypothesis for benchmark image classification tasks.
We make two surprising observations on CIFAR-10 and Imagenet-10 datasets.
arXiv Detail & Related papers (2021-02-25T11:04:38Z) - Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z) - Deducing neighborhoods of classes from a fitted model [68.8204255655161]
In this article a new kind of interpretable machine learning method is presented.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Basically, real data points (or specific points of interest) are used and the changes of the prediction after slightly raising or decreasing specific features are observed.
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.