ConLUX: Concept-Based Local Unified Explanations
- URL: http://arxiv.org/abs/2410.12439v1
- Date: Wed, 16 Oct 2024 10:34:11 GMT
- Title: ConLUX: Concept-Based Local Unified Explanations
- Authors: Junhao Liu, Haonan Yu, Xin Zhang
- Abstract summary: ConLUX is a general framework that provides concept-based local explanations for any machine learning model.
We have instantiated ConLUX on four different types of explanation techniques: LIME, Kernel SHAP, Anchor, and LORE.
Our evaluation results demonstrate that ConLUX offers more faithful explanations and makes them more understandable to users.
- Score: 14.110188927768736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid advancement of machine learning models, there is a significant demand for model-agnostic explanation techniques that can explain these models across different architectures. Mainstream model-agnostic explanation techniques generate local explanations based on basic features (e.g., words for text models and (super-)pixels for image models). However, these explanations often do not align with the decision-making processes of the target models and end-users, resulting in explanations that are unfaithful and difficult for users to understand. On the other hand, concept-based techniques provide explanations based on high-level features (e.g., topics for text models and objects for image models), but most are model-specific or require additional pre-defined external concept knowledge. To address this limitation, we propose ConLUX, a general framework that provides concept-based local explanations for any machine learning model. Our key insight is that we can automatically extract high-level concepts from large pre-trained models and uniformly extend existing local model-agnostic techniques to provide unified concept-based explanations. We have instantiated ConLUX on four different types of explanation techniques: LIME, Kernel SHAP, Anchor, and LORE, and applied these techniques to text and image models. Our evaluation results demonstrate that 1) compared to the vanilla versions, ConLUX offers more faithful explanations and makes them more understandable to users, and 2) by offering multiple forms of explanations, ConLUX outperforms state-of-the-art concept-based explanation techniques specifically designed for text and image models, respectively.
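To make the key insight concrete, the sketch below illustrates one way a concept-level, LIME-style surrogate could be built: binary concept-presence vectors are sampled, each perturbation is rendered by removing the words covered by absent concepts, and a weighted linear model over concept indicators is fit to the black-box predictions. This is a minimal illustration under assumptions, not the authors' implementation; the toy `predict_proba` model, the hand-written `concepts` mapping, and the `render` helper are hypothetical stand-ins (ConLUX extracts concepts automatically from large pre-trained models).

```python
# Minimal sketch of a concept-level LIME-style explanation (illustrative only,
# NOT the ConLUX implementation). Assumptions: concepts have already been
# extracted and mapped to word positions; the model being explained is a
# black-box text classifier exposed via `predict_proba`.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Toy input and three hand-written "concepts", each mapped to the word indices
# it covers (in practice these would come from a large pre-trained model).
words = "the food was great but the service was painfully slow".split()
concepts = {
    "food quality": [1, 3],     # "food", "great"
    "service quality": [6, 9],  # "service", "slow"
    "contrast": [4],            # "but"
}
concept_names = list(concepts)

def predict_proba(texts):
    """Stand-in for the black-box model: positive score from naive word cues."""
    scores = []
    for t in texts:
        s = 0.5 + 0.4 * ("great" in t) - 0.4 * ("slow" in t)
        scores.append(min(max(s, 0.0), 1.0))
    return np.array(scores)

def render(mask):
    """Rebuild the input with the concepts marked absent in `mask` removed."""
    drop = set()
    for on, name in zip(mask, concept_names):
        if not on:
            drop.update(concepts[name])
    return " ".join(w for i, w in enumerate(words) if i not in drop)

# Sample binary concept-presence vectors and query the model on each perturbation.
n_samples = 200
Z = rng.integers(0, 2, size=(n_samples, len(concept_names)))
Z[0] = 1  # keep the unperturbed input in the sample
y = predict_proba([render(z) for z in Z])

# Weight samples by closeness to the original input, then fit a linear surrogate
# over concept indicators; its coefficients are the concept-level explanation.
weights = np.exp(-(len(concept_names) - Z.sum(axis=1)) / 2.0)
surrogate = Ridge(alpha=1.0).fit(Z, y, sample_weight=weights)

for name, coef in sorted(zip(concept_names, surrogate.coef_), key=lambda x: -abs(x[1])):
    print(f"{name:>16}: {coef:+.3f}")
```

In the same spirit, fitting a rule learner instead of the linear surrogate over these concept indicators would yield Anchor- or LORE-style explanation forms, which is the sense in which the abstract describes uniformly extending existing local model-agnostic techniques.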
Related papers
- EPIC: Explanation of Pretrained Image Classification Networks via Prototype [6.619564359642066]
EPIC (Explanation of Pretrained Image Classification) is a novel approach that bridges the gap between ante-hoc and post-hoc explanations. It delivers intuitive, prototype-based explanations inspired by ante-hoc techniques. We evaluate EPIC on benchmark datasets commonly used in prototype-based explanations.
arXiv Detail & Related papers (2025-05-19T09:32:20Z) - Language Model as Visual Explainer [72.88137795439407]
We present a systematic approach for interpreting vision models using a tree-structured linguistic explanation.
Our method provides human-understandable explanations in the form of attribute-laden trees.
To assess the effectiveness of our approach, we introduce new benchmarks and conduct rigorous evaluations.
arXiv Detail & Related papers (2024-12-08T20:46:23Z) - Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations [7.736445799116692]
Concept-based methods have become a popular choice for explaining deep neural networks post-hoc. We devise a reinforcement learning-based preference optimization algorithm that fine-tunes a vision-language generative model. We demonstrate our method's ability to efficiently and reliably articulate diverse concepts.
arXiv Detail & Related papers (2024-08-24T02:26:42Z) - Diffexplainer: Towards Cross-modal Global Explanations with Diffusion Models [51.21351775178525]
DiffExplainer is a novel framework that, leveraging language-vision models, enables multimodal global explainability.
It employs diffusion models conditioned on optimized text prompts, synthesizing images that maximize class outputs.
The analysis of generated visual descriptions allows for automatic identification of biases and spurious features.
arXiv Detail & Related papers (2024-04-03T10:11:22Z) - An Axiomatic Approach to Model-Agnostic Concept Explanations [67.84000759813435]
We propose an approach to concept explanations that satisfy three natural axioms: linearity, recursivity, and similarity.
We then establish connections with previous concept explanation methods, offering insight into their varying semantic meanings.
arXiv Detail & Related papers (2024-01-12T20:53:35Z) - RTQ: Rethinking Video-language Understanding Based on Image-text Model [55.278942477715084]
Video-language understanding presents unique challenges due to the inclusion of highly complex semantic details.
We propose a novel framework called RTQ, which addresses these challenges simultaneously.
Our model demonstrates outstanding performance even in the absence of video-language pre-training.
arXiv Detail & Related papers (2023-12-01T04:51:01Z) - Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z) - Interpreting Vision and Language Generative Models with Semantic Visual
Priors [3.3772986620114374]
We develop a framework based on SHAP that generates meaningful explanations by leveraging the meaning representation of the output sequence as a whole.
We demonstrate that our method generates semantically more expressive explanations than traditional methods at a lower compute cost.
arXiv Detail & Related papers (2023-04-28T17:10:08Z) - Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models.
We demonstrate the benefits of our approach across a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z) - OCTET: Object-aware Counterfactual Explanations [29.532969342297086]
We propose an object-centric framework for counterfactual explanation generation.
Our method, inspired by recent generative modeling works, encodes the query image into a latent space that is structured to ease object-level manipulations.
We conduct a set of experiments on counterfactual explanation benchmarks for driving scenes, and we show that our method can be adapted beyond classification.
arXiv Detail & Related papers (2022-11-22T16:23:12Z) - ELUDE: Generating interpretable explanations via a decomposition into
labelled and unlabelled features [23.384134043048807]
We develop an explanation framework that decomposes a model's prediction into a part attributable to labelled, semantically meaningful features and a remainder attributed to unlabelled features.
By identifying the latter, we are able to analyze the "unexplained" portion of the model.
We show that the set of unlabelled features can generalize to multiple models trained with the same feature space.
arXiv Detail & Related papers (2022-06-15T17:36:55Z) - STEEX: Steering Counterfactual Explanations with Semantics [28.771471624014065]
Deep learning models are increasingly used in safety-critical applications.
For simple images, such as low-resolution face portraits, visual counterfactual explanations have recently been proposed.
We propose a new generative counterfactual explanation framework that produces plausible and sparse modifications.
arXiv Detail & Related papers (2021-11-17T13:20:29Z) - Explanation as a process: user-centric construction of multi-level and
multi-modal explanations [0.34410212782758043]
We present a process-based approach that combines multi-level and multi-modal explanations.
We use Inductive Logic Programming, an interpretable machine learning approach, to learn a comprehensible model.
arXiv Detail & Related papers (2021-10-07T19:26:21Z) - A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques against human annotations of salient input regions, relating a model's performance to how well its rationales agree with human ones; a minimal sketch of such an agreement computation follows this entry.
arXiv Detail & Related papers (2020-09-25T12:01:53Z)
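As a purely illustrative companion to the diagnostic setup above, the sketch below computes two simple agreement measures between per-token saliency scores and a binary human rationale: average precision over the saliency ranking, and token-level F1 after keeping the top-k tokens. The `tokens`, `saliency`, and `human_rationale` values are hypothetical stand-ins, not data or code from the paper.

```python
# Minimal sketch (not the paper's code) of measuring agreement between an
# explainability technique's saliency scores and human rationale annotations.

import numpy as np
from sklearn.metrics import average_precision_score

tokens = ["the", "plot", "was", "utterly", "captivating", "despite", "weak", "acting"]

# Per-token saliency scores produced by a hypothetical explainability technique.
saliency = np.array([0.02, 0.35, 0.05, 0.55, 0.90, 0.10, 0.40, 0.30])

# Binary human annotation of salient tokens for the same input.
human_rationale = np.array([0, 0, 0, 1, 1, 0, 1, 1])

# Threshold-free agreement: how well the saliency ranking recovers human tokens.
ap = average_precision_score(human_rationale, saliency)

# Threshold-based agreement: token-level F1 after keeping the top-k tokens,
# where k matches the number of human-annotated tokens.
k = int(human_rationale.sum())
top_k = np.zeros_like(human_rationale)
top_k[np.argsort(-saliency)[:k]] = 1
tp = int((top_k & human_rationale).sum())
precision = tp / top_k.sum()
recall = tp / human_rationale.sum()
f1 = 2 * precision * recall / (precision + recall) if tp else 0.0

print(f"Average precision vs. human rationale: {ap:.3f}")
print(f"Top-{k} token F1 vs. human rationale: {f1:.3f}")
```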