Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition
- URL: http://arxiv.org/abs/2511.07974v1
- Date: Wed, 12 Nov 2025 01:31:56 GMT
- Title: Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition
- Authors: Lintong Zhang, Kang Yin, Seong-Whan Lee
- Abstract summary: We propose a fine-grained counterfactual explanation framework that generates both object-level and part-level interpretability. Our approach yields explainable counterfactuals in a non-generative manner by quantifying similarity and weighting component contributions.
- Score: 50.68751788132789
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Attribution-based explanation techniques capture key patterns to enhance visual interpretability; however, these patterns often lack the granularity needed for insight into fine-grained tasks, particularly in cases of model misclassification, where explanations may be insufficiently detailed. To address this limitation, we propose a fine-grained counterfactual explanation framework that generates both object-level and part-level interpretability, addressing two fundamental questions: (1) which fine-grained features contribute to model misclassification, and (2) where dominant local features influence counterfactual adjustments. Our approach yields explainable counterfactuals in a non-generative manner by quantifying similarity and weighting component contributions within regions of interest between correctly classified and misclassified samples. Furthermore, we introduce a saliency partition module grounded in Shapley value contributions, isolating features with region-specific relevance. Extensive experiments demonstrate the superiority of our approach in capturing more granular, intuitively meaningful regions, surpassing existing fine-grained explanation methods.
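The abstract's two core ingredients, Shapley-based region scoring and non-generative counterfactuals built by borrowing regions from a correctly classified sample, can be sketched concretely. The snippet below is a minimal illustration, not the paper's implementation: the model interface, region partition, masking baseline, and greedy swap order are all assumptions, and the exact Shapley computation is feasible only for a handful of regions.

```python
# Minimal sketch of (1) a Shapley-based saliency partition over image regions
# and (2) a non-generative, region-swap counterfactual. All names are
# illustrative; the paper's actual module and similarity weighting are not
# specified in the abstract.
import itertools
import math

import numpy as np


def shapley_region_scores(model, image, regions, target_class, baseline=0.0):
    """Exact Shapley value of each region for the target-class score.

    model:   callable mapping an image array to a class-probability vector.
    regions: list of (y0, y1, x0, x1) boxes partitioning the image.
    Regions outside a coalition are filled with `baseline` (a simple
    masking choice; other reference values are possible).
    """
    n = len(regions)

    def score(coalition):
        masked = np.full_like(image, baseline)
        for i in coalition:
            y0, y1, x0, x1 = regions[i]
            masked[y0:y1, x0:x1] = image[y0:y1, x0:x1]
        return model(masked)[target_class]

    phi = np.zeros(n)
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(rest, k):
                s = set(subset)
                phi[i] += w * (score(s | {i}) - score(s))
    return phi  # higher = stronger region-specific contribution


def region_swap_counterfactual(model, mis_img, ref_img, regions, true_class):
    """Greedily copy the reference's most salient regions into the
    misclassified image until the prediction flips to `true_class`."""
    scores = shapley_region_scores(model, ref_img, regions, true_class)
    edited = mis_img.copy()
    swapped = []
    for i in np.argsort(scores)[::-1]:
        y0, y1, x0, x1 = regions[i]
        edited[y0:y1, x0:x1] = ref_img[y0:y1, x0:x1]
        swapped.append(i)
        if int(np.argmax(model(edited))) == true_class:
            break  # greedy minimal set of regions that flips the label
    return edited, swapped
```

Exact enumeration costs O(2^n) model calls, so this is practical only for a coarse partition (e.g., a handful of part-level regions); at scale, a sampling-based Shapley estimator would be the natural drop-in replacement.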
Related papers
- Superclass-Guided Representation Disentanglement for Spurious Correlation Mitigation [16.791911185682256]
We propose a method that leverages the semantic structure inherent in class labels to reduce reliance on spurious features. Our model employs gradient-based attention guided by a pre-trained vision-language model to disentangle superclass-relevant and irrelevant features. Our approach achieves robustness to more complex spurious correlations without the need to annotate any source samples.
arXiv Detail & Related papers (2025-08-12T02:16:04Z)
- LIRA: Inferring Segmentation in Large Multi-modal Models with Local Interleaved Region Assistance [54.683384204063934]
Large multi-modal models (LMMs) struggle with inaccurate segmentation and hallucinated comprehension. We propose LIRA, a framework that capitalizes on the complementary relationship between visual comprehension and segmentation. LIRA achieves state-of-the-art performance in both segmentation and comprehension tasks.
arXiv Detail & Related papers (2025-07-08T07:46:26Z)
- Explaining the Unexplained: Revealing Hidden Correlations for Better Interpretability [1.8274323268621635]
Real Explainer (RealExp) is an interpretability method that decouples the Shapley Value into individual feature importance and feature correlation importance. RealExp enhances interpretability by precisely quantifying both individual feature contributions and their interactions (a toy sketch of such a decoupling appears after this list).
arXiv Detail & Related papers (2024-12-02T10:50:50Z)
- Revisiting Self-Supervised Heterogeneous Graph Learning from Spectral Clustering Perspective [52.662463893268225]
Self-supervised heterogeneous graph learning (SHGL) has shown promising potential in diverse scenarios, but existing SHGL methods encounter two significant limitations. We introduce a novel framework enhanced by rank and dual consistency constraints.
arXiv Detail & Related papers (2024-12-01T09:33:20Z)
- Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods. We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z)
- Explaining, Evaluating and Enhancing Neural Networks' Learned Representations [2.1485350418225244]
We show how explainability can be an aid, rather than an obstacle, towards better and more efficient representations. We employ such attributions to define two novel scores for evaluating the informativeness and the disentanglement of latent embeddings. We show that adopting our proposed scores as constraints during the training of a representation learning task improves the downstream performance of the model.
arXiv Detail & Related papers (2022-02-18T19:00:01Z)
- Partial Order in Chaos: Consensus on Feature Attributions in the Rashomon Set [50.67431815647126]
Post-hoc global/local feature attribution methods are being progressively employed to understand machine learning models. We show that partial orders of local/global feature importance arise from this methodology, and that every relation among features present in these partial orders also holds in the rankings provided by existing approaches.
arXiv Detail & Related papers (2021-10-26T02:53:14Z)
- Variational Disentanglement for Domain Generalization [68.85458536180437]
We propose to tackle the problem of domain generalization with an effective framework named Variational Disentanglement Network (VDN). VDN is capable of disentangling the domain-specific features and task-specific features, where the task-specific features are expected to generalize better to unseen but related test data.
arXiv Detail & Related papers (2021-09-13T09:55:32Z)
- Explaining a Series of Models by Propagating Local Feature Attributions [9.66840768820136]
Pipelines involving several machine learning models improve performance in many domains but are difficult to understand. We introduce a framework to propagate local feature attributions through complex pipelines of models based on a connection to the Shapley value. Our framework enables us to draw higher-level conclusions based on groups of gene expression features for Alzheimer's and breast cancer histologic grade prediction.
arXiv Detail & Related papers (2021-04-30T22:20:58Z)
- Refining Neural Networks with Compositional Explanations [31.84868477264624]
We propose to refine a learned model by collecting human-provided compositional explanations on the model's failure cases. We demonstrate the effectiveness of the proposed approach on two text classification tasks.
arXiv Detail & Related papers (2021-03-18T17:48:54Z)
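Following up on the RealExp entry above: the snippet below splits an exact Shapley value into a standalone ("individual") term and an interaction remainder. This split is only an assumed, illustrative decomposition in the spirit of that summary; RealExp's actual definitions are not given here.

```python
# Toy decoupling of a Shapley value into an individual-importance term and an
# interaction/correlation remainder. RealExp's actual decomposition is not
# specified in the snippet above, so this split is an illustrative assumption.
import itertools
import math


def exact_shapley(value_fn, n):
    """Exact Shapley values for a value function defined on frozensets of indices."""
    phi = [0.0] * n
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for k in range(n):
            w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for s in itertools.combinations(rest, k):
                s = frozenset(s)
                phi[i] += w * (value_fn(s | {i}) - value_fn(s))
    return phi


def decouple_shapley(value_fn, n):
    base = value_fn(frozenset())
    phi = exact_shapley(value_fn, n)
    individual = [value_fn(frozenset([i])) - base for i in range(n)]  # feature alone
    correlation = [p - ind for p, ind in zip(phi, individual)]        # interaction remainder
    return individual, correlation


# Example: v(S) = x0 + x1 + 2*x0*x1, where the product term is pure interaction.
x = [1.0, 1.0]

def v(S):
    return sum(x[i] for i in S) + (2.0 * x[0] * x[1] if S >= {0, 1} else 0.0)

print(decouple_shapley(v, 2))  # individual = [1.0, 1.0], correlation = [1.0, 1.0]
```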
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.