Enhancing Interpretability for Vision Models via Shapley Value Optimization
- URL: http://arxiv.org/abs/2512.14354v1
- Date: Tue, 16 Dec 2025 12:33:04 GMT
- Title: Enhancing Interpretability for Vision Models via Shapley Value Optimization
- Authors: Kanglong Fan, Yunqiao Yang, Chen Ma,
- Abstract summary: Self-explaining neural networks sacrifice performance and compatibility due to their specialized architectural designs.<n>We propose a novel self-explaining framework that integrates Shapley value estimation as an auxiliary task during training.<n>Our method achieves state-of-the-art interpretability.
- Score: 10.809438356590988
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep neural networks have demonstrated remarkable performance across various domains, yet their decision-making processes remain opaque. Although many explanation methods are dedicated to bringing the obscurity of DNNs to light, they exhibit significant limitations: post-hoc explanation methods often struggle to faithfully reflect model behaviors, while self-explaining neural networks sacrifice performance and compatibility due to their specialized architectural designs. To address these challenges, we propose a novel self-explaining framework that integrates Shapley value estimation as an auxiliary task during training, which achieves two key advancements: 1) a fair allocation of the model prediction scores to image patches, ensuring explanations inherently align with the model's decision logic, and 2) enhanced interpretability with minor structural modifications, preserving model performance and compatibility. Extensive experiments on multiple benchmarks demonstrate that our method achieves state-of-the-art interpretability.
Related papers
- Clarity: The Flexibility-Interpretability Trade-Off in Sparsity-aware Concept Bottleneck Models [12.322360020814516]
Vision-Language Models (VLMs) are often treated as black-boxes, with limited or non-existent investigation of their decision making process.<n>We introduce the notion of clarity, a measure, capturing the interplay between the downstream performance and the sparsity and precision of the concept representation.<n>Our experiments reveal a critical trade-off between flexibility and interpretability, under which a given method can exhibit markedly different behaviors even at comparable performance levels.
arXiv Detail & Related papers (2026-01-29T16:28:55Z) - MoCa: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings [75.0617088717528]
MoCa is a framework for transforming pre-trained VLM backbones into effective bidirectional embedding models.<n>MoCa consistently improves performance across MMEB and ViDoRe-v2 benchmarks, achieving new state-of-the-art results.
arXiv Detail & Related papers (2025-06-29T06:41:00Z) - Interpretable Few-Shot Image Classification via Prototypical Concept-Guided Mixture of LoRA Experts [79.18608192761512]
Self-Explainable Models (SEMs) rely on Prototypical Concept Learning (PCL) to enable their visual recognition processes more interpretable.<n>We propose a Few-Shot Prototypical Concept Classification framework that mitigates two key challenges under low-data regimes: parametric imbalance and representation misalignment.<n>Our approach consistently outperforms existing SEMs by a notable margin, with 4.2%-8.7% relative gains in 5-way 5-shot classification.
arXiv Detail & Related papers (2025-06-05T06:39:43Z) - Enhancing Performance of Explainable AI Models with Constrained Concept Refinement [10.241134756773228]
Trade-off between accuracy and interpretability has long been a challenge in machine learning (ML)<n>In this paper, we investigate the impact of deviations in concept representations and propose a novel framework to mitigate these effects.<n>Compared to existing explainable methods, our approach not only improves prediction accuracy while preserving model interpretability across various large-scale benchmarks but also achieves this with significantly lower computational cost.
arXiv Detail & Related papers (2025-02-10T18:53:15Z) - Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increase the model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes the model learning by paying closer attention to those training samples with a high difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z) - LLM-based Hierarchical Concept Decomposition for Interpretable Fine-Grained Image Classification [5.8754760054410955]
We introduce textttHi-CoDecomposition, a novel framework designed to enhance model interpretability through structured concept analysis.
Our approach not only aligns with the performance of state-of-the-art models but also advances transparency by providing clear insights into the decision-making process.
arXiv Detail & Related papers (2024-05-29T00:36:56Z) - Model-Agnostic Interpretation Framework in Machine Learning: A
Comparative Study in NBA Sports [0.2937071029942259]
We propose an innovative framework to reconcile the trade-off between model performance and interpretability.
Our approach is centered around modular operations on high-dimensional data, which enable end-to-end processing while preserving interpretability.
We have extensively tested our framework and validated its superior efficacy in achieving a balance between computational efficiency and interpretability.
arXiv Detail & Related papers (2024-01-05T04:25:21Z) - Consistent Explanations in the Face of Model Indeterminacy via
Ensembling [12.661530681518899]
This work addresses the challenge of providing consistent explanations for predictive models in the presence of model indeterminacy.
We introduce ensemble methods to enhance the consistency of the explanations provided in these scenarios.
Our findings highlight the importance of considering model indeterminacy when interpreting explanations.
arXiv Detail & Related papers (2023-06-09T18:45:43Z) - Understanding and Constructing Latent Modality Structures in Multi-modal
Representation Learning [53.68371566336254]
We argue that the key to better performance lies in meaningful latent modality structures instead of perfect modality alignment.
Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization.
arXiv Detail & Related papers (2023-03-10T14:38:49Z) - Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both tractable variational learning algorithm and effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z) - Interpretable Detail-Fidelity Attention Network for Single Image
Super-Resolution [89.1947690981471]
We propose a purposeful and interpretable detail-fidelity attention network to progressively process smoothes and details in divide-and-conquer manner.
Particularly, we propose a Hessian filtering for interpretable feature representation which is high-profile for detail inference.
Experiments demonstrate that the proposed methods achieve superior performances over the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-28T08:31:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.