Correlation-Aware Feature Attribution Based Explainable AI
- URL: http://arxiv.org/abs/2511.16482v1
- Date: Thu, 20 Nov 2025 15:51:00 GMT
- Title: Correlation-Aware Feature Attribution Based Explainable AI
- Authors: Poushali Sengupta, Yan Zhang, Frank Eliassen, Sabita Maharjan
- Abstract summary: ExCIR is a correlation-aware attribution score equipped with a lightweight transfer protocol. BlockCIR mitigates double-counting in collinear clusters. ExCIR provides computationally efficient, consistent, and scalable explainability for real-world deployment.
- Score: 4.457502798302293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Explainable AI (XAI) is increasingly essential as modern models become more complex and high-stakes applications demand transparency, trust, and regulatory compliance. Existing global attribution methods often incur high computational costs, lack stability under correlated inputs, and fail to scale efficiently to large or heterogeneous datasets. We address these gaps with ExCIR (Explainability through Correlation Impact Ratio), a correlation-aware attribution score equipped with a lightweight transfer protocol that reproduces full-model rankings using only a fraction of the data. ExCIR quantifies sign-aligned co-movement between features and model outputs after robust centering (subtracting a robust location estimate, e.g., median or mid-mean, from features and outputs). We further introduce BlockCIR, a groupwise extension of ExCIR that scores sets of correlated features as a single unit. By aggregating the same signed-co-movement numerators and magnitudes over predefined or data-driven groups, BlockCIR mitigates double-counting in collinear clusters (e.g., synonyms or duplicated sensors) and yields smoother, more stable rankings when strong dependencies are present. Across diverse text, tabular, signal, and image datasets, ExCIR shows trustworthy agreement with established global baselines and the full model, delivers consistent top-k rankings across settings, and reduces runtime via lightweight evaluation on a subset of rows. Overall, ExCIR provides computationally efficient, consistent, and scalable explainability for real-world deployment.
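The abstract names the ingredients of the score (robust centering, a signed co-movement numerator against a magnitude total, and groupwise pooling for BlockCIR) but does not reproduce the exact formula. As a hedged illustration only, the sketch below shows what a score built from those ingredients could look like; the function names `excir_score` and `block_cir`, the normalisation, and the small epsilon are assumptions, not the paper's definition.

```python
import numpy as np

def excir_score(x, y, center=np.median):
    # Robust-center the feature and the model output (median is one of
    # the robust location estimates the abstract mentions).
    xc = x - center(x)
    yc = y - center(y)
    prod = xc * yc
    # Signed co-movement numerator divided by the total magnitude,
    # giving a score in [-1, 1]; epsilon guards against zero variation.
    return prod.sum() / (np.abs(prod).sum() + 1e-12)

def block_cir(X, y, groups, center=np.median):
    # Groupwise variant in the spirit of BlockCIR: pool the same
    # numerators and magnitudes over a feature group before dividing,
    # so collinear features are scored as one unit.
    yc = y - center(y)
    scores = {}
    for name, cols in groups.items():
        num = den = 0.0
        for j in cols:
            prod = (X[:, j] - center(X[:, j])) * yc
            num += prod.sum()
            den += np.abs(prod).sum()
        scores[name] = num / (den + 1e-12)
    return scores
```

On perfectly co-moving data the score saturates at +1 (or -1 for anti-aligned features), which matches the "sign-aligned co-movement" reading of the abstract; how the paper actually weights numerator and magnitude may differ.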
Related papers
- Generative Data Transformation: From Mixed to Unified Data [57.84692191369066]
Taesar is a data-centric framework for targeted regeneration. It encodes cross-domain context into target sequences, enabling standard models to learn intricate dependencies without complex fusion architectures.
arXiv Detail & Related papers (2026-02-26T08:30:09Z) - Explainability of Complex AI Models with Correlation Impact Ratio [10.61008729196936]
Complex AI systems make better predictions but often lack transparency, limiting trustworthiness, interpretability, and safe deployment. We introduce ExCIR (Explainability through Correlation Impact Ratio), a theoretically grounded, simple, and reliable metric for explaining the contribution of input features to model outputs. We demonstrate that ExCIR captures dependencies arising from correlated features through a lightweight single-pass formulation.
arXiv Detail & Related papers (2026-01-10T21:56:24Z) - From Feature Interaction to Feature Generation: A Generative Paradigm of CTR Prediction Models [81.43473418572567]
Click-Through Rate (CTR) prediction is a core task in recommendation systems. We propose a novel generative framework to address embedding dimensional collapse and information redundancy. We show that SFG consistently mitigates embedding collapse and reduces information redundancy, while yielding substantial performance gains.
arXiv Detail & Related papers (2025-12-16T03:17:18Z) - Evaluating Knowledge Graph Complexity via Semantic, Spectral, and Structural Metrics for Link Prediction [0.0]
We introduce and benchmark a set of structural and semantic KG complexity metrics. We find that CSG is highly sensitive to parametrisation and does not robustly scale with the number of classes. Our results demonstrate that CSG's purported stability and generalization predictive power fail to hold in link prediction settings.
arXiv Detail & Related papers (2025-08-21T06:27:20Z) - HOPSE: Scalable Higher-Order Positional and Structural Encoder for Combinatorial Representations [7.494692635491467]
Topological Deep Learning (TDL) uses more general representations to accommodate higher-order interactions. Existing TDL methods often extend GNNs through Higher-Order Message Passing (HOMP). This work presents HOPSE, an alternative method for solving tasks involving higher-order relational interactions.
arXiv Detail & Related papers (2025-05-21T11:47:40Z) - TD3: Tucker Decomposition Based Dataset Distillation Method for Sequential Recommendation [50.23504065567638]
This paper introduces TD3, a novel Dataset Distillation method within a meta-learning framework. TD3 distills a fully expressive synthetic sequence summary from the original data. An augmentation technique allows the learner to closely fit the synthetic summary, ensuring an accurate update of it in the outer loop.
arXiv Detail & Related papers (2025-02-05T03:13:25Z) - Context-Aware Hierarchical Merging for Long Document Summarization [56.96619074316232]
We propose different approaches to enrich hierarchical merging with context from the source document. Experimental results on datasets representing legal and narrative domains show that contextual augmentation consistently outperforms zero-shot and hierarchical merging baselines.
arXiv Detail & Related papers (2025-02-03T01:14:31Z) - SIGMA: Selective Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction. We introduce a new framework named Selective Gated Mamba (SIGMA) for sequential recommendation. Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z) - SMaRt: Improving GANs with Score Matching Regularity [114.43433222721025]
Generative adversarial networks (GANs) usually struggle to learn from highly diverse data whose underlying manifold is complex. We find that score matching is a promising solution to this issue, thanks to its capability of persistently pushing generated data points towards the real data manifold. We show that our approach can consistently boost the performance of various state-of-the-art GANs on real-world datasets, with pre-trained diffusion models acting as the approximate score function.
arXiv Detail & Related papers (2023-11-30T03:05:14Z) - Flag Aggregator: Scalable Distributed Training under Failures and Augmented Losses using Convex Optimization [14.732408788010313]
ML applications increasingly rely on complex deep learning models and large datasets.
To scale computation and data, these models are inevitably trained in a distributed manner in clusters of nodes, and their updates are aggregated before being applied to the model.
With data augmentation added to these settings, there is a critical need for robust and efficient aggregation systems.
We show that our approach significantly enhances the robustness of state-of-the-art Byzantine resilient aggregators.
arXiv Detail & Related papers (2023-02-12T06:38:30Z) - Disentanglement and Generalization Under Correlation Shifts [22.499106910581958]
Correlations between factors of variation are prevalent in real-world data.
Machine learning algorithms may benefit from exploiting such correlations, as they can increase predictive performance on noisy data.
We aim to learn representations which capture different factors of variation in latent subspaces.
arXiv Detail & Related papers (2021-12-29T18:55:17Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.