DualXDA: Towards Sparse, Efficient and Explainable Data Attribution in Large AI Models
- URL: http://arxiv.org/abs/2402.12118v2
- Date: Thu, 24 Jul 2025 15:23:53 GMT
- Title: DualXDA: Towards Sparse, Efficient and Explainable Data Attribution in Large AI Models
- Authors: Galip Ümit Yolcu, Moritz Weckbecker, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin,
- Abstract summary: We introduce DualXDA, a framework for sparse, efficient and explainable Data Attribution (DualDA) and eXplainable Data Attribution (XDA)<n>With DualDA, we propose efficient and effective DA, leveraging Support Vector Machine theory to provide fast and naturally sparse data attributions for AI predictions.<n>We demonstrate that DualDA achieves high attribution quality, excels at solving a series of evaluated downstream tasks, and improves explanation time by a factor of up to 4,100,000$times$.
- Score: 14.148227936732399
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Deep learning models achieve remarkable performance, yet their decision-making processes often remain opaque. In response, the field of eXplainable Artificial Intelligence (XAI) has grown significantly over the last decade, primarily focusing on feature attribution methods. Complementing this perspective, Data Attribution (DA) has emerged as a promising paradigm that shifts the focus from features to data provenance. However, existing DA approaches suffer from prohibitively high computational costs and memory demands. Additionally, current attribution methods exhibit low sparsity, hindering the discovery of decisive patterns in the data. We introduce DualXDA, a framework for sparse, efficient and explainable DA, comprised of two interlinked approaches for Dual Data Attribution (DualDA) and eXplainable Data Attribution (XDA): With DualDA, we propose efficient and effective DA, leveraging Support Vector Machine theory to provide fast and naturally sparse data attributions for AI predictions. We demonstrate that DualDA achieves high attribution quality, excels at solving a series of evaluated downstream tasks, while at the same time improving explanation time by a factor of up to 4,100,000$\times$ compared to the original Influence Functions method, and up to 11,000$\times$ compared to the method's most efficient approximation from literature. We further introduce XDA, a method for enhancing Data Attribution with capabilities from feature attribution methods to explain why training samples are relevant for the prediction of a test sample in terms of impactful features. Taken together, our contributions in DualXDA ultimately point towards a future of eXplainable AI applied at unprecedented scale, enabling transparent, efficient and novel analysis of even the largest neural architectures fostering a new generation of accountable AI systems. Code at https://github.com/gumityolcu/DualXDA.
Related papers
- Efficient Federated Learning with Heterogeneous Data and Adaptive Dropout [62.73150122809138]
Federated Learning (FL) is a promising distributed machine learning approach that enables collaborative training of a global model using multiple edge devices.<n>We propose the FedDHAD FL framework, which comes with two novel methods: Dynamic Heterogeneous model aggregation (FedDH) and Adaptive Dropout (FedAD)<n>The combination of these two methods makes FedDHAD significantly outperform state-of-the-art solutions in terms of accuracy (up to 6.7% higher), efficiency (up to 2.02 times faster), and cost (up to 15.0% smaller)
arXiv Detail & Related papers (2025-07-14T16:19:00Z) - Enhancing Training Data Attribution with Representational Optimization [57.61977909113113]
Training data attribution methods aim to measure how training data impacts a model's predictions.<n>We propose AirRep, a representation-based approach that closes this gap by learning task-specific and model-aligned representations explicitly for TDA.<n>AirRep introduces two key innovations: a trainable encoder tuned for attribution quality, and an attention-based pooling mechanism that enables accurate estimation of group-wise influence.
arXiv Detail & Related papers (2025-05-24T05:17:53Z) - A Bayesian Approach to Data Point Selection [24.98069363998565]
Data point selection (DPS) is becoming a critical topic in deep learning.
Existing approaches to DPS are predominantly based on a bi-level optimisation (BLO) formulation.
We propose a novel Bayesian approach to DPS.
arXiv Detail & Related papers (2024-11-06T09:04:13Z) - DANCE: Dual-View Distribution Alignment for Dataset Condensation [39.08022095906364]
We propose a new DM-based method named Dual-view distribution AligNment for dataset CondEnsation (DANCE)
Specifically, from the inner-class view, we construct multiple "middle encoders" to perform pseudo long-term distribution alignment.
While from the inter-class view, we use the expert models to perform distribution calibration.
arXiv Detail & Related papers (2024-06-03T07:22:17Z) - DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z) - Multiply Robust Estimation for Local Distribution Shifts with Multiple Domains [9.429772474335122]
We focus on scenarios where data distributions vary across multiple segments of the entire population.
We propose a two-stage multiply robust estimation method to improve model performance on each individual segment.
Our method is designed to be implemented with commonly used off-the-shelf machine learning models.
arXiv Detail & Related papers (2024-02-21T22:01:10Z) - Mini-Hes: A Parallelizable Second-order Latent Factor Analysis Model [8.06111903129142]
This paper proposes a miniblock diagonal hessian-free (Mini-Hes) optimization for building an LFA model.
Experiment results indicate that, with Mini-Hes, the LFA model outperforms several state-of-the-art models in addressing missing data estimation task.
arXiv Detail & Related papers (2024-02-19T08:43:00Z) - Adversarial Augmentation Training Makes Action Recognition Models More
Robust to Realistic Video Distribution Shifts [13.752169303624147]
Action recognition models often lack robustness when faced with natural distribution shifts between training and test data.
We propose two novel evaluation methods to assess model resilience to such distribution disparity.
We experimentally demonstrate the superior performance of the proposed adversarial augmentation approach over baselines across three state-of-the-art action recognition models.
arXiv Detail & Related papers (2024-01-21T05:50:39Z) - DiffusionEngine: Diffusion Model is Scalable Data Engine for Object
Detection [41.436817746749384]
Diffusion Model is a scalable data engine for object detection.
DiffusionEngine (DE) provides high-quality detection-oriented training pairs in a single stage.
arXiv Detail & Related papers (2023-09-07T17:55:01Z) - Robust Learning with Progressive Data Expansion Against Spurious
Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z) - Evaluating Representations with Readout Model Switching [19.907607374144167]
In this paper, we propose to use the Minimum Description Length (MDL) principle to devise an evaluation metric.
We design a hybrid discrete and continuous-valued model space for the readout models and employ a switching strategy to combine their predictions.
The proposed metric can be efficiently computed with an online method and we present results for pre-trained vision encoders of various architectures.
arXiv Detail & Related papers (2023-02-19T14:08:01Z) - A classification performance evaluation measure considering data
separability [6.751026374812737]
We propose a new separability measure--the rate of separability (RS)--based on the data coding rate.
We demonstrate the positive correlation between the proposed measure and recognition accuracy in a multi-task scenario constructed from a real dataset.
arXiv Detail & Related papers (2022-11-10T09:18:26Z) - Value-Consistent Representation Learning for Data-Efficient
Reinforcement Learning [105.70602423944148]
We propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making.
Instead of aligning this imagined state with a real state returned by the environment, VCR applies a $Q$-value head on both states and obtains two distributions of action values.
It has been demonstrated that our methods achieve new state-of-the-art performance for search-free RL algorithms.
arXiv Detail & Related papers (2022-06-25T03:02:25Z) - EPiDA: An Easy Plug-in Data Augmentation Framework for High Performance
Text Classification [34.15923302216751]
We present an easy and plug-in data augmentation framework EPiDA to support effective text classification.
EPiDA employs two mechanisms: relative entropy (REM) and conditional minimization entropy (CEM) to control data generation.
EPiDA can support efficient and continuous data generation for effective classification training.
arXiv Detail & Related papers (2022-04-24T06:53:48Z) - On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z) - Interpretable pipelines with evolutionarily optimized modules for RL
tasks with visual inputs [5.254093731341154]
We propose end-to-end pipelines composed of multiple interpretable models co-optimized by means of evolutionary algorithms.
We test our approach in reinforcement learning environments from the Atari benchmark.
arXiv Detail & Related papers (2022-02-10T10:33:44Z) - Efficient training of lightweight neural networks using Online
Self-Acquired Knowledge Distillation [51.66271681532262]
Online Self-Acquired Knowledge Distillation (OSAKD) is proposed, aiming to improve the performance of any deep neural model in an online manner.
We utilize k-nn non-parametric density estimation technique for estimating the unknown probability distributions of the data samples in the output feature space.
arXiv Detail & Related papers (2021-08-26T14:01:04Z) - MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood
Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z) - Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z) - Self-supervised Co-training for Video Representation Learning [103.69904379356413]
We investigate the benefit of adding semantic-class positives to instance-based Info Noise Contrastive Estimation training.
We propose a novel self-supervised co-training scheme to improve the popular infoNCE loss.
We evaluate the quality of the learnt representation on two different downstream tasks: action recognition and video retrieval.
arXiv Detail & Related papers (2020-10-19T17:59:01Z) - DADA: Differentiable Automatic Data Augmentation [58.560309490774976]
We propose Differentiable Automatic Data Augmentation (DADA) which dramatically reduces the cost.
We conduct extensive experiments on CIFAR-10, CIFAR-100, SVHN, and ImageNet datasets.
Results show our DADA is at least one order of magnitude faster than the state-of-the-art while achieving very comparable accuracy.
arXiv Detail & Related papers (2020-03-08T13:23:14Z) - Generalized ODIN: Detecting Out-of-distribution Image without Learning
from Out-of-distribution Data [87.61504710345528]
We propose two strategies for freeing a neural network from tuning with OoD data, while improving its OoD detection performance.
We specifically propose to decompose confidence scoring as well as a modified input pre-processing method.
Our further analysis on a larger scale image dataset shows that the two types of distribution shifts, specifically semantic shift and non-semantic shift, present a significant difference.
arXiv Detail & Related papers (2020-02-26T04:18:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.