Related papers: Training Data Attribution via Approximate Unrolled Differentiation

URL: http://arxiv.org/abs/2405.12186v2
Date: Tue, 21 May 2024 04:26:45 GMT
Title: Training Data Attribution via Approximate Unrolled Differentiation
Authors: Juhan Bae, Wu Lin, Jonathan Lorraine, Roger Grosse
Abstract summary: Methods based on implicit differentiation, such as influence functions, can be made computationally efficient, but fail to account for underspecification. We introduce Source, an approximate unrolling-based TDA method that is computed using an influence-function-like formula.
Score: 8.87519936904341
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Many training data attribution (TDA) methods aim to estimate how a model's behavior would change if one or more data points were removed from the training set. Methods based on implicit differentiation, such as influence functions, can be made computationally efficient, but fail to account for underspecification, the implicit bias of the optimization algorithm, or multi-stage training pipelines. By contrast, methods based on unrolling address these issues but face scalability challenges. In this work, we connect the implicit-differentiation-based and unrolling-based approaches and combine their benefits by introducing Source, an approximate unrolling-based TDA method that is computed using an influence-function-like formula. While being computationally efficient compared to unrolling-based approaches, Source is suitable in cases where implicit-differentiation-based approaches struggle, such as in non-converged models and multi-stage training pipelines. Empirically, Source outperforms existing TDA techniques in counterfactual prediction, especially in settings where implicit-differentiation-based approaches fall short.
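As a rough illustration of the implicit-differentiation baseline the abstract refers to (not the Source method itself), the sketch below compares an influence-function estimate of leave-one-out effects with actual retraining on a small ridge regression problem. The synthetic data, the regularization strength `lam`, and all function and variable names are assumptions made for this example only.

```python
import numpy as np

# Synthetic ridge regression problem (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, d, lam = 200, 5, 1e-2
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)
x_test = rng.normal(size=d)


def fit_ridge(X, y, lam):
    """Minimize 0.5 * ||Xw - y||^2 + 0.5 * lam * ||w||^2; return (w, Hessian)."""
    H = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(H, X.T @ y), H


w, H = fit_ridge(X, y, lam)

# Influence-function estimate of the change in the test prediction when
# training point i is removed: x_test^T H^{-1} g_i, where g_i = x_i * r_i is
# the gradient of point i's loss at the trained parameters.
residuals = X @ w - y
estimated = (X @ np.linalg.solve(H, x_test)) * residuals

# Ground truth: retrain without point i and measure the change directly.
actual = np.array([
    x_test @ (fit_ridge(np.delete(X, i, axis=0), np.delete(y, i), lam)[0] - w)
    for i in range(n)
])

print("correlation:", np.corrcoef(estimated, actual)[0, 1])
```

In this convex, fully converged setting the estimate tracks retraining closely. The abstract's point is that this agreement is not guaranteed for non-converged models or multi-stage training pipelines, which is the gap Source is designed to address.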

Related papers

Expressive Score-Based Priors for Distribution Matching with Geometry-Preserving Regularization [10.432302605566331]
Gradient-based DM training only requires the prior's score function, not its density. This approach eliminates biases from fixed priors, enabling more effective use of geometry-preserving regularization. Our method also demonstrates better stability and computational efficiency compared to other diffusion-based priors.
arXiv Detail & Related papers (2025-06-17T15:08:16Z)
Addressing Correlated Latent Exogenous Variables in Debiased Recommender Systems [3.082385853653964]
Recommendation systems (RS) aim to provide personalized content, but they face a challenge in unbiased learning due to selection bias. This paper proposes a likelihood-based learning algorithm for learning a prediction model.
arXiv Detail & Related papers (2025-06-09T07:50:21Z)
What Has Been Overlooked in Contrastive Source-Free Domain Adaptation: Leveraging Source-Informed Latent Augmentation within Neighborhood Context [28.634315143647385]
Source-free domain adaptation (SFDA) involves adapting a model originally trained using a labeled dataset to perform effectively on an unlabeled dataset. This adaptation is especially crucial when significant disparities in data distributions exist between the two domains. We introduce a straightforward yet highly effective latent augmentation method tailored for contrastive SFDA.
arXiv Detail & Related papers (2024-12-18T20:09:46Z)
Capturing the Temporal Dependence of Training Data Influence [100.91355498124527]
We formalize the concept of trajectory-specific leave-one-out influence, which quantifies the impact of removing a data point during training. We propose data value embedding, a novel technique enabling efficient approximation of trajectory-specific LOO. As data value embedding captures training data ordering, it offers valuable insights into model training dynamics.
arXiv Detail & Related papers (2024-12-12T18:28:55Z)
Scalable Influence and Fact Tracing for Large Language Model Pretraining [14.598556308631018]
Training data attribution (TDA) methods aim to attribute model outputs back to specific training examples. This paper refines existing gradient-based methods to work effectively at scale.
arXiv Detail & Related papers (2024-10-22T20:39:21Z)
Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. In this paper, we aim to help address such challenges by developing an influence functions framework.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
A Training-Free Conditional Diffusion Model for Learning Stochastic Dynamical Systems [10.820654486318336]
This study introduces a training-free conditional diffusion model for learning unknown stochastic differential equations (SDEs) using data. The proposed approach addresses key challenges in computational efficiency and accuracy for modeling SDEs. The learned models exhibit significant improvements in predicting both short-term and long-term behaviors of unknown systems.
arXiv Detail & Related papers (2024-10-04T03:07:36Z)
Source-Free Domain-Invariant Performance Prediction [68.39031800809553]
We propose a source-free approach centred on uncertainty-based estimation, using a generative model for calibration in the absence of source data. Our experiments on benchmark object recognition datasets reveal that existing source-based methods fall short with limited source sample availability. Our approach significantly outperforms the current state-of-the-art source-free and source-based methods, affirming its effectiveness in domain-invariant performance estimation.
arXiv Detail & Related papers (2024-08-05T03:18:58Z)
Efficient Ensembles Improve Training Data Attribution [12.180392191924758]
Training data attribution methods aim to quantify the influence of individual data points on model predictions, with broad applications in data-centric AI. Existing methods in this field, which can be categorized as retraining-based and gradient-based methods, have struggled with the trade-off between computational efficiency and attribution efficacy. Recent research has shown that augmenting gradient-based methods with ensembles of multiple independently trained models can achieve significantly better attribution.
arXiv Detail & Related papers (2024-05-27T15:58:34Z)
Nonparametric Automatic Differentiation Variational Inference with Spline Approximation [7.5620760132717795]
We develop a nonparametric approximation approach that enables flexible posterior approximation for distributions with complicated structures. Compared with widely used nonparametric inference methods, the proposed method is easy to implement and adaptive to various data structures. Experiments demonstrate the efficiency of the proposed method in approximating complex posterior distributions and improving the performance of generative models with incomplete data.
arXiv Detail & Related papers (2024-03-10T20:22:06Z)
Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP). Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN. It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER). Our method exploits self-supervised pretraining to learn good feature representations from the target data. We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE). In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity. Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem. Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem. We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model. The objective is to endow the trained model with robustness against adversarially manipulated input data. Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.