Related papers: Outlier Gradient Analysis: Efficiently Improving Deep Learning Model Performance via Hessian-Free Influence Functions

Outlier Gradient Analysis: Efficiently Improving Deep Learning Model Performance via Hessian-Free Influence Functions

URL: http://arxiv.org/abs/2405.03869v2
Date: Sun, 12 May 2024 20:20:57 GMT
Title: Outlier Gradient Analysis: Efficiently Improving Deep Learning Model Performance via Hessian-Free Influence Functions
Authors: Anshuman Chhabra, Bo Li, Jian Chen, Prasant Mohapatra, Hongfu Liu,
Abstract summary: Influence functions offer a robust tool in assessing the impact of each data sample on model predictions. This paper focuses on a data-centric scenario--trimming outliers--and both challenges within a unified framework.
Score: 36.05242956018461
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Influence functions offer a robust framework for assessing the impact of each training data sample on model predictions, serving as a prominent tool in data-centric learning. Despite their widespread use in various tasks, the strong convexity assumption on the model and the computational cost associated with calculating the inverse of the Hessian matrix pose constraints, particularly when analyzing large deep models. This paper focuses on a classical data-centric scenario--trimming detrimental samples--and addresses both challenges within a unified framework. Specifically, we establish an equivalence transformation between identifying detrimental training samples via influence functions and outlier gradient detection. This transformation not only presents a straightforward and Hessian-free formulation but also provides profound insights into the role of the gradient in sample impact. Moreover, it relaxes the convexity assumption of influence functions, extending their applicability to non-convex deep models. Through systematic empirical evaluations, we first validate the correctness of our proposed outlier gradient analysis on synthetic datasets and then demonstrate its effectiveness in detecting mislabeled samples in vision models, selecting data samples for improving performance of transformer models for natural language processing, and identifying influential samples for fine-tuned Large Language Models.

Related papers

Dissecting Representation Misalignment in Contrastive Learning via Influence Function [15.28417468377201]
We introduce the Extended Influence Function for Contrastive Loss (ECIF), an influence function crafted for contrastive loss. ECIF considers both positive and negative samples and provides a closed-form approximation of contrastive learning models. Building upon ECIF, we develop a series of algorithms for data evaluation, misalignment detection, and misprediction trace-back tasks.
arXiv Detail & Related papers (2024-11-18T15:45:41Z)
Complementary Learning for Real-World Model Failure Detection [15.779651238128562]
We introduce complementary learning, where we use learned characteristics from different training paradigms to detect model errors. We demonstrate our approach by learning semantic and predictive motion labels in point clouds in a supervised and self-supervised manner. We perform a large-scale qualitative analysis and present LidarCODA, the first dataset with labeled anomalies in lidar point clouds.
arXiv Detail & Related papers (2024-07-19T13:36:35Z)
Revisit, Extend, and Enhance Hessian-Free Influence Functions [26.105554752277648]
Influence functions serve as crucial tools for assessing sample influence in model interpretation, subset training set selection, and more. In this paper, we revisit a specific, albeit effective approximation method known as Trac. This method substitutes the inverse of the Hessian matrix with an identity matrix.
arXiv Detail & Related papers (2024-05-25T03:43:36Z)
The Importance of Model Inspection for Better Understanding Performance Characteristics of Graph Neural Networks [15.569758991934934]
We investigate the effect of modelling choices on the feature learning characteristics of graph neural networks applied to a brain shape classification task. We find substantial differences in the feature embeddings at different layers of the models.
arXiv Detail & Related papers (2024-05-02T13:26:18Z)
Distilled Datamodel with Reverse Gradient Matching [74.75248610868685]
We introduce an efficient framework for assessing data impact, comprising offline training and online evaluation stages. Our proposed method achieves comparable model behavior evaluation while significantly speeding up the process compared to the direct retraining method.
arXiv Detail & Related papers (2024-04-22T09:16:14Z)
The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes [30.30769701138665]
We introduce and explore the Mirrored Influence Hypothesis, highlighting a reciprocal nature of influence between training and test data. Specifically, it suggests that evaluating the influence of training data on test predictions can be reformulated as an equivalent, yet inverse problem. We introduce a new method for estimating the influence of training data, which requires calculating gradients for specific test samples, paired with a forward pass for each training point.
arXiv Detail & Related papers (2024-02-14T03:43:05Z)
Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps instead of instantaneous input-output relationships in previous contexts. We present Diffusion-TracIn that incorporates this temporal dynamics and observe that samples' loss gradient norms are highly dependent on timestep. We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL) We first prove that a gradient of synthetic samples with respect to a SSL objective in naive bilevel optimization is textitbiased due to randomness originating from data augmentations or masking. We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
Gradient Surgery for One-shot Unlearning on Generative Model [0.989293617504294]
We introduce a simple yet effective approach to remove a data influence on the deep generative model. Inspired by works in multi-task learning, we propose to manipulate gradients to regularize the interplay of influence among samples.
arXiv Detail & Related papers (2023-07-10T13:29:23Z)
Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data. We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy. We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples. Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
FairIF: Boosting Fairness in Deep Learning via Influence Functions with Validation Set Sensitive Attributes [51.02407217197623]
We propose a two-stage training algorithm named FAIRIF. It minimizes the loss over the reweighted data set where the sample weights are computed. We show that FAIRIF yields models with better fairness-utility trade-offs against various types of bias.
arXiv Detail & Related papers (2022-01-15T05:14:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.