Toward Efficient Influence Function: Dropout as a Compression Tool
- URL: http://arxiv.org/abs/2509.15651v1
- Date: Fri, 19 Sep 2025 06:20:54 GMT
- Title: Toward Efficient Influence Function: Dropout as a Compression Tool
- Authors: Yuchen Zhang, Mohammad Mohammadi Amiri
- Abstract summary: We introduce a novel approach that leverages dropout as a gradient compression mechanism to compute the influence function more efficiently. Our method significantly reduces computational and memory overhead, not only during the influence function computation but also in the gradient compression process.
- Score: 9.756810956484772
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Assessing the impact of training data on machine learning models is crucial for understanding model behavior, enhancing transparency, and selecting training data. The influence function provides a theoretical framework for quantifying the effect of training data points on a model's performance for a specific test point. However, the computational and memory costs of the influence function present significant challenges, especially for large-scale models, even when using approximation methods, since the gradients involved in the computation are as large as the model itself. In this work, we introduce a novel approach that leverages dropout as a gradient compression mechanism to compute the influence function more efficiently. Our method significantly reduces computational and memory overhead, not only during the influence function computation but also in the gradient compression process. Through theoretical analysis and empirical validation, we demonstrate that our method preserves critical components of the data influence and enables its application to modern large-scale models.
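As a rough illustration of the idea (not the paper's actual algorithm), the sketch below applies one shared dropout mask to every gradient and computes identity-Hessian (gradient-dot-product) influence scores on the compressed coordinates. The function name `compressed_influence`, the `keep_prob` value, and the damping term are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def compressed_influence(train_grads, test_grad, keep_prob=0.1, damping=0.01, seed=0):
    """Gradient-dot-product influence scores on a dropout-compressed subspace.
    One shared random mask keeps each coordinate with probability keep_prob;
    rescaling the inner product by 1/keep_prob makes it an unbiased estimate
    of the full dot product (identity-Hessian approximation with damping)."""
    rng = np.random.default_rng(seed)
    mask = rng.random(test_grad.shape[0]) < keep_prob  # dropout mask shared by all gradients
    g_test = test_grad[mask] / keep_prob               # rescale once for unbiasedness
    return np.array([float(g[mask] @ g_test) / (1.0 + damping)
                     for g in train_grads])
```

Because every gradient is compressed by the same mask, only the kept coordinates ever need to be stored, which is where the memory savings come from in this simplified picture.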
Related papers
- Z0-Inf: Zeroth Order Approximation for Data Influence [47.682602051124235]
We introduce a highly efficient zeroth-order approximation for estimating the influence of training data. Our approach achieves superior accuracy in estimating self-influence and comparable or improved accuracy in estimating train-test influence for fine-tuned large language models.
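A generic two-point zeroth-order gradient estimator is the standard building block behind such methods; the sketch below is illustrative of that building block only, not the Z0-Inf algorithm itself.

```python
import numpy as np

def zo_grad(loss_fn, theta, n_dirs=512, eps=1e-3, seed=0):
    """Two-point zeroth-order gradient estimate: average over random
    Gaussian directions v of (L(theta + eps*v) - L(theta - eps*v)) / (2*eps) * v.
    Only loss evaluations are needed, never backpropagation."""
    rng = np.random.default_rng(seed)
    est = np.zeros_like(theta)
    for _ in range(n_dirs):
        v = rng.standard_normal(theta.shape)
        est += (loss_fn(theta + eps * v) - loss_fn(theta - eps * v)) / (2 * eps) * v
    return est / n_dirs
```

In an influence setting, estimators like this let one approximate train-test gradient alignment using forward passes alone.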
arXiv Detail & Related papers (2025-10-13T18:30:37Z)
- Revisiting Data Attribution for Influence Functions [13.88866465448849]
This paper comprehensively reviews the data attribution capability of influence functions in deep learning. We discuss their theoretical foundations, recent algorithmic advances for efficient inverse-Hessian-vector product estimation, and evaluate their effectiveness for data attribution and mislabel detection.
arXiv Detail & Related papers (2025-08-10T11:15:07Z)
- Efficient Machine Unlearning via Influence Approximation [75.31015485113993]
Influence-based unlearning has emerged as a prominent approach to estimate the impact of individual training samples on model parameters without retraining. This paper establishes a theoretical link between memorizing (incremental learning) and forgetting (unlearning). We introduce the Influence Approximation Unlearning algorithm for efficient machine unlearning from the incremental perspective.
arXiv Detail & Related papers (2025-07-31T05:34:27Z)
- Small-to-Large Generalization: Data Influences Models Consistently Across Scale [76.87199303408161]
We find that small- and large-scale language model predictions (generally) correlate highly across choices of training data. We also characterize how proxy scale affects effectiveness in two downstream proxy-model applications: data attribution and dataset selection.
arXiv Detail & Related papers (2025-05-22T05:50:19Z)
- Capturing the Temporal Dependence of Training Data Influence [100.91355498124527]
We formalize the concept of trajectory-specific leave-one-out (LOO) influence, which quantifies the impact of removing a data point during training. We propose data value embedding, a novel technique enabling efficient approximation of trajectory-specific LOO. As data value embedding captures training data ordering, it offers valuable insights into model training dynamics.
arXiv Detail & Related papers (2024-12-12T18:28:55Z)
- The Approximate Fisher Influence Function: Faster Estimation of Data Influence in Statistical Models [5.893124686141781]
Quantifying the influence of infinitesimal changes in training data on model performance is crucial for understanding and improving machine learning models. We show that our method offers significant computational advantages over current methods.
arXiv Detail & Related papers (2024-07-11T04:19:28Z)
- Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models [36.05242956018461]
In this paper, we establish a bridge between identifying detrimental training samples via influence functions and outlier gradient detection. We first validate the hypothesis of our proposed outlier gradient analysis approach on synthetic datasets. We then demonstrate its effectiveness in detecting mislabeled samples in vision models and selecting data samples for improving performance of natural language processing transformer models.
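One simple way to flag gradient outliers is a robust z-score on per-sample gradient norms; this generic sketch is an assumption for illustration, not the paper's exact procedure.

```python
import numpy as np

def outlier_scores(grad_norms):
    """Robust z-scores of per-sample gradient norms via the median absolute
    deviation (MAD); large scores flag candidate detrimental samples."""
    g = np.asarray(grad_norms, dtype=float)
    med = np.median(g)
    mad = np.median(np.abs(g - med)) + 1e-12  # avoid division by zero
    return 0.6745 * (g - med) / mad           # 0.6745 makes MAD consistent with std
```

The MAD-based score is preferred over a plain z-score here because the outliers being hunted would otherwise inflate the mean and standard deviation used to detect them.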
arXiv Detail & Related papers (2024-05-06T21:34:46Z)
- Uncovering the Hidden Cost of Model Compression [43.62624133952414]
Visual Prompting has emerged as a pivotal method for transfer learning in computer vision.
Model compression detrimentally impacts the performance of visual prompting-based transfer.
However, negative effects on calibration are not present when models are compressed via quantization.
arXiv Detail & Related papers (2023-08-29T01:47:49Z)
- Influence Functions in Deep Learning Are Fragile [52.31375893260445]
Influence functions approximate the effect of training samples on test-time predictions.
Influence estimates are fairly accurate for shallow networks.
Hessian regularization is important to get high-quality influence estimates.
arXiv Detail & Related papers (2020-06-25T18:25:59Z)
- How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as the $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.