Do Influence Functions Work on Large Language Models?
- URL: http://arxiv.org/abs/2409.19998v1
- Date: Mon, 30 Sep 2024 06:50:18 GMT
- Title: Do Influence Functions Work on Large Language Models?
- Authors: Zhe Li, Wei Zhao, Yige Li, Jun Sun,
- Abstract summary: Influence functions aim to quantify the impact of individual training data points on a model's predictions.
We evaluate influence functions across multiple tasks and find that they consistently perform poorly in most settings.
- Score: 10.463762448166714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Influence functions aim to quantify the impact of individual training data points on a model's predictions. While extensive research has been conducted on influence functions in traditional machine learning models, their application to large language models (LLMs) has been limited. In this work, we conduct a systematic study to address a key question: do influence functions work on LLMs? Specifically, we evaluate influence functions across multiple tasks and find that they consistently perform poorly in most settings. Our further investigation reveals that their poor performance can be attributed to: (1) inevitable approximation errors when estimating the iHVP component due to the scale of LLMs, (2) uncertain convergence during fine-tuning, and, more fundamentally, (3) the definition itself, as changes in model parameters do not necessarily correlate with changes in LLM behavior. Our study thus suggests the need for alternative approaches for identifying influential samples. To support future work, our code is made available at https://github.com/plumprc/Failures-of-Influence-Functions-in-LLMs.
Related papers
- Dissecting Misalignment of Multimodal Large Language Models via Influence Function [12.832792175138241]
We introduce the Extended Influence Function for Contrastive Loss (ECIF), an influence function crafted for contrastive loss.
ECIF considers both positive and negative samples and provides a closed-form approximation of contrastive learning models.
Building upon ECIF, we develop a series of algorithms for data evaluation in MLLM, misalignment detection, and misprediction trace-back tasks.
arXiv Detail & Related papers (2024-11-18T15:45:41Z) - From Pre-training Corpora to Large Language Models: What Factors Influence LLM Performance in Causal Discovery Tasks? [51.42906577386907]
This study explores the factors influencing the performance of Large Language Models (LLMs) in causal discovery tasks.
A higher frequency of causal mentions correlates with better model performance, suggesting that extensive exposure to causal information during training enhances the models' causal discovery capabilities.
arXiv Detail & Related papers (2024-07-29T01:45:05Z) - Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) can estimate causal effects under interventions on different parts of a system.
We conduct empirical analyses to evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention.
We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types, and enable a study of intervention-based reasoning.
arXiv Detail & Related papers (2024-04-08T14:15:56Z) - CogBench: a large language model walks into a psychology lab [12.981407327149679]
This paper introduces CogBench, a benchmark that includes ten behavioral metrics derived from seven cognitive psychology experiments.
We apply CogBench to 35 large language models (LLMs) and analyze this data using statistical multilevel modeling techniques.
We find that open-source models are less risk-prone than proprietary models and that fine-tuning on code does not necessarily enhance LLMs' behavior.
arXiv Detail & Related papers (2024-02-28T10:43:54Z) - Studying Large Language Model Generalization with Influence Functions [29.577692176892135]
Influence functions aim to answer a counterfactual: how would the model's parameters (and hence its outputs) change if a sequence were added to the training set?
We use the Eigenvalue-corrected Kronecker-Factored Approximate Curvature (EK-FAC) approximation to scale influence functions up to large language models (LLMs) with up to 52 billion parameters.
We investigate generalization patterns of LLMs, including the sparsity of the influence patterns, increasing abstraction with scale, math and programming abilities, cross-lingual generalization, and role-playing behavior.
arXiv Detail & Related papers (2023-08-07T04:47:42Z) - To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
Second, we examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z) - If Influence Functions are the Answer, Then What is the Question? [7.873458431535409]
Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters.
While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks.
arXiv Detail & Related papers (2022-09-12T16:17:43Z) - FastIF: Scalable Influence Functions for Efficient Model Interpretation
and Debugging [112.19994766375231]
Influence functions approximate the 'influences' of training data-points for test predictions.
We present FastIF, a set of simple modifications to influence functions that significantly improves their run-time.
Our experiments demonstrate the potential of influence functions in model interpretation and correcting model errors.
arXiv Detail & Related papers (2020-12-31T18:02:34Z) - Influence Functions in Deep Learning Are Fragile [52.31375893260445]
influence functions approximate the effect of samples in test-time predictions.
influence estimates are fairly accurate for shallow networks.
Hessian regularization is important to get highquality influence estimates.
arXiv Detail & Related papers (2020-06-25T18:25:59Z) - Explaining Black Box Predictions and Unveiling Data Artifacts through
Influence Functions [55.660255727031725]
Influence functions explain the decisions of a model by identifying influential training examples.
We conduct a comparison between influence functions and common word-saliency methods on representative tasks.
We develop a new measure based on influence functions that can reveal artifacts in training data.
arXiv Detail & Related papers (2020-05-14T00:45:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.