What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement
- URL: http://arxiv.org/abs/2402.01865v2
- Date: Thu, 20 Jun 2024 06:25:39 GMT
- Title: What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement
- Authors: Xisen Jin, Xiang Ren
- Abstract summary: Language models deployed in the wild make errors.
Updating the model with the corrected error instances causes catastrophic forgetting.
We propose a partially interpretable forecasting model based on the observation that changes in pre-softmax logit scores of pretraining examples resemble those of online learned examples.
- Score: 38.93348195407474
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Language models deployed in the wild make errors. However, simply updating the model with the corrected error instances causes catastrophic forgetting -- the updated model makes errors on instances learned during the instruction tuning or upstream training phase. Randomly replaying upstream data yields unsatisfactory performance and often comes with high variance and poor controllability. To this end, we try to forecast upstream examples that will be forgotten due to a model update, for improved controllability of the replay process and interpretability. We train forecasting models given a collection of online learned examples and corresponding forgotten upstream pre-training examples. We propose a partially interpretable forecasting model based on the observation that changes in pre-softmax logit scores of pretraining examples resemble those of online learned examples, which performs decently on BART but fails on T5 models. We further show that a black-box classifier based on inner products of example representations achieves better forecasting performance over a series of setups. Finally, we show that we reduce forgetting of upstream pretraining examples by replaying examples that are forecasted to be forgotten, demonstrating the practical utility of forecasting example forgetting.
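The abstract describes two forecasters but gives no implementation details here; below is a minimal, purely illustrative sketch of the black-box variant (a classifier over inner products of example representations) followed by replay of the top-forecasted upstream examples. The representations, the pair labels, and every name in the snippet are placeholder assumptions, not the authors' released code.

```python
# Minimal sketch of a black-box forgetting forecaster, assuming we already have
# fixed-size representations for the online-learned (error-correcting) example
# and for each upstream pretraining example. Representation extraction, the
# training pairs, and all hyperparameters are illustrative placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 16                      # representation dimension (placeholder)
n_online, n_upstream = 8, 200

# Placeholder representations; in practice these would come from the LM encoder.
H_online = rng.normal(size=(n_online, d))
H_upstream = rng.normal(size=(n_upstream, d))

# Training data for the forecaster: (online example, upstream example) pairs,
# labeled 1 if the upstream example was forgotten after the online update.
pairs = [(i, j) for i in range(n_online) for j in range(n_upstream)]
X = np.array([[H_online[i] @ H_upstream[j]] for i, j in pairs])   # inner-product feature
y = rng.integers(0, 2, size=len(pairs))                           # placeholder labels

forecaster = LogisticRegression().fit(X, y)

# At deployment: given a new error-correction example, rank upstream examples by
# forecasted forgetting probability and replay the top-k alongside the update.
h_new = rng.normal(size=d)
scores = forecaster.predict_proba((H_upstream @ h_new)[:, None])[:, 1]
replay_ids = np.argsort(-scores)[:10]
print("upstream examples selected for replay:", replay_ids)
```

In practice the representations would come from the language model being refined, and the forecaster would be trained on logged pairs of online updates and the upstream examples they actually caused to be forgotten, rather than on random labels.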
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- Demystifying Language Model Forgetting with Low-rank Example Associations [38.93348195407474]
Large language models (LLMs) suffer from forgetting of upstream data when fine-tuned.
We empirically analyze forgetting that occurs in $N$ upstream examples of language modeling or instruction-tuning after fine-tuning.
arXiv Detail & Related papers (2024-06-20T06:46:23Z)
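The summary above gives no method details; as a purely illustrative reading of the title, the sketch below checks how well a task-by-example forgetting matrix is explained by low-rank approximations. The matrix, its dimensions, and the ranks tried are all placeholder assumptions, not the paper's data or procedure.

```python
# Minimal sketch: test whether a tasks-by-examples "forgetting" matrix is well
# explained by a low-rank approximation. The matrix here is random placeholder
# data with planted low-rank structure; in the paper's setting each entry would
# record how much upstream example j is forgotten after fine-tuning on task i.
import numpy as np

rng = np.random.default_rng(0)
M, N, true_rank = 12, 300, 3

F = (rng.normal(size=(M, true_rank)) @ rng.normal(size=(true_rank, N))
     + 0.1 * rng.normal(size=(M, N)))

U, S, Vt = np.linalg.svd(F, full_matrices=False)
for k in (1, 2, 3, 5):
    F_k = (U[:, :k] * S[:k]) @ Vt[:k]        # rank-k reconstruction
    err = np.linalg.norm(F - F_k) / np.linalg.norm(F)
    print(f"rank {k}: relative reconstruction error {err:.3f}")
```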
- Post-Hoc Reversal: Are We Selecting Models Prematurely? [13.910702424593797]
We show a phenomenon that we call post-hoc reversal, where performance trends are reversed after applying post-hoc transforms.
Preliminary analyses suggest that these transforms induce reversal by suppressing the influence of mislabeled examples.
We propose post-hoc selection, a simple technique whereby post-hoc metrics inform model development decisions.
arXiv Detail & Related papers (2024-04-11T14:58:19Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
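As a small, self-contained illustration of the quantity studied in the entry above, the sketch below estimates predictive churn empirically as the fraction of test inputs on which two comparably trained models disagree. The models and dataset are generic stand-ins, not the paper's experimental setup.

```python
# Minimal sketch of estimating predictive churn between two near-optimal models:
# the fraction of examples on which their predictions disagree.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train = X[:1500], X[1500:], y[:1500]

model_a = LogisticRegression(max_iter=1000).fit(X_train, y_train)
model_b = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Churn: how often the two models give conflicting predictions on the same input.
churn = np.mean(model_a.predict(X_test) != model_b.predict(X_test))
print(f"empirical churn between the two models: {churn:.3f}")
```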
- Measuring Forgetting of Memorized Training Examples [80.9188503645436]
We show that machine learning models exhibit two seemingly contradictory phenomena: training data memorization and various forms of forgetting.
In memorization, models overfit specific training examples and become susceptible to privacy attacks.
We identify nondeterminism as a potential explanation, showing that deterministically trained models do not forget examples over time.
arXiv Detail & Related papers (2022-06-30T20:48:26Z)
- Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models fine-tuned with few examples exhibit strong prediction bias across labels.
Although few-shot fine-tuning can mitigate the prediction bias, our analysis shows that models gain performance improvements by capturing non-task-related features.
These observations caution that pursuing model performance with fewer examples may incur pathological prediction behavior (a simple measurement of such label bias is sketched after this entry).
arXiv Detail & Related papers (2022-04-17T15:55:18Z)
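As referenced in the entry above, one simple way to surface prediction bias across labels is to compare each label's predicted frequency with its frequency in the gold data. The snippet below does this with random placeholder predictions standing in for a few-shot fine-tuned model; the label set and bias values are illustrative assumptions.

```python
# Minimal sketch of measuring prediction bias across labels: compare how often
# each label is predicted with how often it actually occurs in the gold data.
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)
labels = ["entailment", "neutral", "contradiction"]
gold = rng.choice(labels, size=1000)                      # balanced gold labels
pred = rng.choice(labels, size=1000, p=[0.7, 0.2, 0.1])   # biased "model" output

gold_freq, pred_freq = Counter(gold), Counter(pred)
for lab in labels:
    print(f"{lab:13s} gold {gold_freq[lab]/1000:.2f}  predicted {pred_freq[lab]/1000:.2f}")
```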
- Training Deep Models to be Explained with Fewer Examples [40.58343220792933]
We train prediction and explanation models simultaneously with a sparse regularizer for reducing the number of examples.
Experiments using several datasets demonstrate that the proposed method improves faithfulness while maintaining predictive performance.
arXiv Detail & Related papers (2021-12-07T05:39:21Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
This article presents a new kind of interpretable machine learning method.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed (see the sketch after this entry).
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
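The sketch below illustrates the perturb-and-observe idea from the entry above under simple assumptions: tabular data, a generic classifier, and a fixed 10-percentile step. None of these choices come from the paper; they only show the general procedure.

```python
# Minimal sketch of probing class neighborhoods via quantile shifts: take a real
# data point, nudge one feature up or down by a small quantile step, and check
# whether the predicted class changes. Data, model, and step size are placeholders.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

x = X[70].copy()                       # a real point of interest
base_pred = model.predict([x])[0]
step = 0.10                            # shift features by +/- 10 percentiles

for j in range(X.shape[1]):
    q = np.mean(X[:, j] <= x[j])       # empirical quantile of the current value
    for direction in (-1, +1):
        x_shift = x.copy()
        x_shift[j] = np.quantile(X[:, j], np.clip(q + direction * step, 0, 1))
        new_pred = model.predict([x_shift])[0]
        print(f"feature {j}, quantile shift {direction * step:+.0%}: "
              f"class {base_pred} -> {new_pred}")
```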