What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement
- URL: http://arxiv.org/abs/2402.01865v2
- Date: Thu, 20 Jun 2024 06:25:39 GMT
- Title: What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement
- Authors: Xisen Jin, Xiang Ren
- Abstract summary: Language models deployed in the wild make errors.
Updating the model with the corrected error instances causes catastrophic forgetting.
We propose a partially interpretable forecasting model based on the observation that changes in pre-softmax logit scores of pretraining examples resemble those of online learned examples.
- Score: 38.93348195407474
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Language models deployed in the wild make errors. However, simply updating the model with the corrected error instances causes catastrophic forgetting -- the updated model makes errors on instances learned during the instruction tuning or upstream training phase. Randomly replaying upstream data yields unsatisfactory performance and often comes with high variance and poor controllability. To this end, we try to forecast upstream examples that will be forgotten due to a model update, for improved controllability of the replay process and interpretability. We train forecasting models given a collection of online learned examples and corresponding forgotten upstream pre-training examples. We propose a partially interpretable forecasting model based on the observation that changes in pre-softmax logit scores of pretraining examples resemble those of online learned examples, which performs decently on BART but fails on T5 models. We further show that a black-box classifier based on inner products of example representations achieves better forecasting performance over a series of setups. Finally, we show that we reduce forgetting of upstream pretraining examples by replaying examples that are forecasted to be forgotten, demonstrating the practical utility of forecasting example forgetting.
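The abstract describes two forecasters but gives no implementation details here; below is a minimal, purely illustrative sketch of the black-box variant (a classifier over inner products of example representations) followed by replay of the top-forecasted upstream examples. The representations, the pair labels, and every name in the snippet are placeholder assumptions, not the authors' released code.

```python
# Minimal sketch of a black-box forgetting forecaster, assuming we already have
# fixed-size representations for the online-learned (error-correcting) example
# and for each upstream pretraining example. Representation extraction, the
# training pairs, and all hyperparameters are illustrative placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 16                      # representation dimension (placeholder)
n_online, n_upstream = 8, 200

# Placeholder representations; in practice these would come from the LM encoder.
H_online = rng.normal(size=(n_online, d))
H_upstream = rng.normal(size=(n_upstream, d))

# Training data for the forecaster: (online example, upstream example) pairs,
# labeled 1 if the upstream example was forgotten after the online update.
pairs = [(i, j) for i in range(n_online) for j in range(n_upstream)]
X = np.array([[H_online[i] @ H_upstream[j]] for i, j in pairs])   # inner-product feature
y = rng.integers(0, 2, size=len(pairs))                           # placeholder labels

forecaster = LogisticRegression().fit(X, y)

# At deployment: given a new error-correction example, rank upstream examples by
# forecasted forgetting probability and replay the top-k alongside the update.
h_new = rng.normal(size=d)
scores = forecaster.predict_proba((H_upstream @ h_new)[:, None])[:, 1]
replay_ids = np.argsort(-scores)[:10]
print("upstream examples selected for replay:", replay_ids)
```

In practice the representations would come from the language model being refined, and the forecaster would be trained on logged pairs of online updates and the upstream examples they actually caused to be forgotten, rather than on random labels.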
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- Demystifying Language Model Forgetting with Low-rank Example Associations [38.93348195407474]
Large language models (LLMs) suffer from forgetting of upstream data when fine-tuned.
We empirically analyze forgetting that occurs in $N$ upstream examples of language modeling or instruction-tuning after fine-tuning.
arXiv Detail & Related papers (2024-06-20T06:46:23Z)
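The summary above gives no method details; as a purely illustrative reading of the title, the sketch below checks how well a task-by-example forgetting matrix is explained by low-rank approximations. The matrix, its dimensions, and the ranks tried are all placeholder assumptions, not the paper's data or procedure.

```python
# Minimal sketch: test whether a tasks-by-examples "forgetting" matrix is well
# explained by a low-rank approximation. The matrix here is random placeholder
# data with planted low-rank structure; in the paper's setting each entry would
# record how much upstream example j is forgotten after fine-tuning on task i.
import numpy as np

rng = np.random.default_rng(0)
M, N, true_rank = 12, 300, 3

F = (rng.normal(size=(M, true_rank)) @ rng.normal(size=(true_rank, N))
     + 0.1 * rng.normal(size=(M, N)))

U, S, Vt = np.linalg.svd(F, full_matrices=False)
for k in (1, 2, 3, 5):
    F_k = (U[:, :k] * S[:k]) @ Vt[:k]        # rank-k reconstruction
    err = np.linalg.norm(F - F_k) / np.linalg.norm(F)
    print(f"rank {k}: relative reconstruction error {err:.3f}")
```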
- Post-Hoc Reversal: Are We Selecting Models Prematurely? [13.910702424593797]
We show a phenomenon that we call post-hoc reversal, where performance trends are reversed after applying post-hoc transforms.
Preliminary analyses suggest that these transforms induce reversal by suppressing the influence of mislabeled examples.
We propose post-hoc selection, a simple technique whereby post-hoc metrics inform model development decisions.
arXiv Detail & Related papers (2024-04-11T14:58:19Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
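As a small, self-contained illustration of the quantity studied in the entry above, the sketch below estimates predictive churn empirically as the fraction of test inputs on which two comparably trained models disagree. The models and dataset are generic stand-ins, not the paper's experimental setup.

```python
# Minimal sketch of estimating predictive churn between two near-optimal models:
# the fraction of examples on which their predictions disagree.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train = X[:1500], X[1500:], y[:1500]

model_a = LogisticRegression(max_iter=1000).fit(X_train, y_train)
model_b = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Churn: how often the two models give conflicting predictions on the same input.
churn = np.mean(model_a.predict(X_test) != model_b.predict(X_test))
print(f"empirical churn between the two models: {churn:.3f}")
```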
- Measuring Forgetting of Memorized Training Examples [80.9188503645436]
We show that machine learning models exhibit two seemingly contradictory phenomena: training data memorization and various forms of forgetting.
In memorization, models overfit specific training examples and become susceptible to privacy attacks.
We identify nondeterminism as a potential explanation, showing that deterministically trained models do not forget examples over time.
arXiv Detail & Related papers (2022-06-30T20:48:26Z)
- Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models fine-tuned with few examples exhibit strong prediction bias across labels.
Although few-shot fine-tuning can mitigate the prediction bias, our analysis shows that models gain performance improvements by capturing non-task-related features.
These observations caution that pursuing model performance with fewer examples may incur pathological prediction behavior (a simple measurement of such label bias is sketched after this entry).
arXiv Detail & Related papers (2022-04-17T15:55:18Z)
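As referenced in the entry above, one simple way to surface prediction bias across labels is to compare each label's predicted frequency with its frequency in the gold data. The snippet below does this with random placeholder predictions standing in for a few-shot fine-tuned model; the label set and bias values are illustrative assumptions.

```python
# Minimal sketch of measuring prediction bias across labels: compare how often
# each label is predicted with how often it actually occurs in the gold data.
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)
labels = ["entailment", "neutral", "contradiction"]
gold = rng.choice(labels, size=1000)                      # balanced gold labels
pred = rng.choice(labels, size=1000, p=[0.7, 0.2, 0.1])   # biased "model" output

gold_freq, pred_freq = Counter(gold), Counter(pred)
for lab in labels:
    print(f"{lab:13s} gold {gold_freq[lab]/1000:.2f}  predicted {pred_freq[lab]/1000:.2f}")
```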
- Training Deep Models to be Explained with Fewer Examples [40.58343220792933]
We train prediction and explanation models simultaneously with a sparse regularizer for reducing the number of examples.
Experiments using several datasets demonstrate that the proposed method improves faithfulness while maintaining predictive performance.
arXiv Detail & Related papers (2021-12-07T05:39:21Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
This article presents a new kind of interpretable machine learning method.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed (see the sketch after this entry).
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
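The sketch below illustrates the perturb-and-observe idea from the entry above under simple assumptions: tabular data, a generic classifier, and a fixed 10-percentile step. None of these choices come from the paper; they only show the general procedure.

```python
# Minimal sketch of probing class neighborhoods via quantile shifts: take a real
# data point, nudge one feature up or down by a small quantile step, and check
# whether the predicted class changes. Data, model, and step size are placeholders.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

x = X[70].copy()                       # a real point of interest
base_pred = model.predict([x])[0]
step = 0.10                            # shift features by +/- 10 percentiles

for j in range(X.shape[1]):
    q = np.mean(X[:, j] <= x[j])       # empirical quantile of the current value
    for direction in (-1, +1):
        x_shift = x.copy()
        x_shift[j] = np.quantile(X[:, j], np.clip(q + direction * step, 0, 1))
        new_pred = model.predict([x_shift])[0]
        print(f"feature {j}, quantile shift {direction * step:+.0%}: "
              f"class {base_pred} -> {new_pred}")
```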