Talking to Yourself: Defying Forgetting in Large Language Models
- URL: http://arxiv.org/abs/2602.20162v1
- Date: Fri, 23 Jan 2026 14:25:49 GMT
- Title: Talking to Yourself: Defying Forgetting in Large Language Models
- Authors: Yutao Sun, Mingshuai Chen, Tiancheng Zhao, Phillip Miao, Zilun Zhang, Haozhan Shen, Ruizhe Zhu, Jianwei Yin,
- Abstract summary: Catastrophic forgetting is a major challenge when fine-tuning large language models on narrow, task-specific data.<n>We propose SA-SFT, a lightweight self-augmentation routine in which an LLM generates self-dialogues prior to fine-tuning.<n>Despite requiring no external data or additional tuning, SA-SFT consistently mitigates catastrophic forgetting while improving in-domain performance.
- Score: 35.20586233788621
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Catastrophic forgetting remains a major challenge when fine-tuning large language models (LLMs) on narrow, task-specific data, often degrading their general knowledge and reasoning abilities. We propose SA-SFT, a lightweight self-augmentation routine in which an LLM generates self-dialogues prior to fine-tuning, and the resulting self-authored data are mixed with task data without modifying optimization or training schedules. Despite requiring no external data or additional tuning, SA-SFT consistently mitigates catastrophic forgetting while improving in-domain performance. Across 50 evaluation scenarios, it maintains performance comparable to the original model and achieves the best results in 40 cases, outperforming common baselines such as layer freezing and external data mixing. Guided by these empirical findings, we further present a theoretical analysis suggesting that forgetting can partly stem from style-induced parameter drift, and that self-alignment through self-generated data provides an effective means to counteract this effect. Overall, our results indicate that self-augmentation offers a simple and effective mechanism for robust LLM adaptation without incurring catastrophic forgetting.
Related papers
- Improving the Robustness of Large Language Models for Code Tasks via Fine-tuning with Perturbed Data [10.698357983420928]
This work aims to improve the robustness of Large Language Models against potential adversarial inputs.<n>We systematically evaluated robustness by fine-tuning models using datasets perturbed at character-level, word-level, and sentence-level.<n>Fine-tuning models with perturbed datasets significantly improves model robustness (RD usually drops around 4% - 6%), especially for models with relatively weak robustness.
arXiv Detail & Related papers (2026-02-11T22:30:01Z) - Learning from the Undesirable: Robust Adaptation of Language Models without Forgetting [18.680059467974825]
Language models (LMs) are often adapted through supervised fine-tuning (SFT) to specialize their capabilities for downstream tasks.<n>In typical scenarios where the fine-tuning data is limited, SFT can lead LMs to overfit, causing them to rely on spurious patterns.<n>We propose Learning-from-the-Undesirable (LfU), a simple yet effective regularization scheme for SFT to mitigate issues when fine-tuning LMs with limited data.
arXiv Detail & Related papers (2025-11-17T06:57:44Z) - Learn More, Forget Less: A Gradient-Aware Data Selection Approach for LLM [51.21051698747157]
We propose a self-adaptive gradient-aware data selection approach (GrADS) for supervised fine-tuning of large language models (LLMs)<n>Specifically, we design self-guided criteria that leverage the magnitude and statistical distribution of gradients to prioritize examples that contribute the most to the model's learning process.<n>Through extensive experimentation with various LLMs across diverse domains such as medicine, law, and finance, GrADS has demonstrated significant efficiency and cost-effectiveness.
arXiv Detail & Related papers (2025-11-07T08:34:50Z) - Improved Supervised Fine-Tuning for Large Language Models to Mitigate Catastrophic Forgetting [1.5595148909011116]
Supervised Fine-Tuning (SFT) is a critical step for enhancing the instruction-following capabilities of Large Language Models (LLMs)<n>SFT often leads to a degradation of the model's general abilities, a phenomenon known as catastrophic forgetting.<n>We propose a novel and cost-effective SFT method that effectively mitigates catastrophic forgetting without requiring access to the original SFT data.
arXiv Detail & Related papers (2025-06-11T06:23:50Z) - SLearnLLM: A Self-Learning Framework for Efficient Domain-Specific Adaptation of Large Language Models [7.44035983292392]
We propose a self-learning framework for large language models (LLMs) inspired by human learning pattern.<n>This framework takes a fine-tuning (SFT) dataset in a specific domain as input.<n>We show that our method substantially reduces training time while achieving comparable improvements to those attained with full dataset fine-tuning.
arXiv Detail & Related papers (2025-05-23T04:50:54Z) - S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning [51.84977135926156]
We introduce S$2$R, an efficient framework that enhances LLM reasoning by teaching models to self-verify and self-correct during inference.<n>Our results demonstrate that Qwen2.5-math-7B achieves an accuracy improvement from 51.0% to 81.6%, outperforming models trained on an equivalent amount of long-CoT distilled data.
arXiv Detail & Related papers (2025-02-18T13:40:22Z) - Adaptive Rank Allocation for Federated Parameter-Efficient Fine-Tuning of Language Models [40.69348434971122]
We propose FedARA, a novel Adaptive Rank Allocation framework for federated parameter-efficient fine-tuning of language models.<n>FedARA consistently outperforms baselines by an average of 6.95% to 8.49% across various datasets and models under heterogeneous data.<n>Experiments on various edge devices demonstrate substantial decreases in total training time and energy consumption by up to 48.90% and 46.95%, respectively.
arXiv Detail & Related papers (2025-01-24T11:19:07Z) - Mitigating Forgetting in LLM Fine-Tuning via Low-Perplexity Token Learning [65.23593936798662]
We show that fine-tuning with LLM-generated data improves target task performance and reduces non-target task degradation.<n>This is the first work to provide an empirical explanation based on token perplexity reduction to mitigate catastrophic forgetting in LLMs after fine-tuning.
arXiv Detail & Related papers (2025-01-24T08:18:56Z) - iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use [56.31110409360567]
Augmenting large language models with external tools is a promising approach to enhance their capabilities.<n>We show that training gains significantly decay as synthetic data increases.<n>We propose an iterative reinforced fine-tuning strategy designed to alleviate this limitation.
arXiv Detail & Related papers (2025-01-15T04:52:34Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models [89.88010750772413]
Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs)
Our work delves into these specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws.
Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training.
arXiv Detail & Related papers (2024-06-18T08:38:59Z) - Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z) - Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models [52.98743860365194]
We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN)
At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself.
This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents.
arXiv Detail & Related papers (2024-01-02T18:53:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.