Reinforcement Fine-Tuning Naturally Mitigates Forgetting in Continual Post-Training
- URL: http://arxiv.org/abs/2507.05386v1
- Date: Mon, 07 Jul 2025 18:17:06 GMT
- Title: Reinforcement Fine-Tuning Naturally Mitigates Forgetting in Continual Post-Training
- Authors: Song Lai, Haohan Zhao, Rong Feng, Changyi Ma, Wenzhuo Liu, Hongbo Zhao, Xi Lin, Dong Yi, Min Xie, Qingfu Zhang, Hongbin Liu, Gaofeng Meng, Fei Zhu
- Abstract summary: This paper presents a comparative analysis of two core post-training paradigms: supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT). Our experiments are conducted on a benchmark comprising seven diverse multimodal tasks.
- Score: 23.99424961055015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual post-training (CPT) is a popular and effective technique for adapting foundation models like multimodal large language models to specific and ever-evolving downstream tasks. While existing research has primarily concentrated on methods like data replay, model expansion, or parameter regularization, the fundamental role of the learning paradigm within CPT remains largely unexplored. This paper presents a comparative analysis of two core post-training paradigms: supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT), investigating their respective impacts on knowledge retention during CPT. Our experiments are conducted on a benchmark comprising seven diverse multimodal tasks, utilizing Qwen2.5-VL-7B-Instruct as the base model for continual post-training. The investigation yields two significant findings: (1) When continually learning on downstream tasks, SFT leads to catastrophic forgetting of previously learned tasks; in contrast, RFT inherently preserves prior knowledge and achieves performance comparable to multi-task training. (2) RFT protects and even enhances the model's general knowledge on standard benchmarks (e.g., MMMU and MMLU-Pro), whereas SFT severely degrades general model capabilities. Further analysis shows that explicit mechanisms, such as the KL penalty and chain-of-thought reasoning, are not the primary factors; instead, the implicit regularization inherent to RFT is a key factor in mitigating forgetting. Finally, we propose a rollout-based instance filtering algorithm to improve the stability and efficiency of RFT. Our comprehensive study demonstrates the superiority of RFT as a robust paradigm for continual post-training.
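The abstract names a rollout-based instance filtering algorithm but does not describe it; below is a minimal sketch of one plausible form, assuming a GRPO-style setup in which each training instance is scored by sampling several rollouts from the current policy against a verifiable reward. The `sample_rollouts` and `reward` interfaces and the variance-based keep rule are illustrative assumptions, not the paper's specification.

```python
from statistics import mean
from typing import Callable, Dict, List

def filter_instances(
    instances: List[Dict],
    sample_rollouts: Callable[[Dict, int], List[str]],  # policy sampler (assumed interface)
    reward: Callable[[Dict, str], float],               # verifiable reward in [0, 1] (assumed)
    n_rollouts: int = 8,
    low: float = 0.1,
    high: float = 0.9,
) -> List[Dict]:
    """Keep instances whose rollouts are neither all-correct nor all-wrong.

    Instances the policy already solves (mean reward near 1) or cannot solve
    at all (mean reward near 0) yield near-zero advantages under a
    group-relative baseline, so dropping them should improve the stability
    and efficiency of RFT updates.
    """
    kept = []
    for inst in instances:
        scores = [reward(inst, r) for r in sample_rollouts(inst, n_rollouts)]
        if low < mean(scores) < high:
            kept.append(inst)
    return kept
```

Under this rule the retained batch concentrates on instances where rollouts disagree, which is exactly where a group-relative advantage estimate carries learning signal.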
Related papers
- Revisiting Continual Semantic Segmentation with Pre-trained Vision Models [53.56065605992639]
Continual Semantic Segmentation (CSS) seeks to incrementally learn to segment novel classes while preserving knowledge of previously encountered ones. Recent advancements in CSS have been driven by the adoption of Pre-trained Vision Models (PVMs) as backbones. Among existing strategies, Direct Fine-Tuning (DFT), which sequentially fine-tunes the model across classes, remains the most straightforward approach.
arXiv Detail & Related papers (2025-08-06T09:51:46Z)
- Blending Supervised and Reinforcement Fine-Tuning with Prefix Sampling [35.64557242726578]
Prefix-RFT is a hybrid approach that synergizes learning from both demonstration and exploration. It not only surpasses the performance of standalone SFT and RFT but also outperforms parallel mixed-policy RFT methods.
arXiv Detail & Related papers (2025-07-02T13:04:09Z)
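The Prefix-RFT entry above blends demonstration and exploration through prefix sampling; the sketch below illustrates the basic idea under the assumption that each rollout is seeded with a random truncation of an expert demonstration and completed by the policy. The `complete` interface is hypothetical, and the real method's mixing schedule is surely more involved.

```python
import random
from typing import Callable, List

def prefix_rollout(
    demonstration: List[str],                    # expert token sequence
    complete: Callable[[List[str]], List[str]],  # policy completion (assumed interface)
    min_frac: float = 0.2,
    max_frac: float = 0.8,
) -> List[str]:
    """Seed a rollout with a random prefix of the demonstration.

    The prefix anchors the trajectory in expert data (SFT-like supervision),
    while the policy-generated suffix is scored by the RFT reward, so a single
    trajectory blends imitation with exploration.
    """
    n = len(demonstration)
    cut = random.randint(int(min_frac * n), int(max_frac * n))
    prefix = demonstration[:cut]
    return prefix + complete(prefix)
```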
- Reinforcement Fine-Tuning Enables MLLMs Learning Novel Tasks Stably [80.36077974826865]
Post-training algorithms such as Supervised Fine-Tuning (SFT) and Reinforcement Fine-Tuning (RFT) are widely used to adapt multimodal large language models to downstream tasks. We study the behavior of SFT and RFT on an open-source multimodal model, Qwen2.5-VL. Our experiments reveal a sharp trade-off: SFT enables rapid task acquisition but leads to catastrophic forgetting, whereas RFT learns more slowly on novel tasks but maintains prior knowledge.
arXiv Detail & Related papers (2025-06-30T04:15:01Z)
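To make the trade-off in the entry above concrete, here is a toy contrast between the two update rules on a four-action softmax policy: the SFT step follows the cross-entropy gradient straight toward the demonstrated action, while the RFT step reweights the policy's own samples by reward (REINFORCE with a mean baseline) and so moves only where sampled behavior and reward disagree. This is an illustration of the two paradigms, not either paper's training code.

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.zeros(4)   # toy policy over 4 actions
target = 2             # demonstrated / rewarded action
lr = 0.5

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# SFT step: cross-entropy gradient toward the demonstration.
p = softmax(logits)
grad_sft = p - np.eye(4)[target]      # gradient of -log p[target] w.r.t. logits
logits_sft = logits - lr * grad_sft

# RFT step: REINFORCE on the policy's own samples with a mean-reward baseline.
samples = rng.choice(4, size=64, p=p)
rewards = (samples == target).astype(float)
baseline = rewards.mean()
grad_rft = np.zeros(4)
for a, r in zip(samples, rewards):
    grad_rft += (r - baseline) * (np.eye(4)[a] - p)   # gradient of log p[a]
grad_rft /= len(samples)
logits_rft = logits + lr * grad_rft

print("after SFT step:", softmax(logits_sft))  # mass jumps onto the target
print("after RFT step:", softmax(logits_rft))  # smaller, sample-driven shift
```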
- Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections [65.36449542323277]
We present a unified theoretical framework bridging Supervised Fine-Tuning (SFT) and preference learning in Large Language Model (LLM) post-training. We propose a simple yet effective learning rate reduction approach that yields significant performance improvements.
arXiv Detail & Related papers (2025-06-15T05:42:29Z)
- Dual Decomposition of Weights and Singular Value Low Rank Adaptation [9.048461365342204]
We propose DuDe, a novel approach that decomposes weight matrices into magnitude and direction components (see the sketch after this list). Our evaluation demonstrates DuDe's superior performance and robustness, achieving up to 48.35% accuracy on MMLU and 62.53% (±1.59) accuracy on GSM8K.
arXiv Detail & Related papers (2025-05-20T13:49:15Z)
- R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning [97.49610356913874]
We propose robust test-time prompt tuning (R-TPT) for vision-language models (VLMs). R-TPT mitigates the impact of adversarial attacks during the inference stage. We introduce a plug-and-play reliability-based weighted ensembling strategy to strengthen the defense.
arXiv Detail & Related papers (2025-04-15T13:49:31Z)
- Understanding Fine-tuning in Approximate Unlearning: A Theoretical Perspective [39.958103832214135]
Fine-tuning (FT) methods have become one of the fundamental approaches for approximating unlearning. We present the first theoretical analysis of FT methods for machine unlearning within a linear regression framework. We propose a novel Retention-Based Masking (RBM) strategy that constructs a weight saliency map based on the remaining dataset.
arXiv Detail & Related papers (2024-10-04T18:01:52Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly robust instance-reweighted adversarial training framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
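As referenced in the DuDe entry above, here is a minimal numpy sketch of a magnitude/direction weight decomposition, assuming a DoRA-style column-wise split W = m · V/‖V‖; DuDe's actual formulation (which also involves singular values) may differ in detail.

```python
import numpy as np

def decompose(W: np.ndarray):
    """Split W into per-column magnitudes m and unit-norm directions V_hat."""
    m = np.linalg.norm(W, axis=0, keepdims=True)  # shape (1, out): column magnitudes
    V_hat = W / m                                 # columns scaled to unit norm
    return m, V_hat

def recompose(m: np.ndarray, V_hat: np.ndarray) -> np.ndarray:
    return m * V_hat

W = np.random.default_rng(0).normal(size=(16, 8))
m, V_hat = decompose(W)
assert np.allclose(recompose(m, V_hat), W)  # exact reconstruction
# Fine-tuning can then treat m and V_hat separately, e.g. adapting directions
# with a low-rank update while learning magnitudes directly.
```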