Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws
- URL: http://arxiv.org/abs/2505.06699v3
- Date: Sat, 17 May 2025 02:26:21 GMT
- Title: Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws
- Authors: Xiyuan Wei, Ming Lin, Fanjiang Ye, Fengguang Song, Liangliang Cao, My T. Thai, Tianbao Yang
- Abstract summary: This paper formalizes an emerging learning paradigm that uses a trained model as a reference to guide and enhance the training of a target model through strategic data selection or weighting. We provide theoretical insights into why this approach improves generalization and data efficiency compared to training without a reference model. Building on these insights, we introduce a novel method for Contrastive Language-Image Pretraining with a reference model, termed DRRho-CLIP.
- Score: 52.10468229008941
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper formalizes an emerging learning paradigm that uses a trained model as a reference to guide and enhance the training of a target model through strategic data selection or weighting, named $\textbf{model steering}$. While ad-hoc methods of this kind have been used in various contexts, including the training of large foundation models, the paradigm's underlying principles remain insufficiently understood, leading to sub-optimal performance. In this work, we propose a theory-driven framework for model steering called $\textbf{DRRho risk minimization}$, which is rooted in Distributionally Robust Optimization (DRO). Through a generalization analysis, we provide theoretical insights into why this approach improves generalization and data efficiency compared to training without a reference model. To the best of our knowledge, this is the first time such theoretical insights have been provided for this new learning paradigm, and they significantly enhance our understanding and practice of model steering. Building on these insights and the connection between contrastive learning and DRO, we introduce a novel method for Contrastive Language-Image Pretraining (CLIP) with a reference model, termed DRRho-CLIP. Extensive experiments validate the theoretical insights, reveal a superior scaling law compared to CLIP without a reference model, and demonstrate its strength over existing heuristic approaches.
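The abstract does not spell out the exact DRRho-CLIP objective, but the mechanism it describes (weighting training examples with the help of a frozen reference model, with a DRO-style reweighting) can be sketched as follows. This is a minimal, illustrative sketch under stated assumptions, not the paper's implementation: the excess-loss weighting and the softmax (KL-regularized DRO) weights are one plausible instantiation, and names such as `reference_weighted_loss` and `per_pair_contrastive_loss` are hypothetical.

```python
# Hedged sketch of reference-model-based example weighting.
# Illustrative only; not the paper's exact DRRho-CLIP objective.
import torch
import torch.nn.functional as F


def reference_weighted_loss(target_losses: torch.Tensor,
                            reference_losses: torch.Tensor,
                            tau: float = 1.0) -> torch.Tensor:
    """Reweight per-example losses of the target model by how much harder
    each example is for it than for a frozen reference model.

    target_losses:    per-example losses of the model being trained, shape (B,)
    reference_losses: per-example losses of the frozen reference model, shape (B,)
    tau:              temperature of the softmax (KL-regularized DRO) weights
    """
    # Excess loss: examples the reference handles well but the target does not
    # receive larger weight (strategic data weighting guided by the reference).
    excess = target_losses - reference_losses.detach()
    # Softmax over the batch is the closed-form weighting of a KL-regularized
    # DRO reweighting; smaller tau concentrates weight on high-excess examples.
    weights = F.softmax(excess / tau, dim=0).detach()
    return (weights * target_losses).sum()


# Hypothetical usage inside a CLIP-style training step:
# target_losses = per_pair_contrastive_loss(target_model, batch)      # shape (B,)
# with torch.no_grad():
#     reference_losses = per_pair_contrastive_loss(ref_model, batch)  # shape (B,)
# loss = reference_weighted_loss(target_losses, reference_losses, tau=0.5)
# loss.backward()
```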
Related papers
- Approach to Finding a Robust Deep Learning Model [0.28675177318965045]
The rapid development of machine learning (ML) and artificial intelligence (AI) applications requires the training of large numbers of models. We propose a novel approach for determining model robustness using a model selection algorithm designed as a meta-algorithm. Within this framework, we address the influence of training sample size, model weight, and inductive bias on the robustness of deep learning models.
arXiv Detail & Related papers (2025-05-22T20:05:20Z) - SAMBO-RL: Shifts-aware Model-based Offline Reinforcement Learning [9.88109749688605]
Model-based offline reinforcement learning trains policies using pre-collected datasets and learned environment models. This paper offers a comprehensive analysis that disentangles the problem into two fundamental components: model bias and policy shift. We introduce Shifts-aware Model-based Offline Reinforcement Learning (SAMBO-RL), a practical framework that efficiently trains classifiers to approximate SAR for policy optimization.
arXiv Detail & Related papers (2024-08-23T04:25:09Z) - On the Generalization of Preference Learning with DPO [17.420727709895736]
Large language models (LLMs) have demonstrated remarkable capabilities but often struggle to align with human preferences. Preference learning trains models to distinguish between preferred and non-preferred responses based on human feedback. This paper introduces a new theoretical framework to analyze the generalization guarantees of models trained with direct preference optimization (DPO).
arXiv Detail & Related papers (2024-08-06T22:11:00Z) - The Edge-of-Reach Problem in Offline Model-Based Reinforcement Learning [31.8260779160424]
We investigate how popular algorithms perform as the learned dynamics model is improved. We propose Reach-Aware Value Learning (RAVL), a simple and robust method that directly addresses the edge-of-reach problem.
arXiv Detail & Related papers (2024-02-19T20:38:00Z) - Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint [56.74058752955209]
This paper studies the alignment process of generative models with Reinforcement Learning from Human Feedback (RLHF).
We first identify the primary challenge of existing popular methods such as offline PPO and offline DPO as a lack of strategic exploration of the environment.
We propose efficient algorithms with finite-sample theoretical guarantees.
arXiv Detail & Related papers (2023-12-18T18:58:42Z) - Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework that acquires exploratory trajectories enabling accurate learning of the hidden reward functions.
arXiv Detail & Related papers (2023-05-29T15:00:09Z) - Minimal Value-Equivalent Partial Models for Scalable and Robust Planning
in Lifelong Reinforcement Learning [56.50123642237106]
Common practice in model-based reinforcement learning is to learn models that capture every aspect of the agent's environment.
We argue that such models are not particularly well-suited for performing scalable and robust planning in lifelong reinforcement learning scenarios.
We propose new kinds of models that only model the relevant aspects of the environment, which we call "minimal value-equivalent partial models".
arXiv Detail & Related papers (2023-01-24T16:40:01Z) - When to Update Your Model: Constrained Model-based Reinforcement
Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z) - Discriminator Augmented Model-Based Reinforcement Learning [47.094522301093775]
It is common in practice for the learned model to be inaccurate, impairing planning and leading to poor performance.
This paper aims to improve planning with an importance sampling framework that accounts for the discrepancy between the true and learned dynamics.
arXiv Detail & Related papers (2021-03-24T06:01:55Z) - COMBO: Conservative Offline Model-Based Policy Optimization [120.55713363569845]
Uncertainty estimation with complex models, such as deep neural networks, can be difficult and unreliable.
We develop a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-actions.
We find that COMBO consistently performs as well as or better than prior offline model-free and model-based methods.
arXiv Detail & Related papers (2021-02-16T18:50:32Z) - Model-based Meta Reinforcement Learning using Graph Structured Surrogate
Models [40.08137765886609]
We show that our model, called a graph structured surrogate model (GSSM), outperforms state-of-the-art methods in predicting environment dynamics.
Our approach obtains high returns while allowing fast execution during deployment by avoiding test-time policy gradient optimization.
arXiv Detail & Related papers (2021-02-16T17:21:55Z) - Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)