Self-Improving Model Steering
- URL: http://arxiv.org/abs/2507.08967v1
- Date: Fri, 11 Jul 2025 18:52:32 GMT
- Title: Self-Improving Model Steering
- Authors: Rongyi Zhu, Yuhui Wang, Tanqiu Jiang, Jiacheng Liang, Ting Wang
- Abstract summary: We present SIMS, the first self-improving model-steering framework that operates without relying on external supervision. At its core, SIMS autonomously generates and refines contrastive samples through iterative self-improvement cycles. We show that SIMS substantially outperforms existing methods in steering effectiveness and adaptability.
- Score: 13.424901485601994
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model steering represents a powerful technique that dynamically aligns large language models (LLMs) with human preferences during inference. However, conventional model-steering methods rely heavily on externally annotated data, not only limiting their adaptability to varying contexts but also tethering their effectiveness to annotation quality. In this paper, we present SIMS, the first self-improving model-steering framework that operates without relying on external supervision. At its core, SIMS autonomously generates and refines contrastive samples through iterative self-improvement cycles, enabling adaptive, context-specific steering. Additionally, SIMS employs novel strategies, including prompt ranking and contrast sampling, to further enhance steering efficacy. Extensive evaluation across diverse LLMs and benchmarks demonstrates that SIMS substantially outperforms existing methods in steering effectiveness and adaptability, highlighting self-improving model steering as a promising direction for future research on inference-time LLM alignment.
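The abstract does not say how contrastive samples are turned into a steering signal. As a rough illustration only, the sketch below uses a difference-of-means direction, a common choice in the broader activation-steering literature; it is an assumption for illustration and not SIMS's actual procedure. The function names (`mean_vector`, `steering_vector`, `apply_steering`), the `strength` parameter, and the toy activation values are all hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equally sized activation vectors."""
    if not rows:
        raise ValueError("need at least one activation vector")
    dim = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(dim)]


def steering_vector(preferred: Sequence[Sequence[float]],
                    dispreferred: Sequence[Sequence[float]]) -> Vector:
    """Difference of means between the two sides of a contrastive set.

    Assumption: each row is a hidden-state vector recorded for one
    preferred or dispreferred sample. How SIMS builds and uses its
    contrastive samples is not specified in the abstract.
    """
    pos = mean_vector(preferred)
    neg = mean_vector(dispreferred)
    return [p - n for p, n in zip(pos, neg)]


def apply_steering(hidden: Sequence[float],
                   direction: Sequence[float],
                   strength: float = 1.0) -> Vector:
    """Shift a hidden state along the steering direction at inference time."""
    return [h + strength * d for h, d in zip(hidden, direction)]


# Toy activations (made-up numbers, two dimensions for readability).
preferred = [[1.0, 2.0], [3.0, 4.0]]
dispreferred = [[0.0, 1.0], [2.0, 1.0]]

direction = steering_vector(preferred, dispreferred)
steered = apply_steering([0.5, 0.5], direction, strength=2.0)
```

In a self-improving loop of the kind the abstract describes, the preferred and dispreferred sets would be regenerated by the model itself on each cycle and the direction recomputed; that outer loop is omitted here because the abstract gives no details about it.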
Related papers
- Calibration and Evaluation of Car-Following Models for Autonomous Shuttles Using a Novel Multi-Criteria Framework [4.8342038441006805]
Development of dedicated car-following models for autonomous shuttles is critical to understanding their traffic impacts. More advanced machine learning techniques have not yet been applied to AS trajectories. There is a lack of a unified framework for systematically evaluating and comparing the performance of car-following models.
arXiv Detail & Related papers (2026-02-12T03:19:44Z) - SPACeR: Self-Play Anchoring with Centralized Reference Models [50.55045557371374]
Sim agent policies are realistic, human-like, fast, and scalable in multi-agent settings. Recent progress in imitation learning with large diffusion-based or tokenized models has shown that behaviors can be captured directly from human driving data. We propose SPACeR, a framework that leverages a pretrained tokenized autoregressive motion model as a central reference policy.
arXiv Detail & Related papers (2025-10-20T19:53:02Z) - Do LLM Modules Generalize? A Study on Motion Generation for Autonomous Driving [15.903491909277745]
We present a comprehensive evaluation of five key LLM modules. We demonstrate that, when appropriately adapted, these modules can significantly improve performance for autonomous driving motion generation. In addition, we identify which techniques can be effectively transferred, analyze the potential reasons for the failure of others, and discuss the specific adaptations needed for autonomous driving scenarios.
arXiv Detail & Related papers (2025-09-02T19:02:49Z) - Revealing the Challenges of Sim-to-Real Transfer in Model-Based Reinforcement Learning via Latent Space Modeling [31.74241286023207]
Reinforcement learning (RL) is playing an increasingly important role in fields such as robotic control and autonomous driving. The gap between simulation and the real environment remains a major obstacle to the practical deployment of RL. We propose a latent space based approach to analyze the impact of simulation on real-world policy improvement.
arXiv Detail & Related papers (2025-06-15T06:02:42Z) - Beyond Templates: Dynamic Adaptation of Reasoning Demonstrations via Feasibility-Aware Exploration [15.711365331854614]
We introduce Dynamic Adaptation of Reasoning Trajectories (DART), a novel data adaptation framework. Instead of uniformly imitating expert steps, DART employs a selective imitation strategy guided by step-wise adaptability estimation. We validate DART across multiple reasoning benchmarks and model scales, demonstrating that it significantly improves generalization and data efficiency.
arXiv Detail & Related papers (2025-05-27T04:08:11Z) - Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric [99.56567010306807]
Large Language Models (LLMs) have become indispensable across academia, industry, and daily applications. One core challenge of evaluation in the large language model (LLM) era is the generalization issue. We propose Model Utilization Index (MUI), a mechanism interpretability enhanced metric that complements traditional performance scores.
arXiv Detail & Related papers (2025-04-10T04:09:47Z) - Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving [17.27549891731047]
We improve the reliability of agent behaviors by closed-loop fine-tuning of behavior models with reinforcement learning.
Our method demonstrates improved overall performance, as well as improved targeted metrics such as collision rate.
We present a novel policy evaluation benchmark to directly assess the ability of simulated agents to measure the quality of autonomous vehicle planners.
arXiv Detail & Related papers (2024-09-26T23:40:33Z) - MetaFollower: Adaptable Personalized Autonomous Car Following [63.90050686330677]
We propose an adaptable personalized car-following framework - MetaFollower.
We first utilize Model-Agnostic Meta-Learning (MAML) to extract common driving knowledge from various CF events.
We additionally combine Long Short-Term Memory (LSTM) and Intelligent Driver Model (IDM) to reflect temporal heterogeneity with high interpretability.
arXiv Detail & Related papers (2024-06-23T15:30:40Z) - Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement [102.22911097049953]
Large vision-language models (LVLMs) have achieved impressive results in visual question-answering and reasoning tasks. Existing methods often depend on external models or data, leading to uncontrollable and unstable alignment results. We propose SIMA, a self-improvement framework that enhances visual and language modality alignment without external dependencies.
arXiv Detail & Related papers (2024-05-24T23:09:27Z) - Probing Multimodal LLMs as World Models for Driving [72.18727651074563]
We look at the application of Multimodal Large Language Models (MLLMs) in autonomous driving.
Despite advances in models like GPT-4o, their performance in complex driving environments remains largely unexplored.
arXiv Detail & Related papers (2024-05-09T17:52:42Z) - Bridging the Sim-to-Real Gap with Bayesian Inference [53.61496586090384]
We present SIM-FSVGD for learning robot dynamics from data.
We use low-fidelity physical priors to regularize the training of neural network models.
We demonstrate the effectiveness of SIM-FSVGD in bridging the sim-to-real gap on a high-performance RC racecar system.
arXiv Detail & Related papers (2024-03-25T11:29:32Z) - QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15% points relative.
arXiv Detail & Related papers (2023-11-06T00:21:44Z) - When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z) - Uncertainty-Aware Model-Based Reinforcement Learning with Application to Autonomous Driving [2.3303341607459687]
We propose a novel uncertainty-aware model-based reinforcement learning framework, and then implement and validate it in autonomous driving.
The framework is developed based on the adaptive truncation approach, providing virtual interactions between the agent and environment model.
The developed algorithms are then implemented in end-to-end autonomous vehicle control tasks, validated and compared with state-of-the-art methods under various driving scenarios.
arXiv Detail & Related papers (2021-06-23T06:55:14Z) - Objective-aware Traffic Simulation via Inverse Reinforcement Learning [31.26257563160961]
We formulate traffic simulation as an inverse reinforcement learning problem.
We propose a parameter sharing adversarial inverse reinforcement learning model for dynamics-robust simulation learning.
Our proposed model is able to imitate a vehicle's trajectories in the real world while simultaneously recovering the reward function.
arXiv Detail & Related papers (2021-05-20T07:26:34Z)