Predictive Preference Learning from Human Interventions
- URL: http://arxiv.org/abs/2510.01545v2
- Date: Wed, 15 Oct 2025 23:33:59 GMT
- Title: Predictive Preference Learning from Human Interventions
- Authors: Haoyuan Cai, Zhenghao Peng, Bolei Zhou,
- Abstract summary: We introduce Predictive Preference Learning from Human Interventions (PPL), which leverages the implicit preference signals in human interventions to inform predictions of future rollouts. PPL bootstraps each human intervention into L future time steps, called the preference horizon, under the assumption that the agent repeats the same action and the human makes the same intervention throughout that horizon. By applying preference optimization on these future states, expert corrections are propagated into the safety-critical regions the agent is expected to explore.
- Score: 37.039055683595414
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning from human involvement aims to incorporate a human subject who monitors and corrects the agent's behavioral errors. Although most interactive imitation learning methods focus on correcting the agent's action at the current state, they do not adjust its actions in future states, which may be more hazardous. To address this, we introduce Predictive Preference Learning from Human Interventions (PPL), which leverages the implicit preference signals contained in human interventions to inform predictions of future rollouts. The key idea of PPL is to bootstrap each human intervention into L future time steps, called the preference horizon, under the assumption that the agent follows the same action and the human makes the same intervention throughout that horizon. By applying preference optimization on these future states, expert corrections are propagated into the safety-critical regions where the agent is expected to explore, significantly improving learning efficiency and reducing the number of human demonstrations needed. We evaluate our approach with experiments on both autonomous driving and robotic manipulation benchmarks and demonstrate its efficiency and generality. Our theoretical analysis further shows that selecting an appropriate preference horizon L balances coverage of risky states with label correctness, thereby bounding the algorithmic optimality gap. Demo and code are available at: https://metadriverse.github.io/ppl
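The key mechanism is compact enough to sketch directly. The snippet below is a minimal illustration, assuming a DPO-style pairwise objective with temperature `beta`, a hypothetical one-step state predictor `dynamics`, and a `policy` object exposing `log_prob`; these interfaces are our assumptions, not the paper's released implementation.

```python
import torch
import torch.nn.functional as F

def ppl_loss(policy, dynamics, s_t, a_agent, a_human, L, beta=0.1):
    """One PPL-style update from a single human intervention at state s_t.

    The intervention (a_human preferred over a_agent) is bootstrapped onto
    L predicted future states -- the preference horizon -- under the
    assumption that the agent repeats a_agent and the human repeats
    a_human at each of those states.
    """
    loss, s = 0.0, s_t
    for _ in range(L):
        # Bradley-Terry / DPO-style pairwise term: raise the log-probability
        # of the human's correction relative to the agent's own action.
        logp_h = policy.log_prob(s, a_human)
        logp_a = policy.log_prob(s, a_agent)
        loss = loss - F.logsigmoid(beta * (logp_h - logp_a))
        # Predict the next state assuming the agent keeps taking a_agent.
        s = dynamics(s, a_agent)
    return loss / L
```

The trade-off from the theoretical analysis is visible here: a larger preference horizon L propagates the correction deeper into risky future states, but the repeated-action assumption becomes less accurate, so the bootstrapped preference labels get noisier.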
Related papers
- Modeling Others' Minds as Code [11.32494166591141]
We introduce ROTE, a novel algorithm for synthesizing behavioral programs in code. ROTE predicts human and AI behaviors from sparse observations, outperforming competitive baselines. By treating action understanding as a program synthesis problem, ROTE opens a path for AI systems to efficiently and effectively predict human behavior in the real world.
arXiv Detail & Related papers (2025-09-29T22:56:34Z)
- Maximizing Alignment with Minimal Feedback: Efficiently Learning Rewards for Visuomotor Robot Policy Alignment [73.14105098897696]
We propose Representation-Aligned Preference-based Learning (RAPL) to learn visual rewards from significantly less human preference feedback. RAPL fine-tunes pre-trained vision encoders to align with the end-user's visual representation and then constructs a dense visual reward via feature matching. We show that RAPL learns rewards aligned with human preferences, uses preference data more efficiently, and generalizes across robot embodiments.
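The second stage lends itself to a short sketch: with an encoder already aligned to the user's representation, a dense reward falls out of comparing features. Everything below (the `encoder` interface, the goal-image formulation, the Euclidean metric) is an illustrative assumption rather than RAPL's actual construction.

```python
import torch

def visual_reward(encoder, obs, goal_obs):
    """Dense visual reward via feature matching: once the encoder is
    aligned with the end-user's visual representation, reward is the
    negative distance between current and goal embeddings. The Euclidean
    metric is an illustrative choice, not necessarily the paper's.
    """
    with torch.no_grad():
        z = encoder(obs)            # features of the current observation
        z_goal = encoder(goal_obs)  # features of the preferred outcome
    return -torch.norm(z - z_goal, dim=-1)  # closer features => higher reward
```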
arXiv Detail & Related papers (2024-12-06T08:04:02Z)
- Understanding the Learning Dynamics of Alignment with Human Feedback [17.420727709895736]
This paper offers a theoretical analysis of the learning dynamics of human preference alignment.
We show how the distribution of preference datasets influences the rate of model updates and provide rigorous guarantees on the training accuracy.
arXiv Detail & Related papers (2024-03-27T16:39:28Z)
- REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and human preferences can lead to catastrophic outcomes in the real world. Recent methods aim to mitigate this misalignment by learning reward functions from human preferences. We propose a novel concept of reward regularization within the robotic RLHF framework.
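The idea translates to one extra term in the usual preference-learning objective. A hedged sketch follows, where the specific L2 penalty on predicted returns is our illustrative choice, not necessarily the paper's regularizer.

```python
import torch.nn.functional as F

def regularized_preference_loss(reward_net, traj_pref, traj_rej, lam=0.01):
    """Bradley-Terry preference loss plus a reward-regularization term.
    The regularizer here (L2 on predicted returns) is an assumption made
    for illustration; REBEL's exact formulation may differ.
    """
    r_pref = reward_net(traj_pref).sum()  # return of the preferred trajectory
    r_rej = reward_net(traj_rej).sum()    # return of the rejected trajectory
    bt = -F.logsigmoid(r_pref - r_rej)    # standard preference loss
    reg = r_pref.pow(2) + r_rej.pow(2)    # keep reward magnitudes bounded
    return bt + lam * reg
```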
arXiv Detail & Related papers (2023-12-22T04:56:37Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
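A minimal sketch of that combination is below; the confidence threshold `tau` and the least-confidence acquisition rule are illustrative stand-ins for the paper's actual selection and querying criteria.

```python
import numpy as np

def active_selective_prediction(probs, budget, tau=0.8):
    """Abstain on low-confidence target samples and spend the labeling
    budget on the most uncertain ones. `probs` is an (N, C) array of
    softmax outputs; tau and the acquisition rule are assumptions.
    """
    conf = probs.max(axis=1)
    accept = conf >= tau                  # predict only when confident enough
    to_label = np.argsort(conf)[:budget]  # query the least confident samples
    return accept, to_label
```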
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Active Uncertainty Learning for Human-Robot Interaction: An Implicit Dual Control Approach [5.05828899601167]
We present an algorithmic approach to enable uncertainty learning for human-in-the-loop motion planning based on the implicit dual control paradigm.
Our approach relies on a sampling-based approximation of the dynamic programming model predictive control problem.
The resulting policy is shown to preserve the dual control effect for generic human predictive models with both continuous and categorical uncertainty.
arXiv Detail & Related papers (2022-02-15T20:40:06Z)
- Probabilistic Human Motion Prediction via A Bayesian Neural Network [71.16277790708529]
In this paper, we propose a probabilistic model for human motion prediction.
Given an observed motion sequence, our model can generate several plausible future motions.
We extensively validate our approach on the large-scale benchmark dataset Human3.6M.
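Producing several futures from one observed sequence reduces to sampling the network. The sketch below uses Monte Carlo dropout as a common stand-in for Bayesian posterior sampling; the paper's actual inference scheme may differ.

```python
import torch

def sample_future_motions(model, observed_seq, n_samples=10):
    """Draw diverse future motions for one observed sequence by keeping
    the network stochastic at test time (Monte Carlo dropout), an
    illustrative approximation of Bayesian posterior sampling.
    """
    model.train()  # keep dropout layers active so each forward pass differs
    with torch.no_grad():
        return [model(observed_seq) for _ in range(n_samples)]
```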
arXiv Detail & Related papers (2021-07-14T09:05:33Z)
- On complementing end-to-end human motion predictors with planning [31.025766804649464]
High-capacity end-to-end approaches for human motion prediction can represent subtle nuances in human behavior, but struggle with robustness to out-of-distribution inputs and tail events.
Planning-based prediction, on the other hand, can reliably output decent-but-not-great predictions.
arXiv Detail & Related papers (2021-03-09T19:02:45Z)
- Weak Human Preference Supervision For Deep Reinforcement Learning [48.03929962249475]
Reward learning from human preferences can solve complex reinforcement learning (RL) tasks without access to a reward function.
We propose a weak human preference supervision framework, for which we develop a human preference scaling model.
Our human-demonstration estimator requires human feedback for less than 0.01% of the agent's interactions with the environment.
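A preference scaling model effectively softens the hard binary label in the standard preference loss. A minimal sketch, with the scaling model itself (how the soft label `y` is produced) left abstract:

```python
import torch
import torch.nn.functional as F

def scaled_preference_loss(reward_net, seg_a, seg_b, y):
    """Learn a reward from weak, scaled preferences: y is a tensor holding a
    soft strength in [0, 1] for 'segment A preferred over B' (0.5 means
    indifferent) rather than a hard 0/1 label.
    """
    r_a = reward_net(seg_a).sum()
    r_b = reward_net(seg_b).sum()
    p_a = torch.sigmoid(r_a - r_b)          # Bradley-Terry P(A preferred)
    return F.binary_cross_entropy(p_a, y)   # cross-entropy against soft label
```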
arXiv Detail & Related papers (2020-07-25T10:37:15Z)