Related papers: Trustworthy Human-AI Collaboration: Reinforcement Learning with Human Feedback and Physics Knowledge for Safe Autonomous Driving

URL: http://arxiv.org/abs/2409.00858v2
Date: Thu, 5 Sep 2024 08:07:27 GMT
Title: Trustworthy Human-AI Collaboration: Reinforcement Learning with Human Feedback and Physics Knowledge for Safe Autonomous Driving
Authors: Zilin Huang, Zihao Sheng, Sikai Chen
Abstract summary: Reinforcement Learning with Human Feedback (RLHF) has attracted substantial attention due to its potential to enhance training safety and sampling efficiency. Inspired by the human learning process, we propose Physics-enhanced Reinforcement Learning with Human Feedback (PE-RLHF). PE-RLHF guarantees the learned policy will perform at least as well as the given physics-based policy, even when human feedback quality deteriorates.
Score: 1.5361702135159845
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the field of autonomous driving, developing safe and trustworthy autonomous driving policies remains a significant challenge. Recently, Reinforcement Learning with Human Feedback (RLHF) has attracted substantial attention due to its potential to enhance training safety and sampling efficiency. Nevertheless, existing RLHF-enabled methods often falter when faced with imperfect human demonstrations, potentially leading to training oscillations or even worse performance than rule-based approaches. Inspired by the human learning process, we propose Physics-enhanced Reinforcement Learning with Human Feedback (PE-RLHF). This novel framework synergistically integrates human feedback (e.g., human intervention and demonstration) and physics knowledge (e.g., traffic flow model) into the training loop of reinforcement learning. The key advantage of PE-RLHF is its guarantee that the learned policy will perform at least as well as the given physics-based policy, even when human feedback quality deteriorates, thus ensuring trustworthy safety improvements. PE-RLHF introduces a Physics-enhanced Human-AI (PE-HAI) collaborative paradigm for dynamic action selection between human and physics-based actions, employs a reward-free approach with a proxy value function to capture human preferences, and incorporates a minimal intervention mechanism to reduce the cognitive load on human mentors. Extensive experiments across diverse driving scenarios demonstrate that PE-RLHF significantly outperforms traditional methods, achieving state-of-the-art (SOTA) performance in safety, efficiency, and generalizability, even with varying quality of human feedback. The philosophy behind PE-RLHF not only advances autonomous driving technology but can also offer valuable insights for other safety-critical domains. Demo video and code are available at: https://zilin-huang.github.io/PE-RLHF-website/
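
Illustrative sketch (not taken from the paper): the abstract describes choosing between a human action and a physics-based action so that a poor human input cannot drag the agent below the physics-based baseline. The Python below shows one way such a rule could look. It assumes the physics-based policy is the Intelligent Driver Model (a standard car-following model; the abstract says only "traffic flow model"), and it assumes a hypothetical `value_fn(state, action)` that scores actions. The names `IDMParams`, `idm_acceleration`, `select_action`, and `tolerance`, and all parameter values, are illustrative assumptions rather than the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class IDMParams:
    # Illustrative values only; not taken from the paper.
    v0: float = 30.0     # desired speed (m/s)
    T: float = 1.5       # desired time headway (s)
    a_max: float = 1.0   # maximum acceleration (m/s^2)
    b: float = 2.0       # comfortable deceleration (m/s^2)
    s0: float = 2.0      # minimum standstill gap (m)
    delta: float = 4.0   # free-road acceleration exponent


def idm_acceleration(v, gap, dv, p=IDMParams()):
    """Intelligent Driver Model acceleration.

    v   : ego speed (m/s)
    gap : bumper-to-bumper distance to the leader (m), must be > 0
    dv  : approach rate, ego speed minus leader speed (m/s)
    """
    # Desired dynamic gap, clipped so the interaction term is never negative.
    s_star = p.s0 + max(0.0, v * p.T + v * dv / (2.0 * math.sqrt(p.a_max * p.b)))
    return p.a_max * (1.0 - (v / p.v0) ** p.delta - (s_star / gap) ** 2)


def select_action(a_human, a_physics, value_fn, state, tolerance=0.0):
    """Use the human action unless it scores worse than the physics action.

    value_fn is a hypothetical scorer (an assumption of this sketch);
    a_human is None when the human does not intervene.
    """
    if a_human is None:
        return a_physics
    if value_fn(state, a_human) >= value_fn(state, a_physics) - tolerance:
        return a_human
    return a_physics
```

With a rule of this shape, the executed action is never scored below the physics-based action by more than `tolerance`, which is the kind of fallback behavior the abstract attributes to PE-RLHF; how the paper actually estimates values and sets the threshold is not stated in this listing.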

Related papers

Learning from Active Human Involvement through Proxy Value Propagation [44.144964115275]
Learning from active human involvement enables the human subject to actively intervene and demonstrate to the AI agent during training. We propose a new reward-free active human involvement method called Proxy Value Propagation for policy optimization. Our method can learn to solve continuous and discrete control tasks with various human control devices, including the challenging task of driving in Grand Theft Auto V.
arXiv Detail & Related papers (2025-02-05T17:07:37Z)
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models [94.39278422567955]
Fine-tuning large language models (LLMs) on human preferences has proven successful in enhancing their capabilities. However, ensuring the safety of LLMs during the fine-tuning remains a critical concern. We propose a supervised learning framework called Bi-Factorial Preference Optimization (BFPO) to address this issue.
arXiv Detail & Related papers (2024-08-27T17:31:21Z)
MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention [81.56607128684723]
We introduce MEReQ (Maximum-Entropy Residual-Q Inverse Reinforcement Learning), designed for sample-efficient alignment from human intervention. MEReQ infers a residual reward function that captures the discrepancy between the human expert's and the prior policy's underlying reward functions. It then employs Residual Q-Learning (RQL) to align the policy with human preferences using this residual reward function.
arXiv Detail & Related papers (2024-06-24T01:51:09Z)
Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback [58.049113055986375]
We develop a single stage approach named Alignment with Integrated Human Feedback (AIHF) to train reward models and the policy. The proposed approach admits a suite of efficient algorithms, which can easily reduce to, and leverage, popular alignment algorithms. We demonstrate the efficiency of the proposed solutions with extensive experiments involving alignment problems in LLMs and robotic control problems in MuJoCo.
arXiv Detail & Related papers (2024-06-11T01:20:53Z)
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs [49.386699863989335]
Training large language models (LLMs) to serve as effective assistants for humans requires careful consideration. A promising approach is reinforcement learning from human feedback (RLHF), which leverages human feedback to update the model in accordance with human preferences. In this paper, we analyze RLHF through the lens of reinforcement learning principles to develop an understanding of its fundamentals.
arXiv Detail & Related papers (2024-04-12T15:54:15Z)
SELFI: Autonomous Self-Improvement with Reinforcement Learning for Social Navigation [54.97931304488993]
Self-improving robots that interact and improve with experience are key to the real-world deployment of robotic systems. We propose an online learning method, SELFI, that leverages online robot experience to rapidly fine-tune pre-trained control policies. We report improvements in terms of collision avoidance, as well as more socially compliant behavior, measured by a human user study.
arXiv Detail & Related papers (2024-03-01T21:27:03Z)
Stable and Safe Human-aligned Reinforcement Learning through Neural Ordinary Differential Equations [1.5413714916429737]
This paper provides safety and stability definitions for such human-aligned tasks. An algorithm that leverages neural ordinary differential equations (NODEs) to predict human and robot movements is proposed. Simulation results show that the algorithm helps the controlled robot to reach the desired goal state with fewer safety violations.
arXiv Detail & Related papers (2024-01-23T23:50:19Z)
HAIM-DRL: Enhanced Human-in-the-loop Reinforcement Learning for Safe and Efficient Autonomous Driving [2.807187711407621]
We propose an enhanced human-in-the-loop reinforcement learning method, termed the Human as AI mentor-based deep reinforcement learning (HAIM-DRL) framework. We first introduce an innovative learning paradigm that effectively injects human intelligence into AI, termed Human as AI mentor (HAIM). In this paradigm, the human expert serves as a mentor to the AI agent, while the agent could be guided to minimize traffic flow disturbance.
arXiv Detail & Related papers (2024-01-06T08:30:14Z)
REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and human preferences can lead to catastrophic outcomes in the real world. Recent methods aim to mitigate misalignment by learning reward functions from human preferences. We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z)
Safe RLHF: Safe Reinforcement Learning from Human Feedback [16.69413517494355]
We propose Safe Reinforcement Learning from Human Feedback (Safe RLHF), a novel algorithm for human value alignment. Safe RLHF explicitly decouples human preferences regarding helpfulness and harmlessness, effectively avoiding the crowdworkers' confusion about the tension. We demonstrate a superior ability to mitigate harmful responses while enhancing model performance.
arXiv Detail & Related papers (2023-10-19T14:22:03Z)
Primitive Skill-based Robot Learning from Human Evaluative Feedback [28.046559859978597]
Reinforcement learning algorithms face challenges when dealing with long-horizon robot manipulation tasks in real-world environments. We propose a novel framework, SEED, which leverages two approaches: reinforcement learning from human feedback (RLHF) and primitive skill-based reinforcement learning. Our results show that SEED significantly outperforms state-of-the-art RL algorithms in sample efficiency and safety.
arXiv Detail & Related papers (2023-07-28T20:48:30Z)
Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios [147.16925581385576]
We show how imitation learning combined with reinforcement learning can substantially improve the safety and reliability of driving policies. We train a policy on over 100k miles of urban driving data, and measure its effectiveness in test scenarios grouped by different levels of collision likelihood.
arXiv Detail & Related papers (2022-12-21T23:59:33Z)
Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization [38.21629972247463]
We develop a novel human-in-the-loop learning method called Human-AI Copilot Optimization (HACO). The proposed HACO effectively utilizes data from both trial-and-error exploration and the human's partial demonstrations to train a high-performing agent. Experiments show that HACO achieves substantially high sample efficiency in the safe driving benchmark.
arXiv Detail & Related papers (2022-02-17T06:29:46Z)
