Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving
- URL: http://arxiv.org/abs/2409.18343v1
- Date: Thu, 26 Sep 2024 23:40:33 GMT
- Title: Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving
- Authors: Zhenghao Peng, Wenjie Luo, Yiren Lu, Tianyi Shen, Cole Gulino, Ari Seff, Justin Fu
- Abstract summary: We improve the reliability of agent behaviors by closed-loop fine-tuning of behavior models with reinforcement learning.
Our method demonstrates improved overall performance, as well as improved targeted metrics such as collision rate.
We present a novel policy evaluation benchmark to directly assess the ability of simulated agents to measure the quality of autonomous vehicle planners.
- Score: 17.27549891731047
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A major challenge in autonomous vehicle research is modeling agent behaviors, which has critical applications including constructing realistic and reliable simulations for off-board evaluation and forecasting traffic agents' motion for onboard planning. While supervised learning has shown success in modeling agents across various domains, these models can suffer from distribution shift when deployed at test time. In this work, we improve the reliability of agent behaviors by closed-loop fine-tuning of behavior models with reinforcement learning. Our method demonstrates improved overall performance, as well as improved targeted metrics such as collision rate, on the Waymo Open Sim Agents challenge. Additionally, we present a novel policy evaluation benchmark to directly assess the ability of simulated agents to measure the quality of autonomous vehicle planners and demonstrate the effectiveness of our approach on this new benchmark.
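The abstract describes the method only at a high level. The sketch below is a minimal, hypothetical illustration of closed-loop RL fine-tuning of a pretrained behavior model with a REINFORCE-style update and a reward shaped to penalize collisions; `BehaviorPolicy`, the gym-style `env`, and all constants are assumptions for illustration, not the paper's implementation.

```python
# Hypothetical sketch only: closed-loop RL fine-tuning of a pretrained
# behavior policy with a REINFORCE-style update and a collision penalty.
# BehaviorPolicy, the gym-style env, and all constants are illustrative
# assumptions, not the paper's actual implementation.
import torch
import torch.nn as nn


class BehaviorPolicy(nn.Module):
    """Stand-in for a behavior model pretrained with supervised learning."""

    def __init__(self, obs_dim=16, act_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim)
        )
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def dist(self, obs):
        return torch.distributions.Normal(self.net(obs), self.log_std.exp())


def finetune(policy, env, episodes=1000, lr=1e-5, collision_penalty=10.0):
    """Closed-loop fine-tuning: the policy learns from its own rollouts."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(episodes):
        obs, log_probs, rewards, done = env.reset(), [], [], False
        while not done:
            d = policy.dist(torch.as_tensor(obs, dtype=torch.float32))
            act = d.sample()
            log_probs.append(d.log_prob(act).sum())
            obs, rew, done, info = env.step(act.numpy())
            # Shape the reward to target the collision-rate metric directly.
            rewards.append(rew - collision_penalty * info.get("collision", 0.0))
        # Reward-to-go returns for a REINFORCE policy-gradient update.
        ret = torch.as_tensor(rewards).flip(0).cumsum(0).flip(0)
        loss = -(torch.stack(log_probs) * ret).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The design point the abstract emphasizes is the closed loop: the policy is updated on trajectories generated by its own rollouts in simulation, rather than on logged expert data alone, which is what mitigates the distribution shift of purely supervised models.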
Related papers
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
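As a rough illustration of the Shapley-value half of the entry above (the Optimal Transport component is omitted), one can attribute a performance change under feature shift by treating each feature as a player and a coalition's value as the model's performance with that subset of features restored to reference values. The exact enumeration below is a hedged sketch; `features` and `value` are hypothetical stand-ins.

```python
# Hedged sketch of the Shapley-value component only (the Optimal Transport
# part is omitted). Each feature is a "player"; a coalition's value is the
# model's performance with that subset of features restored to reference
# values. `features` and `value` are hypothetical stand-ins.
from itertools import combinations
from math import factorial


def shapley(features, value):
    """Exact Shapley values of `features` for the set function `value`."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # all coalition sizes not containing f
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(set(S) | {f}) - value(set(S)))
        phi[f] = total
    return phi
```

Exact enumeration is exponential in the number of features, so a practical system would sample coalitions instead.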
- Robust Model-Based Reinforcement Learning with an Adversarial Auxiliary Model [2.9109581496560044]
An RL agent that trains in a certain Markov decision process (MDP) often struggles to perform well in nearly identical MDPs.
We employ the framework of Robust MDPs in a model-based setting and introduce a novel learned transition model.
Our experimental results indicate a notable improvement in policy robustness on high-dimensional MuJoCo control tasks.
arXiv Detail & Related papers (2024-06-14T12:37:08Z)
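The entry above gives no implementation detail; a common way to realize this kind of robustness, sketched here under assumptions, is an adversarial auxiliary model that perturbs the learned transition within a bounded set so that value estimates are pessimistic. `dynamics`, `value_fn`, and the gradient-descent adversary are illustrative, not the paper's exact formulation.

```python
# Illustrative sketch, not the paper's formulation: an adversarial
# auxiliary model perturbs the learned transition within an epsilon-ball
# so that value estimates are pessimistic. `dynamics` and `value_fn` are
# assumed differentiable stand-ins.
import torch


def worst_case_next_state(dynamics, s, a, value_fn, eps=0.1, steps=10, lr=0.05):
    """Bounded perturbation of the predicted next state minimizing value."""
    s_next = dynamics(s, a).detach()
    delta = torch.zeros_like(s_next, requires_grad=True)
    opt = torch.optim.SGD([delta], lr=lr)
    for _ in range(steps):
        loss = value_fn(s_next + delta).mean()  # adversary minimizes V
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the perturbation bounded
    return s_next + delta.detach()
```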
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves the absolute performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Reinforcement Learning with Human Feedback for Realistic Traffic Simulation [53.85002640149283]
A key element of effective simulation is the incorporation of realistic traffic models that align with human knowledge.
This study identifies two main challenges: capturing the nuances of human preferences on realism and the unification of diverse traffic simulation models.
arXiv Detail & Related papers (2023-09-01T19:29:53Z)
- Differential Assessment of Black-Box AI Agents [29.98710357871698]
We propose a novel approach to differentially assess black-box AI agents that have drifted from their previously known models.
We leverage sparse observations of the drifted agent's current behavior and knowledge of its initial model to generate an active querying policy.
Empirical evaluation shows that our approach is much more efficient than re-learning the agent model from scratch.
arXiv Detail & Related papers (2022-03-24T17:48:58Z)
- UMBRELLA: Uncertainty-Aware Model-Based Offline Reinforcement Learning Leveraging Planning [1.1339580074756188]
Offline reinforcement learning (RL) provides a framework for learning decision-making from offline data.
Self-driving vehicles (SDVs) can learn a policy that potentially even outperforms the behavior in the sub-optimal data set.
This motivates the use of model-based offline RL approaches, which leverage planning.
arXiv Detail & Related papers (2021-11-22T10:37:52Z)
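The UMBRELLA entry above combines offline model-based RL with planning under uncertainty. A hedged sketch of that general recipe: plan with random shooting over an ensemble of learned dynamics models and penalize candidate action sequences by ensemble disagreement, a standard pessimism device in offline model-based RL. All components here are toy stand-ins.

```python
# Hedged sketch of the general recipe: offline model-based RL with
# planning, made uncertainty-aware by penalizing ensemble disagreement.
# The random-shooting planner and all callables are toy stand-ins.
import numpy as np


def plan(models, reward_fn, s0, horizon=10, n_cand=256, act_dim=2,
         lam=1.0, seed=0):
    """Return the first action of the best pessimistic candidate sequence."""
    rng = np.random.default_rng(seed)
    cands = rng.uniform(-1.0, 1.0, size=(n_cand, horizon, act_dim))
    best_ret, best_a0 = -np.inf, None
    for seq in cands:
        s = np.stack([s0] * len(models))  # one rollout per ensemble member
        ret = 0.0
        for a in seq:
            s = np.stack([m(si, a) for m, si in zip(models, s)])
            disagreement = s.std(axis=0).mean()  # epistemic-uncertainty proxy
            ret += reward_fn(s.mean(axis=0), a) - lam * disagreement
        if ret > best_ret:
            best_ret, best_a0 = ret, seq[0]
    return best_a0
```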
- Uncertainty-Aware Model-Based Reinforcement Learning with Application to Autonomous Driving [2.3303341607459687]
We propose a novel uncertainty-aware model-based reinforcement learning framework, and then implement and validate it in autonomous driving.
The framework is developed based on the adaptive truncation approach, providing virtual interactions between the agent and environment model.
The developed algorithms are then implemented in end-to-end autonomous vehicle control tasks, validated and compared with state-of-the-art methods under various driving scenarios.
arXiv Detail & Related papers (2021-06-23T06:55:14Z)
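The adaptive-truncation idea in the entry above can be illustrated concretely: cut virtual rollouts short once an ensemble of learned models starts to disagree, so the agent only trains on model-generated transitions the model is likely to get right. The threshold rule and names below are assumptions, not the paper's algorithm.

```python
# Minimal sketch of adaptive truncation: virtual rollouts in a learned
# model are cut short once the model ensemble's disagreement (a proxy for
# model error) crosses a threshold. All names are illustrative assumptions.
import numpy as np


def virtual_rollout(ensemble, policy, s, max_len=50, tau=0.05):
    """Roll out in the learned model, truncating when uncertainty is high."""
    traj = []
    for _ in range(max_len):
        a = policy(s)
        preds = np.stack([m(s, a) for m in ensemble])  # per-member predictions
        if preds.std(axis=0).mean() > tau:  # members disagree too much
            break  # adaptively truncate the virtual interaction
        s_next = preds.mean(axis=0)
        traj.append((s, a, s_next))
        s = s_next
    return traj
```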
- Online reinforcement learning with sparse rewards through an active inference capsule [62.997667081978825]
This paper introduces an active inference agent which minimizes the novel free energy of the expected future.
Our model is capable of solving sparse-reward problems with a very high sample efficiency.
We also introduce a novel method for approximating the prior model from the reward function, which simplifies the expression of complex objectives.
arXiv Detail & Related papers (2021-06-04T10:03:36Z)
- Model-based versus Model-free Deep Reinforcement Learning for Autonomous Racing Cars [46.64253693115981]
This paper investigates how model-based deep reinforcement learning agents generalize to real-world autonomous vehicle control tasks.
We show that model-based agents capable of learning in imagination substantially outperform model-free agents with respect to performance, sample efficiency, successful task completion, and generalization.
arXiv Detail & Related papers (2021-03-08T17:15:23Z)
- On the model-based stochastic value gradient for continuous reinforcement learning [50.085645237597056]
We show that simple model-based agents can outperform state-of-the-art model-free agents in terms of both sample-efficiency and final reward.
Our findings suggest that model-based policy evaluation deserves closer attention.
arXiv Detail & Related papers (2020-08-28T17:58:29Z)
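For the stochastic value gradient entry above, a compact illustration of the core mechanism: with a reparameterized policy and a differentiable learned model, an H-step return can be backpropagated directly into the policy parameters. The components below are toy stand-ins, not the paper's implementation.

```python
# Compact, hedged illustration of a model-based stochastic value gradient:
# with a reparameterized policy and a differentiable learned model, an
# H-step return is backpropagated directly into the policy parameters.
# `policy`, `dynamics`, `reward_fn`, and `value_fn` are toy stand-ins.
import torch


def svg_loss(policy, dynamics, reward_fn, value_fn, s, horizon=5):
    """Negative H-step return, differentiable w.r.t. policy parameters."""
    ret = 0.0
    for _ in range(horizon):
        mu, std = policy(s)
        a = mu + std * torch.randn_like(std)  # reparameterization trick
        ret = ret + reward_fn(s, a)
        s = dynamics(s, a)  # differentiable model step
    ret = ret + value_fn(s)  # bootstrap with a terminal value estimate
    return -ret.mean()  # minimize to ascend the value gradient
```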
This list is automatically generated from the titles and abstracts of the papers on this site.