Finetuning Generative Trajectory Model with Reinforcement Learning from Human Feedback
- URL: http://arxiv.org/abs/2503.10434v1
- Date: Thu, 13 Mar 2025 14:56:17 GMT
- Title: Finetuning Generative Trajectory Model with Reinforcement Learning from Human Feedback
- Authors: Derun Li, Jianwei Ren, Yue Wang, Xin Wen, Pengxiang Li, Leimeng Xu, Kun Zhan, Zhongpu Xia, Peng Jia, Xianpeng Lang, Ningyi Xu, Hang Zhao
- Abstract summary: We introduce TrajHF, a human feedback-driven finetuning framework for generative trajectory models. TrajHF refines multi-modal trajectory generation beyond conventional imitation learning. It achieves PDMS of 93.95 on the NavSim benchmark, significantly exceeding other methods.
- Score: 33.09982089166203
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating human-like and adaptive trajectories is essential for autonomous driving in dynamic environments. While generative models have shown promise in synthesizing feasible trajectories, they often fail to capture the nuanced variability of human driving styles due to dataset biases and distributional shifts. To address this, we introduce TrajHF, a human feedback-driven finetuning framework for generative trajectory models, designed to align motion planning with diverse driving preferences. TrajHF incorporates a multi-conditional denoiser and reinforcement learning with human feedback to refine multi-modal trajectory generation beyond conventional imitation learning. This enables better alignment with human driving preferences while maintaining safety and feasibility constraints. TrajHF achieves PDMS of 93.95 on the NavSim benchmark, significantly exceeding other methods. TrajHF sets a new paradigm for personalized and adaptable trajectory generation in autonomous driving.
Related papers
- DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving [55.53171248839489]
We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving. Specifically, DiFSD mainly consists of sparse perception, hierarchical interaction, and an iterative motion planner. Experiments conducted on nuScenes and Bench2Drive datasets demonstrate the superior planning performance and great efficiency of DiFSD.
arXiv Detail & Related papers (2024-09-15T15:55:24Z)
- MetaFollower: Adaptable Personalized Autonomous Car Following [63.90050686330677]
We propose an adaptable personalized car-following framework - MetaFollower.
We first utilize Model-Agnostic Meta-Learning (MAML) to extract common driving knowledge from various CF events.
We additionally combine Long Short-Term Memory (LSTM) and Intelligent Driver Model (IDM) to reflect temporal heterogeneity with high interpretability.
arXiv Detail & Related papers (2024-06-23T15:30:40Z)
- MobilityGPT: Enhanced Human Mobility Modeling with a GPT model [12.01839817432357]
We reformat human mobility modeling as an autoregressive generation task to address these issues.
We propose a geospatially-aware generative model, MobilityGPT, to ensure its controllable generation.
Experiments on real-world datasets demonstrate MobilityGPT's superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2024-02-05T18:22:21Z)
- RACER: Rational Artificial Intelligence Car-following-model Enhanced by Reality [51.244807332133696]
This paper introduces RACER, a cutting-edge deep learning car-following model to predict Adaptive Cruise Control (ACC) driving behavior.
Unlike conventional models, RACER effectively integrates Rational Driving Constraints (RDCs), crucial tenets of actual driving.
RACER excels across key metrics, such as acceleration, velocity, and spacing, registering zero violations.
arXiv Detail & Related papers (2023-12-12T06:21:30Z)
- Integrating Higher-Order Dynamics and Roadway-Compliance into Constrained ILQR-based Trajectory Planning for Autonomous Vehicles [3.200238632208686]
Trajectory planning aims to produce a globally optimal route for Autonomous Passenger Vehicles.
Existing implementations utilizing the vehicle bicycle kinematic model may not guarantee controllable trajectories.
We augment this model with higher-order terms, including the first and second-order derivatives of curvature and longitudinal jerk.
arXiv Detail & Related papers (2023-09-25T22:30:18Z)
- Interaction-Aware Personalized Vehicle Trajectory Prediction Using Temporal Graph Neural Networks [8.209194305630229]
Existing methods mainly rely on generic trajectory predictions from large datasets.
We propose an approach for interaction-aware personalized vehicle trajectory prediction that incorporates temporal graph neural networks.
arXiv Detail & Related papers (2023-08-14T20:20:26Z)
- Continuous Trajectory Generation Based on Two-Stage GAN [50.55181727145379]
We propose a novel two-stage generative adversarial framework to generate the continuous trajectory on the road network.
Specifically, we build the generator under the human mobility hypothesis of the A* algorithm to learn the human mobility behavior.
For the discriminator, we combine the sequential reward with the mobility yaw reward to enhance the effectiveness of the generator.
arXiv Detail & Related papers (2023-01-16T09:54:02Z)
- TrajGen: Generating Realistic and Diverse Trajectories with Reactive and Feasible Agent Behaviors for Autonomous Driving [19.06020265777298]
Existing simulators rely on system-based behavior models for background vehicles, which cannot capture the complex interactive behaviors in real-world scenarios.
We propose TrajGen, a two-stage trajectory generation framework, which can capture more realistic behaviors directly from human demonstration.
In addition, we develop a data-driven simulator I-Sim that can be used to train reinforcement learning models in parallel based on naturalistic driving data.
arXiv Detail & Related papers (2022-03-31T04:48:29Z)
- Formulation and validation of a car-following model based on deep reinforcement learning [0.0]
We propose and validate a novel car following model based on deep reinforcement learning.
Our model is trained to maximize externally given reward functions for the free and car-following regimes.
The parameters of these reward functions resemble those of traditional models such as the Intelligent Driver Model.
arXiv Detail & Related papers (2021-09-29T08:27:12Z)
- Haar Wavelet based Block Autoregressive Flows for Trajectories [129.37479472754083]
Prediction of trajectories such as that of pedestrians is crucial to the performance of autonomous agents.
We introduce a novel Haar wavelet based block autoregressive model leveraging split couplings.
We illustrate the advantages of our approach for generating diverse and accurate trajectories on two real-world datasets.
arXiv Detail & Related papers (2020-09-21T13:57:10Z)
- Path Planning Followed by Kinodynamic Smoothing for Multirotor Aerial Vehicles (MAVs) [61.94975011711275]
We propose a geometrically based motion planning technique, "RRT*", for this purpose.
In the proposed technique, we modified the original RRT* by introducing an adaptive search space and a steering function.
We have tested the proposed technique in various simulated environments.
arXiv Detail & Related papers (2020-08-29T09:55:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.