RAPiD: Real-time Deterministic Trajectory Planning via Diffusion Behavior Priors for Safe and Efficient Autonomous Driving
- URL: http://arxiv.org/abs/2602.07339v1
- Date: Sat, 07 Feb 2026 03:44:50 GMT
- Title: RAPiD: Real-time Deterministic Trajectory Planning via Diffusion Behavior Priors for Safe and Efficient Autonomous Driving
- Authors: Ruturaj Reddy, Hrishav Bakul Barua, Junn Yong Loo, Thanh Thi Nguyen, Ganesh Krishnasamy,
- Abstract summary: RAPiD is a deterministic policy extraction framework that distills a pretrained diffusion-based planner into an efficient policy. To promote safety and passenger comfort, the policy is optimized using a critic trained to imitate a predictive driver controller.
- Score: 5.030754278104693
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion-based trajectory planners have demonstrated strong capability for modeling the multimodal nature of human driving behavior, but their reliance on iterative stochastic sampling poses critical challenges for real-time, safety-critical deployment. In this work, we present RAPiD, a deterministic policy extraction framework that distills a pretrained diffusion-based planner into an efficient policy while eliminating diffusion sampling. Using score-regularized policy optimization, we leverage the score function of a pre-trained diffusion planner as a behavior prior to regularize policy learning. To promote safety and passenger comfort, the policy is optimized using a critic trained to imitate a predictive driver controller, providing dense, safety-focused supervision beyond conventional imitation learning. Evaluations demonstrate that RAPiD achieves competitive performance on closed-loop nuPlan scenarios with an 8x speedup over diffusion baselines, while achieving state-of-the-art generalization among learning-based planners on the interPlan benchmark. The official website of this work is: https://github.com/ruturajreddy/RAPiD.
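As a rough illustration of the score-regularized policy optimization mentioned in the abstract, the toy sketch below updates a deterministic policy using the critic's action gradient plus the behavior prior's score, weighted by 1/beta. Everything in it is an assumption made for illustration and is not taken from the RAPiD code: the action is a single scalar, the behavior prior is a fixed Gaussian with an analytic score (standing in for a pretrained diffusion planner's score function), the critic is a hand-written quadratic, and the names `behavior_score`, `critic_grad`, `extract_policy`, and `closed_form` are hypothetical.

```python
def behavior_score(action, prior_mean=0.0, prior_std=1.0):
    """Score d/da log N(a; mean, std^2) of a toy Gaussian behavior prior.

    Assumption: stands in for the score of a pretrained diffusion planner.
    """
    return -(action - prior_mean) / prior_std ** 2


def critic_grad(action, target=2.0):
    """Action gradient of a toy critic Q(a) = -(a - target)^2 (assumed form)."""
    return -2.0 * (action - target)


def extract_policy(beta=1.0, lr=0.1, steps=200, init=0.0):
    """Gradient ascent on Q(a) + (1/beta) * log prior(a) for a scalar policy.

    The policy here is one state-independent parameter theta, so the policy
    Jacobian is 1 and the update is just the sum of the two gradients.
    """
    theta = init
    for _ in range(steps):
        theta += lr * (critic_grad(theta) + behavior_score(theta) / beta)
    return theta


def closed_form(beta=1.0, target=2.0, prior_mean=0.0, prior_std=1.0):
    """Stationary point of the same toy objective, for comparison."""
    weight = 1.0 / (beta * prior_std ** 2)
    return (2.0 * target + weight * prior_mean) / (2.0 + weight)


if __name__ == "__main__":
    print(extract_policy(), closed_form())
```

In this toy setting a small beta keeps the extracted action close to the prior mean, while a large beta lets the critic dominate. No sampling loop is needed once the policy is trained, which is the property the abstract attributes to deterministic policy extraction.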
Related papers
- TD-JEPA: Latent-predictive Representations for Zero-Shot Reinforcement Learning [63.73629127832652]
We introduce TD-JEPA, which leverages TD-based latent-predictive representations into unsupervised RL. TD-JEPA trains explicit state and task encoders, a policy-conditioned multi-step predictor, and a set of parameterized policies directly in latent space. Empirically, TD-JEPA matches or outperforms state-of-the-art baselines on locomotion, navigation, and manipulation tasks across 13 datasets.
arXiv Detail & Related papers (2025-10-01T10:21:18Z)
- DriveDPO: Policy Learning via Safety DPO For End-to-End Autonomous Driving [31.336758241051374]
DriveDPO is a Safety Direct Preference Optimization Policy Learning framework. We distill a unified policy distribution from human imitation similarity and rule-based safety scores for direct policy optimization. Experiments on the NAVSIM benchmark demonstrate that DriveDPO achieves a new state-of-the-art PDMS of 90.0.
arXiv Detail & Related papers (2025-09-22T16:01:11Z)
- Drive As You Like: Strategy-Level Motion Planning Based on A Multi-Head Diffusion Model [7.3078271605135114]
We propose a diffusion-based multi-head trajectory planner (M-diffusion planner). During the early training stage, all output heads share weights to learn to generate high-quality trajectories. We incorporate a large language model (LLM) to guide strategy selection, enabling dynamic, instruction-aware planning.
arXiv Detail & Related papers (2025-08-23T08:33:11Z)
- Plan-R1: Safe and Feasible Trajectory Planning as Language Modeling [74.41886258801209]
We propose a two-stage trajectory planning framework that decouples principle alignment from behavior learning. Plan-R1 significantly improves planning safety and feasibility, achieving state-of-the-art performance.
arXiv Detail & Related papers (2025-05-23T09:22:19Z)
- Efficient Safety Alignment of Large Language Models via Preference Re-ranking and Representation-based Reward Modeling [84.00480999255628]
Reinforcement Learning algorithms for safety alignment of Large Language Models (LLMs) encounter the challenge of distribution shift. Current approaches typically address this issue through online sampling from the target policy. We propose a new framework that leverages the model's intrinsic safety judgment capability to extract reward signals.
arXiv Detail & Related papers (2025-03-13T06:40:34Z)
- Diffusion-Based Planning for Autonomous Driving with Flexible Guidance [19.204115959760788]
We propose a novel transformer-based Diffusion Planner for closed-loop planning. Our model supports joint modeling of both prediction and planning tasks. It achieves state-of-the-art closed-loop performance with robust transferability in diverse driving styles.
arXiv Detail & Related papers (2025-01-26T15:49:50Z)
- SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework.
Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations.
We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z)
- Integration of Reinforcement Learning Based Behavior Planning With Sampling Based Motion Planning for Automated Driving [0.5801044612920815]
We propose a method to employ a trained deep reinforcement learning policy for dedicated high-level behavior planning.
To the best of our knowledge, this work is the first to apply deep reinforcement learning in this manner.
arXiv Detail & Related papers (2023-04-17T13:49:55Z)
- Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning [70.20191211010847]
Offline reinforcement learning (RL) aims to learn an optimal policy using a previously collected static dataset.
We introduce Diffusion Q-learning (Diffusion-QL) that utilizes a conditional diffusion model to represent the policy.
We show that our method can achieve state-of-the-art performance on the majority of the D4RL benchmark tasks.
arXiv Detail & Related papers (2022-08-12T09:54:11Z)
- The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well defined geometries, topologies, and traffic rules.
In this paper we propose to incorporate structured priors as a loss function.
We demonstrate the effectiveness of our approach on real-world self-driving datasets.
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.