Related papers: Autonomous Vehicle Controllers From End-to-End Differentiable Simulation

Autonomous Vehicle Controllers From End-to-End Differentiable Simulation

URL: http://arxiv.org/abs/2409.07965v1
Date: Thu, 12 Sep 2024 11:50:06 GMT
Title: Autonomous Vehicle Controllers From End-to-End Differentiable Simulation
Authors: Asen Nachkov, Danda Pani Paudel, Luc Van Gool,
Abstract summary: We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers. Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy. We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
Score: 60.05963742334746
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Current methods to learn controllers for autonomous vehicles (AVs) focus on behavioural cloning. Being trained only on exact historic data, the resulting agents often generalize poorly to novel scenarios. Simulators provide the opportunity to go beyond offline datasets, but they are still treated as complicated black boxes, only used to update the global simulation state. As a result, these RL algorithms are slow, sample-inefficient, and prior-agnostic. In this work, we leverage a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers on the large-scale Waymo Open Motion Dataset. Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of the environment dynamics serve as a useful prior to help the agent learn a more grounded policy. We combine this setup with a recurrent architecture that can efficiently propagate temporal information across long simulated trajectories. This APG method allows us to learn robust, accurate, and fast policies, while only requiring widely-available expert trajectories, instead of scarce expert actions. We compare to behavioural cloning and find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.

Related papers

Diverse Controllable Diffusion Policy with Signal Temporal Logic [13.154661571539577]
We leverage Signal Temporal Logic (STL) and Diffusion Models to learn controllable, diverse, and rule-aware policy. In the closed-loop testing, our approach reaches the highest diversity, rule satisfaction rate, and the least collision rate. A case study on human-robot encounter scenarios shows our approach can generate diverse and closed-to-oracle trajectories.
arXiv Detail & Related papers (2025-03-04T18:59:00Z)
Gradient-based Trajectory Optimization with Parallelized Differentiable Traffic Simulation [24.95575815501035]
We present a parallelized differentiable traffic simulator based on the Intelligent Driver Model (IDM) Our vehicle simulator efficiently models vehicle motion, generating trajectories that can be supervised to fit real-world data. We show that we can use the simulator to filter noise in the input trajectories (trajectory filtering), reconstruct dense trajectories from sparse ones (trajectory reconstruction), and predict future trajectories.
arXiv Detail & Related papers (2024-12-21T19:53:38Z)
Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks [93.38375271826202]
We present a method to improve generalization and robustness to distribution shifts in sim-to-real visual quadrotor navigation tasks. We first build a simulator by integrating Gaussian splatting with quadrotor flight dynamics, and then, train robust navigation policies using Liquid neural networks. In this way, we obtain a full-stack imitation learning protocol that combines advances in 3D Gaussian splatting radiance field rendering, programming of expert demonstration training data, and the task understanding capabilities of Liquid networks.
arXiv Detail & Related papers (2024-06-21T13:48:37Z)
CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning [38.63187494867502]
CtRL-Sim is a method that leverages return-conditioned offline reinforcement learning (RL) to efficiently generate reactive and controllable traffic agents. We show that CtRL-Sim can generate realistic safety-critical scenarios while providing fine-grained control over agent behaviours.
arXiv Detail & Related papers (2024-03-29T02:10:19Z)
Learning to Fly in Seconds [7.259696592534715]
We show how curriculum learning and a highly optimized simulator enhance sample complexity and lead to fast training times. Our framework enables Simulation-to-Reality (Sim2Real) transfer for direct control after only 18 seconds of training on a consumer-grade laptop.
arXiv Detail & Related papers (2023-11-22T01:06:45Z)
Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes. It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training. We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z)
DTC: Deep Tracking Control [16.2850135844455]
We propose a hybrid control architecture that combines the advantages of both worlds to achieve greater robustness, foot-placement accuracy, and terrain generalization. A deep neural network policy is trained in simulation, aiming to track the optimized footholds. We demonstrate superior robustness in the presence of slippery or deformable ground when compared to model-based counterparts.
arXiv Detail & Related papers (2023-09-27T07:57:37Z)
Rethinking Closed-loop Training for Autonomous Driving [82.61418945804544]
We present the first empirical study which analyzes the effects of different training benchmark designs on the success of learning agents. We propose trajectory value learning (TRAVL), an RL-based driving agent that performs planning with multistep look-ahead. Our experiments show that TRAVL can learn much faster and produce safer maneuvers compared to all the baselines.
arXiv Detail & Related papers (2023-06-27T17:58:39Z)
Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone. Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator. We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation. In particular, we leverage an implicit latent variable model to parameterize a joint actor policy. We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.