Pixel State Value Network for Combined Prediction and Planning in
Interactive Environments
- URL: http://arxiv.org/abs/2310.07706v1
- Date: Wed, 11 Oct 2023 17:57:13 GMT
- Title: Pixel State Value Network for Combined Prediction and Planning in
Interactive Environments
- Authors: Sascha Rosbach, Stefan M. Leupold, Simon Großjohann and Stefan Roth
- Abstract summary: This work proposes a deep learning methodology to combine prediction and planning.
A conditional GAN with the U-Net architecture is trained to predict two high-resolution image sequences.
Results demonstrate intuitive behavior in complex situations, such as lane changes amidst conflicting objectives.
- Score: 9.117828575880303
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated vehicles operating in urban environments have to reliably interact
with other traffic participants. Planning algorithms often utilize separate
prediction modules forecasting probabilistic, multi-modal, and interactive
behaviors of objects. Designing prediction and planning as two separate modules
introduces significant challenges, particularly due to the interdependence of
these modules. This work proposes a deep learning methodology to combine
prediction and planning. A conditional GAN with the U-Net architecture is
trained to predict two high-resolution image sequences. The sequences represent
explicit motion predictions, mainly used to train context understanding, and
pixel state values suitable for planning encoding kinematic reachability,
object dynamics, safety, and driving comfort. The model can be trained offline
on target images rendered by a sampling-based model-predictive planner,
leveraging real-world driving data. Our results demonstrate intuitive behavior
in complex situations, such as lane changes amidst conflicting objectives.
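The abstract does not say how a planner would consume the predicted pixel state values. As a rough illustration only, the sketch below assumes the values arrive as one 2D grid per future time step and that candidate trajectories are given as pixel coordinates; a planner could then sum the values along each candidate and keep the highest-scoring one. The names `score_trajectory` and `select_best`, the grid layout, and the summation rule are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: value_maps[t][row][col] is the predicted value of
# occupying pixel (row, col) at future time step t.
ValueMaps = Sequence[Sequence[Sequence[float]]]
Trajectory = Sequence[Tuple[int, int]]  # one (row, col) per time step


def score_trajectory(value_maps: ValueMaps, trajectory: Trajectory) -> float:
    """Sum the pixel values visited by one candidate trajectory."""
    total = 0.0
    for t, (row, col) in enumerate(trajectory):
        frame = value_maps[t]
        # Clamp to the image bounds so off-grid samples do not raise.
        row = min(max(row, 0), len(frame) - 1)
        col = min(max(col, 0), len(frame[0]) - 1)
        total += frame[row][col]
    return total


def select_best(
    value_maps: ValueMaps, trajectories: Sequence[Trajectory]
) -> Tuple[int, List[float]]:
    """Return the index of the highest-scoring candidate and all scores."""
    scores = [score_trajectory(value_maps, tr) for tr in trajectories]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy data (assumed, not from the paper): three 4x4 frames in which
# row 1 acts as a high-value lane.
value_maps = [
    [[1.0 if r == 1 else 0.0 for _ in range(4)] for r in range(4)]
    for _ in range(3)
]
trajectories = [
    [(0, 0), (0, 1), (0, 2)],  # stays in the low-value row
    [(1, 0), (1, 1), (1, 2)],  # follows the high-value row
]
best, scores = select_best(value_maps, trajectories)
```

In this toy setup the second candidate wins because every pixel it visits carries a value of 1.0. A real planner would presumably also need to handle sub-pixel positions and vehicle footprint, which this sketch leaves out.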
Related papers
- DeepInteraction++: Multi-Modality Interaction for Autonomous Driving [80.8837864849534]
We introduce a novel modality interaction strategy that allows individual per-modality representations to be learned and maintained throughout.
DeepInteraction++ is a multi-modal interaction framework characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder.
Experiments demonstrate the superior performance of the proposed framework on both 3D object detection and end-to-end autonomous driving tasks.
arXiv Detail & Related papers (2024-08-09T14:04:21Z)
- SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation [11.011219709863875]
We propose a new end-to-end autonomous driving paradigm named SparseDrive.
SparseDrive consists of a symmetric sparse perception module and a parallel motion planner.
For motion prediction and planning, we review the great similarity between these two tasks, leading to a parallel design for motion planner.
arXiv Detail & Related papers (2024-05-30T02:13:56Z)
- PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving [57.89801036693292]
PPAD (Iterative Interaction of Prediction and Planning Autonomous Driving) considers the timestep-wise interaction to better integrate prediction and planning.
We design ego-to-agent, ego-to-map, and ego-to-BEV interaction mechanisms with hierarchical dynamic key objects attention to better model the interactions.
arXiv Detail & Related papers (2023-11-14T11:53:24Z)
- Conditioned Human Trajectory Prediction using Iterative Attention Blocks [70.36888514074022]
We present a simple yet effective pedestrian trajectory prediction model aimed at predicting pedestrian positions in urban-like environments.
Our model is a neural-based architecture that can run several layers of attention blocks and transformers in an iterative sequential fashion.
We show that without explicit introduction of social masks, dynamical models, social pooling layers, or complicated graph-like structures, it is possible to produce results on par with SoTA models.
arXiv Detail & Related papers (2022-06-29T07:49:48Z)
- Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models [162.21629604674388]
This work presents a game-theoretic Model Predictive Controller (MPC) that uses a novel interactive multi-agent neural network policy as part of its predictive model.
Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.
arXiv Detail & Related papers (2022-04-05T17:58:18Z)
- Scene Transformer: A unified multi-task model for behavior prediction and planning [42.758178896204036]
We formulate a model for predicting the behavior of all agents jointly in real-world driving environments.
Inspired by recent language modeling approaches, we use a masking strategy as the query to our model.
We evaluate our approach on autonomous driving datasets for behavior prediction, and achieve state-of-the-art performance.
arXiv Detail & Related papers (2021-06-15T20:20:44Z)
- Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures.
We adopt a novel multi-instance event prediction module to estimate the possible interaction among agents in the ego-centric view.
We design a sequential action sampling strategy to better leverage predicted states on both scene-level and instance-level.
arXiv Detail & Related papers (2021-01-14T22:21:25Z)
- Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data.
We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)
- Scenario-Transferable Semantic Graph Reasoning for Interaction-Aware Probabilistic Prediction [29.623692599892365]
Accurately predicting the possible behaviors of traffic participants is an essential capability for autonomous vehicles.
We propose a novel generic representation for various driving environments by taking advantage of semantics and domain knowledge.
arXiv Detail & Related papers (2020-04-07T00:34:36Z)
- Social-WaGDAT: Interaction-aware Trajectory Prediction via Wasserstein Graph Double-Attention Network [29.289670231364788]
In this paper, we propose a generic generative neural system for multi-agent trajectory prediction.
We also employ an efficient kinematic constraint layer applied to vehicle trajectory prediction.
The proposed system is evaluated on three public benchmark datasets for trajectory prediction.
arXiv Detail & Related papers (2020-02-14T20:11:13Z)