ParkPredict+: Multimodal Intent and Motion Prediction for Vehicles in
Parking Lots with CNN and Transformer
- URL: http://arxiv.org/abs/2204.10777v1
- Date: Sun, 17 Apr 2022 01:54:25 GMT
- Title: ParkPredict+: Multimodal Intent and Motion Prediction for Vehicles in
Parking Lots with CNN and Transformer
- Authors: Xu Shen, Matthew Lacayo, Nidhir Guggilla, Francesco Borrelli
- Abstract summary: Multimodal intent and trajectory prediction for human-driven vehicles in parking lots is addressed in this paper.
Using models designed with CNN and Transformer networks, we extract temporal-spatial and contextual information from trajectory history and local bird's eye view semantic images.
Our method outperforms existing models in accuracy while allowing an arbitrary number of modes.
In addition, we present the first public human driving dataset collected in a parking lot, with high resolution and rich traffic scenarios.
- Score: 11.287187018907284
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of multimodal intent and trajectory prediction for human-driven
vehicles in parking lots is addressed in this paper. Using models designed with
CNN and Transformer networks, we extract temporal-spatial and contextual
information from trajectory history and local bird's eye view (BEV) semantic
images, and generate predictions about intent distribution and future
trajectory sequences. Our method outperforms existing models in accuracy,
while allowing an arbitrary number of modes, encoding complex multi-agent
scenarios, and adapting to different parking maps. In addition, we present the
first public human driving dataset collected in a parking lot, with high resolution and rich
traffic scenarios for relevant research in this field.
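To make the described pipeline concrete, below is a minimal sketch of a CNN + Transformer predictor in the spirit of the abstract: a CNN encodes the local BEV semantic image, a Transformer encoder processes the trajectory history, and two heads output an intent distribution over a configurable number of modes plus one future trajectory per mode. This is not the authors' released implementation; the module names, feature dimensions, and input conventions (e.g. `ParkingPredictor`, a 4-dimensional history state of x, y, heading, speed) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a CNN + Transformer
# intent/trajectory predictor; all dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class ParkingPredictor(nn.Module):
    def __init__(self, hist_len=10, pred_len=20, d_model=128, n_modes=5):
        super().__init__()
        # CNN encoder for the local BEV semantic image (3 x H x W).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, d_model),
        )
        # Transformer encoder over the (x, y, heading, speed) trajectory history.
        self.traj_embed = nn.Linear(4, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        # Heads: a categorical intent distribution over n_modes candidate
        # intents, and one future (x, y) trajectory per mode.
        self.intent_head = nn.Linear(2 * d_model, n_modes)
        self.traj_head = nn.Linear(2 * d_model, n_modes * pred_len * 2)
        self.pred_len, self.n_modes = pred_len, n_modes

    def forward(self, bev_img, traj_hist):
        # bev_img: (B, 3, H, W), traj_hist: (B, hist_len, 4)
        img_feat = self.cnn(bev_img)                          # (B, d_model)
        hist_feat = self.encoder(self.traj_embed(traj_hist))  # (B, T, d_model)
        hist_feat = hist_feat.mean(dim=1)                     # (B, d_model)
        fused = torch.cat([img_feat, hist_feat], dim=-1)      # (B, 2*d_model)
        intent_logits = self.intent_head(fused)               # (B, n_modes)
        trajs = self.traj_head(fused).view(-1, self.n_modes, self.pred_len, 2)
        return intent_logits.softmax(-1), trajs


# Example usage with random tensors:
model = ParkingPredictor()
p_intent, trajs = model(torch.randn(2, 3, 128, 128), torch.randn(2, 10, 4))
print(p_intent.shape, trajs.shape)  # (2, 5), (2, 5, 20, 2)
```

Because the intent head is a fixed-size categorical output and each mode has its own trajectory head slice, the number of predicted modes is a constructor argument rather than a hard architectural limit, which is one plausible reading of the "arbitrary number of modes" claim.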
Related papers
- Multi-Transmotion: Pre-trained Model for Human Motion Prediction [68.87010221355223]
Multi-Transmotion is an innovative transformer-based model designed for cross-modality pre-training.
Our methodology demonstrates competitive performance across various datasets on several downstream tasks.
arXiv Detail & Related papers (2024-11-04T23:15:21Z)
- Street-View Image Generation from a Bird's-Eye View Layout [95.36869800896335]
Bird's-Eye View (BEV) Perception has received increasing attention in recent years.
Data-driven simulation for autonomous driving has been a focal point of recent research.
We propose BEVGen, a conditional generative model that synthesizes realistic and spatially consistent surrounding images.
arXiv Detail & Related papers (2023-01-11T18:39:34Z)
- Pedestrian Stop and Go Forecasting with Hybrid Feature Fusion [87.77727495366702]
We introduce the new task of pedestrian stop and go forecasting.
Considering the lack of suitable existing datasets for it, we release TRANS, a benchmark for explicitly studying the stop and go behaviors of pedestrians in urban traffic.
We build it from several existing datasets annotated with pedestrians' walking motions, in order to have various scenarios and behaviors.
arXiv Detail & Related papers (2022-03-04T18:39:31Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- Multi-Modal Hybrid Architecture for Pedestrian Action Prediction [14.032334569498968]
We propose a novel multi-modal prediction algorithm that incorporates different sources of information captured from the environment to predict future crossing actions of pedestrians.
Using the existing 2D pedestrian behavior benchmarks and a newly annotated 3D driving dataset, we show that our proposed model achieves state-of-the-art performance in pedestrian crossing prediction.
arXiv Detail & Related papers (2020-11-16T15:17:58Z)
- Vehicle Trajectory Prediction in Crowded Highway Scenarios Using Bird Eye View Representations and CNNs [0.0]
This paper describes a novel approach to vehicle trajectory prediction employing graphical representations.
The problem is posed as an image-to-image regression task, training the network to learn the underlying relations between the traffic participants.
The model has been tested in highway scenarios with more than 30 vehicles simultaneously in two opposite traffic flow streams.
arXiv Detail & Related papers (2020-08-26T11:15:49Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
- Probabilistic Crowd GAN: Multimodal Pedestrian Trajectory Prediction using a Graph Vehicle-Pedestrian Attention Network [12.070251470948772]
We show how Probabilistic Crowd GAN can output probabilistic multimodal predictions.
We also propose the use of Graph Vehicle-Pedestrian Attention Network (GVAT), which models social interactions.
We demonstrate improvements over existing state-of-the-art methods for trajectory prediction and illustrate how the truly multimodal and uncertain nature of crowd interactions can be directly modelled.
arXiv Detail & Related papers (2020-06-23T11:25:16Z)
- AMENet: Attentive Maps Encoder Network for Trajectory Prediction [35.22312783822563]
Trajectory prediction is critical for planning safe future movements.
We propose an end-to-end generative model named Attentive Maps Encoder Network (AMENet).
AMENet encodes the agent's motion and interaction information for accurate and realistic multi-path trajectory prediction.
arXiv Detail & Related papers (2020-06-15T10:00:07Z)
- ParkPredict: Motion and Intent Prediction of Vehicles in Parking Lots [65.33650222396078]
We develop a parking lot environment and collect a dataset of human parking maneuvers.
We compare a multi-modal Long Short-Term Memory (LSTM) prediction model and a Convolutional Neural Network LSTM (CNN-LSTM) to a physics-based Extended Kalman Filter (EKF) baseline.
Our results show that 1) intent can be estimated well (roughly 85% top-1 accuracy and nearly 100% top-3 accuracy with the LSTM and CNN-LSTM model); 2) knowledge of the human driver's intended parking spot has a major impact on predicting parking trajectory; and 3) the semantic representation of the environment
arXiv Detail & Related papers (2020-04-21T20:46:32Z)
- MCENET: Multi-Context Encoder Network for Homogeneous Agent Trajectory Prediction in Mixed Traffic [35.22312783822563]
Trajectory prediction in urban mixed-traffic zones is critical for many intelligent transportation systems.
We propose an approach named Multi-Context Encoder Network (MCENet) that is trained by encoding both past and future scene context.
At inference time, we combine the past context and motion information of the target agent with samples of the latent variables to predict multiple realistic trajectories.
arXiv Detail & Related papers (2020-02-14T11:04:41Z)