Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset
- URL: http://arxiv.org/abs/2104.10133v1
- Date: Tue, 20 Apr 2021 17:19:05 GMT
- Title: Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset
- Authors: Scott Ettinger, Shuyang Cheng, Benjamin Caine, Chenxi Liu, Hang Zhao,
Sabeek Pradhan, Yuning Chai, Ben Sapp, Charles Qi, Yin Zhou, Zoey Yang,
Aurelien Chouard, Pei Sun, Jiquan Ngiam, Vijay Vasudevan, Alexander McCauley,
Jonathon Shlens, Dragomir Anguelov
- Abstract summary: With over 100,000 scenes, each 20 seconds long at 10 Hz, our new dataset contains more than 570 hours of unique data over 1750 km of roadways.
We use a high-accuracy 3D auto-labeling system to generate high quality 3D bounding boxes for each road agent.
We introduce a new set of metrics that provides a comprehensive evaluation of both single agent and joint agent interaction motion forecasting models.
- Score: 84.3946567650148
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
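The headline figures (over 100,000 scenes, each 20 seconds at 10 Hz, totaling more than 570 hours) can be sanity-checked with a quick calculation. The sketch below is illustrative; the exact scene count is an assumption, since the summary only states lower bounds.

```python
# Sanity check of the dataset scale reported in the summary above.
SCENE_LENGTH_S = 20   # each scene is 20 seconds long
SAMPLE_RATE_HZ = 10   # trajectories sampled at 10 Hz

def total_hours(num_scenes: int) -> float:
    """Total unique data in hours for a given number of 20 s scenes."""
    return num_scenes * SCENE_LENGTH_S / 3600.0

def frames_per_scene() -> int:
    """Number of sampled timesteps per scene."""
    return SCENE_LENGTH_S * SAMPLE_RATE_HZ

# 570 hours of data implies at least this many 20-second scenes:
min_scenes = int(570 * 3600 / SCENE_LENGTH_S)
print(min_scenes)          # 102600, consistent with "over 100,000 scenes"
print(frames_per_scene())  # 200
```

So "more than 570 hours" corresponds to roughly 102,600 scenes of 200 frames each, consistent with the stated "over 100,000 scenes".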
- Abstract: As autonomous driving systems mature, motion forecasting has received
increasing attention as a critical requirement for planning. Of particular
importance are interactive situations such as merges, unprotected turns, etc.,
where predicting individual object motion is not sufficient. Joint predictions
of multiple objects are required for effective route planning. There has been a
critical need for high-quality motion data that is rich in both interactions
and annotation to develop motion planning models. In this work, we introduce
the most diverse interactive motion dataset to our knowledge, and provide
specific labels for interacting objects suitable for developing joint
prediction models. With over 100,000 scenes, each 20 seconds long at 10 Hz, our
new dataset contains more than 570 hours of unique data over 1750 km of
roadways. It was collected by mining for interesting interactions between
vehicles, pedestrians, and cyclists across six cities within the United States.
We use a high-accuracy 3D auto-labeling system to generate high quality 3D
bounding boxes for each road agent, and provide corresponding high definition
3D maps for each scene. Furthermore, we introduce a new set of metrics that
provides a comprehensive evaluation of both single agent and joint agent
interaction motion forecasting models. Finally, we provide strong baseline
models for individual-agent prediction and joint prediction. We hope that this
new large-scale interactive motion dataset will provide new opportunities for
advancing motion forecasting models.
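The abstract does not spell out the metric definitions; as an illustrative assumption, two standard motion-forecasting metrics of the kind the paper evaluates, minADE and minFDE over K candidate trajectories, can be sketched as follows (these are the community's common formulations, not necessarily the paper's exact joint-agent variants):

```python
import numpy as np

def min_ade(pred: np.ndarray, gt: np.ndarray) -> float:
    """Minimum Average Displacement Error.

    pred: (K, T, 2) array of K candidate trajectories over T timesteps.
    gt:   (T, 2) ground-truth trajectory.
    Returns the mean L2 displacement of the best candidate.
    """
    # Per-candidate mean displacement from the ground truth, shape (K,)
    ade = np.linalg.norm(pred - gt[None], axis=-1).mean(axis=-1)
    return float(ade.min())

def min_fde(pred: np.ndarray, gt: np.ndarray) -> float:
    """Minimum Final Displacement Error: endpoint error of the best candidate."""
    fde = np.linalg.norm(pred[:, -1] - gt[-1], axis=-1)  # shape (K,)
    return float(fde.min())
```

Taking the minimum over candidates rewards models that place at least one hypothesis near the ground truth, which suits the multi-modal futures (merges, unprotected turns) that the dataset is mined for.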
Related papers
- Multi-Transmotion: Pre-trained Model for Human Motion Prediction [68.87010221355223]
Multi-Transmotion is an innovative transformer-based model designed for cross-modality pre-training.
Our methodology demonstrates competitive performance across various datasets on several downstream tasks.
arXiv Detail & Related papers (2024-11-04T23:15:21Z)
- Trajeglish: Traffic Modeling as Next-Token Prediction [67.28197954427638]
A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs.
We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios.
Our model tops the Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%.
arXiv Detail & Related papers (2023-12-07T18:53:27Z)
- The MI-Motion Dataset and Benchmark for 3D Multi-Person Motion Prediction [13.177817435234449]
3D multi-person motion prediction is a challenging task that involves modeling individual behaviors and interactions between people.
We introduce the Multi-Person Interaction Motion (MI-Motion) dataset, which includes skeleton sequences of multiple individuals collected by motion capture systems.
The dataset contains 167k frames of interacting people's skeleton poses and is categorized into 5 different activity scenes.
arXiv Detail & Related papers (2023-06-23T15:38:22Z)
- Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting [64.7364925689825]
Argoverse 2 (AV2) is a collection of three datasets for perception and forecasting research in the self-driving domain.
The Lidar dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose.
The Motion Forecasting dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene.
arXiv Detail & Related papers (2023-01-02T00:36:22Z)
- CRAT-Pred: Vehicle Trajectory Prediction with Crystal Graph Convolutional Neural Networks and Multi-Head Self-Attention [10.83642398981694]
CRAT-Pred is a trajectory prediction model that does not rely on map information.
The model achieves state-of-the-art performance with a significantly lower number of model parameters.
In addition to that, we quantitatively show that the self-attention mechanism is able to learn social interactions between vehicles, with the weights representing a measurable interaction score.
arXiv Detail & Related papers (2022-02-09T14:36:36Z)
- One Million Scenes for Autonomous Driving: ONCE Dataset [91.94189514073354]
We introduce the ONCE dataset for 3D object detection in the autonomous driving scenario.
The data is selected from 144 driving hours, which is 20x longer than the largest 3D autonomous driving dataset available.
We reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset.
arXiv Detail & Related papers (2021-06-21T12:28:08Z)
- PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D [10.580548257913843]
We propose a new pedestrian action prediction dataset created by adding per-frame 2D/3D bounding box and behavioral annotations to nuScenes.
In addition, we propose a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.
arXiv Detail & Related papers (2020-12-14T18:13:44Z)
- Graph-SIM: A Graph-based Spatiotemporal Interaction Modelling for Pedestrian Action Prediction [10.580548257913843]
We propose a novel graph-based model for predicting pedestrian crossing action.
We introduce a new dataset that provides 3D bounding box and pedestrian behavioural annotations for the existing nuScenes dataset.
Our approach achieves state-of-the-art performance by improving on various metrics by more than 15% in comparison to existing methods.
arXiv Detail & Related papers (2020-12-03T18:28:27Z)
- Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data.
We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)
- Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data [37.176411554794214]
Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation.
We present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents.
We demonstrate its performance on several challenging real-world trajectory forecasting datasets.
arXiv Detail & Related papers (2020-01-09T16:47:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.