Related papers: ROAD-Waymo: Action Awareness at Scale for Autonomous Driving

URL: http://arxiv.org/abs/2411.01683v2
Date: Fri, 08 Nov 2024 12:50:03 GMT
Title: ROAD-Waymo: Action Awareness at Scale for Autonomous Driving
Authors: Salman Khan, Izzeddin Teeti, Reza Javanmard Alitappeh, Mihaela C. Stoian, Eleonora Giunchiglia, Gurkirt Singh, Andrew Bradley, Fabio Cuzzolin
Abstract summary: ROAD-Waymo is an extensive dataset for the development and benchmarking of techniques for agent, action, location and event detection in road scenes. Considerably larger and more challenging than any existing dataset (and encompassing multiple cities), it comes with 198k annotated video frames, 54k agent tubes, 3.9M bounding boxes and a total of 12.4M labels.
Score: 17.531603453254434
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Autonomous Vehicle (AV) perception systems require more than simply seeing, via e.g., object detection or scene segmentation. They need a holistic understanding of what is happening within the scene for safe interaction with other road users. Few datasets exist for the purpose of developing and training algorithms to comprehend the actions of other road users. This paper presents ROAD-Waymo, an extensive dataset for the development and benchmarking of techniques for agent, action, location and event detection in road scenes, provided as a layer upon the (US) Waymo Open dataset. Considerably larger and more challenging than any existing dataset (and encompassing multiple cities), it comes with 198k annotated video frames, 54k agent tubes, 3.9M bounding boxes and a total of 12.4M labels. The integrity of the dataset has been confirmed and enhanced via a novel annotation pipeline designed for automatically identifying violations of requirements specifically designed for this dataset. As ROAD-Waymo is compatible with the original (UK) ROAD dataset, it provides the opportunity to tackle domain adaptation between real-world road scenarios in different countries within a novel benchmark: ROAD++.

Related papers

Highly Accurate and Diverse Traffic Data: The DeepScenario Open 3D Dataset [25.244956737443527]
We introduce the DeepScenario Open 3D dataset (DSC3D) of 6 degrees of freedom bounding box trajectories acquired through a novel monocular camera drone tracking pipeline. Our dataset includes more than 175,000 trajectories of 14 types of traffic participants and significantly exceeds existing datasets in terms of diversity and scale. We demonstrate its utility across multiple applications including motion prediction, motion planning, scenario mining, and generative reactive traffic agents.
arXiv Detail & Related papers (2025-04-24T08:43:48Z)
DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments [60.69159598130235]
We present a new dataset, DAVE, designed for evaluating perception methods with high representation of Vulnerable Road Users (VRUs). DAVE is a manually annotated dataset encompassing 16 diverse actor categories (spanning animals, humans, vehicles, etc.) and 16 action types (complex and rare cases like cut-ins, zigzag movement, U-turn, etc.). Our experiments show that existing methods suffer degradation in performance when evaluated on DAVE, highlighting its benefit for future video recognition research.
arXiv Detail & Related papers (2024-12-28T06:13:44Z)
IDD-X: A Multi-View Dataset for Ego-relative Important Object Localization and Explanation in Dense and Unstructured Traffic [35.23523738296173]
We present IDD-X, a large-scale dual-view driving video dataset. With 697K bounding boxes, 9K important object tracks, and 1-12 objects per video, IDD-X offers comprehensive ego-relative annotations for multiple important road objects. We also introduce custom-designed deep networks aimed at multiple important object localization and per-object explanation prediction.
arXiv Detail & Related papers (2024-04-12T16:00:03Z)
RSUD20K: A Dataset for Road Scene Understanding In Autonomous Driving [6.372000468173298]
RSUD20K is a new dataset for road scene understanding, comprising over 20K high-resolution images from the driving perspective on Bangladesh roads. Our work significantly improves upon previous efforts, providing detailed annotations and increased object complexity.
arXiv Detail & Related papers (2024-01-14T16:10:42Z)
Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view. To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving [67.09546127265034]
Road surface reconstruction helps to enhance the analysis and prediction of vehicle responses for motion planning and control systems. We introduce the Road Surface Reconstruction dataset, a real-world, high-resolution, and high-precision dataset collected with a specialized platform in diverse driving conditions. It covers common road types containing approximately 16,000 pairs of stereo images, original point clouds, and ground-truth depth/disparity maps.
arXiv Detail & Related papers (2023-10-03T17:59:32Z)
Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting [64.7364925689825]
Argoverse 2 (AV2) is a collection of three datasets for perception and forecasting research in the self-driving domain. The Lidar dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. The Motion Forecasting dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene.
arXiv Detail & Related papers (2023-01-02T00:36:22Z)
IDD-3D: Indian Driving Dataset for 3D Unstructured Road Scenes [79.18349050238413]
Preparation and training of deployable deep learning architectures require the models to be suited to different traffic scenarios. The unstructured and complex driving layouts found in several developing countries such as India pose a challenge to these models. We build a new dataset, IDD-3D, which consists of multi-modal data from multiple cameras and LiDAR sensors with 12k annotated driving LiDAR frames.
arXiv Detail & Related papers (2022-10-23T23:03:17Z)
Ithaca365: Dataset and Driving Perception under Repeated and Challenging Weather Conditions [0.0]
We present a new dataset to enable robust autonomous driving via a novel data collection process. The dataset includes images and point clouds from cameras and LiDAR sensors, along with high-precision GPS/INS. We demonstrate the uniqueness of this dataset by analyzing the performance of baselines in amodal segmentation of road and objects.
arXiv Detail & Related papers (2022-08-01T22:55:32Z)
CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving [117.87070488537334]
We introduce a challenging dataset named CODA that exposes this critical problem of vision-based detectors. The performance of standard object detectors trained on large-scale autonomous driving datasets significantly drops to no more than 12.8% in mAR. We experiment with the state-of-the-art open-world object detector and find that it also fails to reliably identify the novel objects in CODA.
arXiv Detail & Related papers (2022-03-15T08:32:56Z)
One Million Scenes for Autonomous Driving: ONCE Dataset [91.94189514073354]
We introduce the ONCE dataset for 3D object detection in the autonomous driving scenario. The data is selected from 144 driving hours, which is 20x longer than the largest 3D autonomous driving dataset available. We reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset.
arXiv Detail & Related papers (2021-06-21T12:28:08Z)
ROAD: The ROad event Awareness Dataset for Autonomous Driving [16.24547478826027]
ROAD is designed to test an autonomous vehicle's ability to detect road events. It comprises 22 videos, annotated with bounding boxes showing the location in the image plane of each road event. We also provide as baseline a new incremental algorithm for online road event awareness, based on RetinaNet along time.
arXiv Detail & Related papers (2021-02-23T09:48:56Z)
Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes. We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way. We show competitive detection and attribute recognition results, as well as a more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z)
