RSUD20K: A Dataset for Road Scene Understanding In Autonomous Driving
- URL: http://arxiv.org/abs/2401.07322v2
- Date: Fri, 9 Feb 2024 23:06:03 GMT
- Title: RSUD20K: A Dataset for Road Scene Understanding In Autonomous Driving
- Authors: Hasib Zunair, Shakib Khan, and A. Ben Hamza
- Abstract summary: RSUD20K is a new dataset for road scene understanding, comprised of over 20K high-resolution images from the driving perspective on Bangladesh roads.
Our work significantly improves upon previous efforts, providing detailed annotations and increased object complexity.
- Score: 6.372000468173298
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Road scene understanding is crucial in autonomous driving, enabling machines
to perceive the visual environment. However, recent object detectors tailored
for learning on datasets collected from certain geographical locations struggle
to generalize across different locations. In this paper, we present RSUD20K, a
new dataset for road scene understanding comprising over 20K high-resolution
images from the driving perspective on Bangladesh roads and including 130K
bounding box annotations for 13 objects. This challenging dataset encompasses
diverse road scenes, narrow streets and highways, featuring objects from
different viewpoints and scenes from crowded environments with densely
cluttered objects and various weather conditions. Our work significantly
improves upon previous efforts, providing detailed annotations and increased
object complexity. We thoroughly examine the dataset, benchmarking various
state-of-the-art object detectors and exploring large vision models as image
annotators.
Related papers
- ROAD-Waymo: Action Awareness at Scale for Autonomous Driving [17.531603453254434]
ROAD-Waymo is an extensive dataset for the development and benchmarking of techniques for agent, action, location and event detection in road scenes.
Considerably larger and more challenging than any existing dataset (and encompassing multiple cities), it comes with 198k annotated video frames, 54k agent tubes, 3.9M bounding boxes and a total of 12.4M labels.
arXiv Detail & Related papers (2024-11-03T20:46:50Z)
- RoboSense: Large-scale Dataset and Benchmark for Multi-sensor Low-speed Autonomous Driving [62.5830455357187]
In this paper, we construct a multimodal data collection platform based on 3 main types of sensors (Camera, LiDAR and Fisheye).
A large-scale multi-sensor dataset is built, named RoboSense, to facilitate near-field scene understanding.
RoboSense contains more than 133K synchronized data with 1.4M 3D bounding boxes and IDs in the full $360^\circ$ view, forming 216K trajectories across 7.6K temporal sequences.
arXiv Detail & Related papers (2024-08-28T03:17:40Z)
- WayveScenes101: A Dataset and Benchmark for Novel View Synthesis in Autonomous Driving [4.911903454560829]
WayveScenes101 is a dataset designed to help the community advance the state of the art in novel view synthesis.
The dataset comprises 101 driving scenes across a wide range of environmental conditions and driving scenarios.
arXiv Detail & Related papers (2024-07-11T08:29:45Z)
- IDD-X: A Multi-View Dataset for Ego-relative Important Object Localization and Explanation in Dense and Unstructured Traffic [35.23523738296173]
We present IDD-X, a large-scale dual-view driving video dataset.
With 697K bounding boxes, 9K important object tracks, and 1-12 objects per video, IDD-X offers comprehensive ego-relative annotations for multiple important road objects.
We also introduce custom-designed deep networks aimed at multiple important object localization and per-object explanation prediction.
arXiv Detail & Related papers (2024-04-12T16:00:03Z)
- Habitat Synthetic Scenes Dataset (HSSD-200): An Analysis of 3D Scene Scale and Realism Tradeoffs for ObjectGoal Navigation [70.82403156865057]
We investigate the impact of synthetic 3D scene dataset scale and realism on the task of training embodied agents to find and navigate to objects.
Our experiments show that agents trained on our smaller-scale dataset can match or outperform agents trained on much larger datasets.
arXiv Detail & Related papers (2023-06-20T05:07:23Z)
- Street-View Image Generation from a Bird's-Eye View Layout [95.36869800896335]
Bird's-Eye View (BEV) Perception has received increasing attention in recent years.
Data-driven simulation for autonomous driving has been a focal point of recent research.
We propose BEVGen, a conditional generative model that synthesizes realistic and spatially consistent surrounding images.
arXiv Detail & Related papers (2023-01-11T18:39:34Z)
- Ithaca365: Dataset and Driving Perception under Repeated and Challenging Weather Conditions [0.0]
We present a new dataset to enable robust autonomous driving via a novel data collection process.
The dataset includes images and point clouds from cameras and LiDAR sensors, along with high-precision GPS/INS.
We demonstrate the uniqueness of this dataset by analyzing the performance of baselines in amodal segmentation of road and objects.
arXiv Detail & Related papers (2022-08-01T22:55:32Z)
- Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task [48.555440807415664]
We present Rope3D, the first high-diversity, challenging roadside perception 3D dataset from a novel view.
The dataset consists of 50k images and over 1.5M 3D objects in various scenes.
We propose to leverage the geometry constraint to solve the inherent ambiguities caused by various sensors and viewpoints.
arXiv Detail & Related papers (2022-03-25T12:13:23Z)
- Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images [128.881857704338]
We study the problem of extracting a directed graph representing the local road network in BEV coordinates, from a single onboard camera image.
We show that the method can be extended to detect dynamic objects on the BEV plane.
We validate our approach against powerful baselines and show that our network achieves superior performance.
arXiv Detail & Related papers (2021-10-05T12:40:33Z)
- METEOR: A Massive Dense & Heterogeneous Behavior Dataset for Autonomous Driving [42.69638782267657]
We present a new and complex traffic dataset, METEOR, which captures traffic patterns in unstructured scenarios in India.
METEOR consists of more than 1000 one-minute video clips, over 2 million annotated frames with ego-vehicle trajectories, and more than 13 million bounding boxes for surrounding vehicles or traffic agents.
We use our novel dataset to evaluate the performance of object detection and behavior prediction algorithms.
arXiv Detail & Related papers (2021-09-16T01:01:55Z)
- Concealed Object Detection [140.98738087261887]
We present the first systematic study on concealed object detection (COD).
COD aims to identify objects that are "perfectly" embedded in their background.
To better understand this task, we collect a large-scale dataset called COD10K.
arXiv Detail & Related papers (2021-02-20T06:49:53Z)