ROVR-Open-Dataset: A Large-Scale Depth Dataset for Autonomous Driving
- URL: http://arxiv.org/abs/2508.13977v1
- Date: Tue, 19 Aug 2025 16:13:49 GMT
- Title: ROVR-Open-Dataset: A Large-Scale Depth Dataset for Autonomous Driving
- Authors: Xianda Guo, Ruijun Zhang, Yiqun Duan, Ruilin Wang, Keyuan Zhou, Wenzhao Zheng, Wenke Huang, Gangwei Xu, Mike Horton, Yuan Si, Hao Zhao, Long Chen
- Abstract summary: We introduce a large-scale, diverse, frame-wise continuous dataset for depth estimation in dynamic outdoor driving environments. Compared to existing datasets, ours presents greater diversity in driving scenarios and lower depth density, creating new challenges for generalization.
- Score: 16.84661057744478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth estimation is a fundamental task for 3D scene understanding in autonomous driving, robotics, and augmented reality. Existing depth datasets, such as KITTI, nuScenes, and DDAD, have advanced the field but suffer from limitations in diversity and scalability. As benchmark performance on these datasets approaches saturation, there is an increasing need for a new generation of large-scale, diverse, and cost-efficient datasets to support the era of foundation models and multi-modal learning. To address these challenges, we introduce a large-scale, diverse, frame-wise continuous dataset for depth estimation in dynamic outdoor driving environments, comprising 20K video frames to evaluate existing methods. Our lightweight acquisition pipeline ensures broad scene coverage at low cost, while sparse yet statistically sufficient ground truth enables robust training. Compared to existing datasets, ours presents greater diversity in driving scenarios and lower depth density, creating new challenges for generalization. Benchmark experiments with standard monocular depth estimation models validate the dataset's utility and highlight substantial performance gaps in challenging conditions, establishing a new platform for advancing depth estimation research.
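The "sparse yet statistically sufficient ground truth" claim comes down to how evaluation is done: standard monocular depth metrics are computed only over the pixels where ground-truth depth exists. A minimal NumPy sketch of that masking scheme (the function name, depth bounds, and example values are illustrative, not taken from the paper):

```python
import numpy as np

def sparse_depth_metrics(pred, gt, min_depth=1e-3, max_depth=80.0):
    """Compute standard monocular depth metrics over valid (sparse) GT pixels.

    pred, gt: arrays of per-pixel depth in meters; invalid GT pixels are <= 0.
    """
    # Sparse ground truth: evaluate only where a depth measurement exists.
    mask = (gt > min_depth) & (gt < max_depth)
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)            # mean absolute relative error
    rmse = np.sqrt(np.mean((p - g) ** 2))           # root mean squared error
    ratio = np.maximum(p / g, g / p)
    delta1 = np.mean(ratio < 1.25)                  # fraction within 25% of GT
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}
```

With sufficiently many valid pixels per frame, these masked statistics remain stable estimators of the dense error, which is what makes low-density ground truth usable for benchmarking.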
Related papers
- Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method [54.461213497603154]
Occupancy-centric methods have recently achieved state-of-the-art results by offering consistent conditioning across frames and modalities. Nuplan-Occ is the largest occupancy dataset to date, constructed from the widely used Nuplan benchmark. We develop a unified framework that jointly synthesizes high-quality occupancy, multi-view videos, and LiDAR point clouds.
arXiv Detail & Related papers (2025-10-27T03:52:45Z) - UnLoc: Leveraging Depth Uncertainties for Floorplan Localization [80.55849461031879]
UnLoc is an efficient data-driven solution for sequential camera localization within floorplans. We introduce a novel probabilistic model that incorporates uncertainty estimation, modeling depth predictions as explicit probability distributions. We evaluate UnLoc on large-scale synthetic and real-world datasets, demonstrating significant improvements in terms of accuracy and robustness.
arXiv Detail & Related papers (2025-09-14T14:45:43Z) - Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation [75.30238170051291]
Depth estimation is a fundamental task in 3D computer vision, crucial for applications such as 3D reconstruction, free-viewpoint rendering, robotics, autonomous driving, and AR/VR technologies. Traditional methods relying on hardware sensors like LiDAR are often limited by high costs, low resolution, and environmental sensitivity, limiting their applicability in real-world scenarios. Recent advances in vision-based methods offer a promising alternative, yet they face challenges in generalization and stability due to either low-capacity model architectures or reliance on domain-specific and small-scale datasets.
arXiv Detail & Related papers (2025-07-15T17:59:59Z) - Depth as Points: Center Point-based Depth Estimation [25.930620717806914]
We develop a method for creating task- and scenario-specific datasets in a short time. We construct the virtual depth estimation dataset VirDepth, a large-scale, multi-task autonomous driving dataset. We also propose CenterDepth, a lightweight architecture for monocular depth estimation.
arXiv Detail & Related papers (2025-04-26T03:04:05Z) - PFSD: A Multi-Modal Pedestrian-Focus Scene Dataset for Rich Tasks in Semi-Structured Environments [73.80718037070773]
We present the multi-modal Pedestrian-Focused Scene dataset, rigorously annotated in semi-structured scenes with the format of nuScenes. We also propose a novel Hybrid Multi-Scale Fusion Network (HMFN) to detect pedestrians in densely populated and occluded scenarios.
arXiv Detail & Related papers (2025-02-21T09:57:53Z) - Leveraging Stable Diffusion for Monocular Depth Estimation via Image Semantic Encoding [1.0445560141983634]
We propose a novel image-based semantic embedding that extracts contextual information directly from visual features. Our method achieves performance comparable to state-of-the-art models while addressing the shortcomings of CLIP embeddings in handling outdoor scenes.
arXiv Detail & Related papers (2025-02-01T15:37:22Z) - Real-time Multi-view Omnidirectional Depth Estimation for Real Scenarios based on Teacher-Student Learning with Unlabeled Data [13.107135855680992]
We propose a real-time omnidirectional depth estimation method for edge computing platforms named Rt-OmniMVS. To achieve high accuracy, robustness, and generalization in real-world environments, we introduce a teacher-student learning strategy. We also propose HexaMODE, an omnidirectional depth sensing system based on multi-view fisheye cameras and an edge device.
arXiv Detail & Related papers (2024-09-12T08:44:35Z) - UdeerLID+: Integrating LiDAR, Image, and Relative Depth with Semi-Supervised [12.440461420762265]
Road segmentation is a critical task for autonomous driving systems, and one of its primary challenges is the scarcity of large-scale, accurately labeled datasets.
Our work introduces an innovative approach that integrates LiDAR point cloud data, visual images, and relative depth maps.
arXiv Detail & Related papers (2024-09-10T03:57:30Z) - PLT-D3: A High-fidelity Dynamic Driving Simulation Dataset for Stereo Depth and Scene Flow [0.0]
This paper introduces the Dynamic-weather Driving dataset, a high-fidelity stereo depth and scene flow ground-truth dataset generated using Unreal Engine 5.
In particular, this dataset includes synchronized high-resolution stereo image sequences that replicate a wide array of dynamic weather scenarios.
Benchmarks have been established for several critical autonomous driving tasks using PLT-D3 to measure and enhance the performance of state-of-the-art models.
arXiv Detail & Related papers (2024-06-11T19:21:46Z) - DINO-SD: Champion Solution for ICRA 2024 RoboDepth Challenge [54.71866583204417]
In this report, we introduce DINO-SD, a novel surround-view depth estimation model.
DINO-SD requires no additional data and exhibits strong robustness.
DINO-SD achieves the best performance in track 4 of the ICRA 2024 RoboDepth Challenge.
arXiv Detail & Related papers (2024-05-27T12:21:31Z) - RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving [67.09546127265034]
Road surface reconstruction helps to enhance the analysis and prediction of vehicle responses for motion planning and control systems.
We introduce the Road Surface Reconstruction dataset, a real-world, high-resolution, and high-precision dataset collected with a specialized platform in diverse driving conditions.
It covers common road types containing approximately 16,000 pairs of stereo images, original point clouds, and ground-truth depth/disparity maps.
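Pairing stereo images with both disparity and depth ground truth works because the two are linked by the pinhole stereo relation depth = focal_length × baseline / disparity. A small sketch of that conversion (the camera parameters in the example are made-up placeholders, not RSRD's actual calibration):

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a stereo disparity map (pixels) to metric depth (meters).

    Pinhole stereo model: depth = f * B / d; zero disparity is left invalid (0).
    """
    depth = np.zeros_like(disparity, dtype=float)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth
```

Note the reciprocal relationship: small disparities correspond to distant points, so a fixed sub-pixel disparity error translates into a depth error that grows with distance.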
arXiv Detail & Related papers (2023-10-03T17:59:32Z) - LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting [65.71129509623587]
Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning.
However, the promising results achieved on current public datasets may not be applicable to practical scenarios.
We introduce the LargeST benchmark dataset, which includes a total of 8,600 sensors in California with a 5-year time coverage.
arXiv Detail & Related papers (2023-06-14T05:48:36Z) - RELLIS-3D Dataset: Data, Benchmarks and Analysis [16.803548871633957]
RELLIS-3D is a multimodal dataset collected in an off-road environment.
The data was collected on the Rellis Campus of Texas A&M University.
arXiv Detail & Related papers (2020-11-17T18:28:01Z) - Exploring the Impacts from Datasets to Monocular Depth Estimation (MDE) Models with MineNavi [5.689127984415125]
Current computer vision tasks based on deep learning require huge amounts of annotated data for model training and testing.
In practice, manual labeling for dense estimation tasks is very difficult or even impossible, and dataset scenes are often restricted to a small range.
We propose a synthetic dataset generation method that yields an expandable dataset without burdensome manual labeling effort.
arXiv Detail & Related papers (2020-08-19T14:03:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.