XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis
- URL: http://arxiv.org/abs/2406.18360v2
- Date: Thu, 27 Jun 2024 02:11:44 GMT
- Title: XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis
- Authors: Hao Li, Ming Yuan, Yan Zhang, Chenming Wu, Chen Zhao, Chunyu Song, Haocheng Feng, Errui Ding, Dingwen Zhang, Jingdong Wang,
- Abstract summary: This paper presents a novel driving view synthesis dataset and benchmark specifically designed for autonomous driving simulations.
The dataset is unique as it includes testing images captured by deviating from the training trajectory by 1-4 meters.
We establish the first realistic benchmark for evaluating existing NVS approaches under front-only and multi-camera settings.
- Score: 84.23233209017192
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Thoroughly testing autonomy systems is crucial in the pursuit of safe autonomous driving vehicles. It necessitates creating safety-critical scenarios that go beyond what can be safely collected from real-world data, as many of these scenarios occur infrequently on public roads. However, the evaluation of most existing NVS methods relies on sporadic sampling of image frames from the training data, comparing the rendered images with ground truth images using metrics. Unfortunately, this evaluation protocol falls short of meeting the actual requirements in closed-loop simulations. Specifically, the true application demands the capability to render novel views that extend beyond the original trajectory (such as cross-lane views), which are challenging to capture in the real world. To address this, this paper presents a novel driving view synthesis dataset and benchmark specifically designed for autonomous driving simulations. This dataset is unique as it includes testing images captured by deviating from the training trajectory by 1-4 meters. It comprises six sequences encompassing various time and weather conditions. Each sequence contains 450 training images, 150 testing images, and their corresponding camera poses and intrinsic parameters. Leveraging this novel dataset, we establish the first realistic benchmark for evaluating existing NVS approaches under front-only and multi-camera settings. The experimental findings underscore the significant gap that exists in current approaches, revealing their inadequate ability to fulfill the demanding prerequisites of cross-lane or closed-loop simulation. Our dataset is released publicly at the project page: https://3d-aigc.github.io/XLD/.
Related papers
- LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond [37.47964043913622]
We introduce a new dataset LoLI-Street (Low-Light Images of Streets) with 33k paired low-light and well-exposed images from street scenes in developed cities.
LoLI-Street dataset also features 1,000 real low-light test images for testing LLIE models under real-life conditions.
arXiv Detail & Related papers (2024-10-13T13:11:56Z) - Querying Labeled Time Series Data with Scenario Programs [0.0]
We propose a formal definition of what constitutes a match between a real-world labeled time series data item and a simulated scenario.
We present a definition and algorithm for matching scalable beyond the autonomous vehicles domain.
arXiv Detail & Related papers (2024-06-25T15:15:27Z) - NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking [65.24988062003096]
We present NAVSIM, a framework for benchmarking vision-based driving policies.
Our simulation is non-reactive, i.e., the evaluated policy and environment do not influence each other.
NAVSIM enabled a new competition held at CVPR 2024, where 143 teams submitted 463 entries, resulting in several new insights.
arXiv Detail & Related papers (2024-06-21T17:59:02Z) - UniSim: A Neural Closed-Loop Sensor Simulator [76.79818601389992]
We present UniSim, a neural sensor simulator that takes a single recorded log captured by a sensor-equipped vehicle.
UniSim builds neural feature grids to reconstruct both the static background and dynamic actors in the scene.
We incorporate learnable priors for dynamic objects, and leverage a convolutional network to complete unseen regions.
arXiv Detail & Related papers (2023-08-03T17:56:06Z) - Street-View Image Generation from a Bird's-Eye View Layout [95.36869800896335]
Bird's-Eye View (BEV) Perception has received increasing attention in recent years.
Data-driven simulation for autonomous driving has been a focal point of recent research.
We propose BEVGen, a conditional generative model that synthesizes realistic and spatially consistent surrounding images.
arXiv Detail & Related papers (2023-01-11T18:39:34Z) - Cross-Camera Trajectories Help Person Retrieval in a Camera Network [124.65912458467643]
Existing methods often rely on purely visual matching or consider temporal constraints but ignore the spatial information of the camera network.
We propose a pedestrian retrieval framework based on cross-camera generation, which integrates both temporal and spatial information.
To verify the effectiveness of our method, we construct the first cross-camera pedestrian trajectory dataset.
arXiv Detail & Related papers (2022-04-27T13:10:48Z) - Online Clustering-based Multi-Camera Vehicle Tracking in Scenarios with
overlapping FOVs [2.6365690297272617]
Multi-Target Multi-Camera (MTMC) vehicle tracking is an essential task of visual traffic monitoring.
We present a new low-latency online approach for MTMC tracking in scenarios with partially overlapping fields of view.
arXiv Detail & Related papers (2021-02-08T09:55:55Z) - Vehicle Position Estimation with Aerial Imagery from Unmanned Aerial
Vehicles [4.555256739812733]
This work describes a process to estimate a precise vehicle position from aerial imagery.
The state-of-the-art deep neural network Mask-RCNN is applied for that purpose.
A mean accuracy of 20 cm can be achieved with flight altitudes up to 100 m, Full-HD resolution and a frame-by-frame detection.
arXiv Detail & Related papers (2020-04-17T12:29:40Z) - SimAug: Learning Robust Representations from Simulation for Trajectory
Prediction [78.91518036949918]
We propose a novel approach to learn robust representation through augmenting the simulation training data.
We show that SimAug achieves promising results on three real-world benchmarks using zero real training data.
arXiv Detail & Related papers (2020-04-04T21:22:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.