XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis
- URL: http://arxiv.org/abs/2406.18360v3
- Date: Wed, 07 May 2025 15:25:04 GMT
- Title: XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis
- Authors: Hao Li, Chenming Wu, Ming Yuan, Yan Zhang, Chen Zhao, Chunyu Song, Haocheng Feng, Errui Ding, Dingwen Zhang, Jingdong Wang
- Abstract summary: This paper presents a synthetic dataset for novel driving view synthesis evaluation. It includes testing images captured by deviating from the training trajectory by $1-4$ meters. We establish the first realistic benchmark for evaluating existing NVS approaches under front-only and multi-camera settings.
- Score: 84.23233209017192
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Comprehensive testing of autonomous systems through simulation is essential to ensure the safety of autonomous driving vehicles. This requires generating safety-critical scenarios beyond what real-world data collection can provide, as many such scenarios are rarely encountered on public roads. However, most existing novel view synthesis (NVS) methods are evaluated by sporadically sampling image frames from the training data and comparing the rendered images with ground-truth images. Unfortunately, this evaluation protocol falls short of the actual requirements of closed-loop simulation, which demands the ability to render novel views beyond the original trajectory (such as cross-lane views) that are challenging to capture in the real world. To address this, this paper presents a synthetic dataset for novel driving view synthesis evaluation, specifically designed for autonomous driving simulation. The dataset includes testing images captured by deviating from the training trajectory by $1-4$ meters. It comprises six sequences covering various times of day and weather conditions, each containing $450$ training images, $120$ testing images, and the corresponding camera poses and intrinsic parameters. Leveraging this dataset, we establish the first realistic benchmark for evaluating existing NVS approaches under front-only and multi-camera settings. The experimental findings underscore a significant gap in current approaches, revealing their inadequate ability to fulfill the demanding prerequisites of cross-lane and closed-loop simulation.
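The evaluation protocol described in the abstract is easy to make concrete. Below is a minimal, hypothetical sketch (not the authors' released code): the pose-shifting helper, the dummy `render` function, and the random image data are illustrative stand-ins; only the $1-4$ m lateral offsets and the comparison of rendered views against held-out ground truth follow the paper's setup.

```python
import numpy as np

def shift_pose_laterally(c2w: np.ndarray, offset_m: float) -> np.ndarray:
    """Translate a 4x4 camera-to-world pose along the camera's right axis,
    emulating a cross-lane deviation from the training trajectory."""
    shifted = c2w.copy()
    shifted[:3, 3] += c2w[:3, 0] * offset_m  # column 0 is the camera's right vector
    return shifted

def psnr(rendered: np.ndarray, gt: np.ndarray) -> float:
    """Peak signal-to-noise ratio between two images with values in [0, 1]."""
    mse = float(np.mean((rendered - gt) ** 2))
    return float("inf") if mse == 0.0 else float(-10.0 * np.log10(mse))

# Dummy stand-ins for a trained NVS model and the dataset's test split.
rng = np.random.default_rng(0)
train_pose = np.eye(4)                # one pose from the training trajectory
gt_image = rng.random((270, 480, 3))  # ground-truth test image (placeholder)

def render(pose: np.ndarray) -> np.ndarray:
    # A real benchmark run would call the trained NVS model here.
    return rng.random((270, 480, 3))

for offset_m in (1.0, 2.0, 3.0, 4.0):  # lateral deviations used by the benchmark
    test_pose = shift_pose_laterally(train_pose, offset_m)
    score = psnr(render(test_pose), gt_image)
    print(f"{offset_m:.0f} m deviation: PSNR = {score:.2f} dB")
```

The key point of the benchmark is that rendered views are scored against ground truth captured off the training trajectory, rather than against held-out frames sampled from it.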
Related papers
- Para-Lane: Multi-Lane Dataset Registering Parallel Scans for Benchmarking Novel View Synthesis [5.281171924360707]
We present the first multi-lane dataset registering parallel scans for novel driving view synthesis, derived from real-world scans.
The dataset consists of 25 groups of associated sequences, including 16,000 front-view images, 64,000 surround-view images, and 16,000 LiDAR frames.
We evaluate the performance of existing approaches in various testing scenarios across different lanes and distances.
arXiv Detail & Related papers (2025-02-21T18:03:56Z) - Extrapolated Urban View Synthesis Benchmark [53.657271730352214]
Photorealistic simulators are essential for the training and evaluation of vision-centric autonomous vehicles (AVs).
At their core is Novel View Synthesis (NVS), a capability that generates diverse unseen viewpoints to accommodate the broad and continuous pose distribution of AVs.
Recent advances in radiance fields, such as 3D Gaussian Splatting, achieve photorealistic rendering at real-time speeds and have been widely used in modeling large-scale driving scenes.
We will release the data to help advance self-driving and urban robotics simulation technology.
arXiv Detail & Related papers (2024-12-06T18:41:39Z) - DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation [54.02069690134526]
We propose DrivingSphere, a realistic and closed-loop simulation framework.
Its core idea is to build a 4D world representation and generate realistic, controllable driving scenarios.
By providing a dynamic and realistic simulation environment, DrivingSphere enables comprehensive testing and validation of autonomous driving algorithms.
arXiv Detail & Related papers (2024-11-18T03:00:33Z) - Learning autonomous driving from aerial imagery [67.06858775696453]
Photogrammetric simulators allow the synthesis of novel views through the transformation of pre-generated assets.
We use a Neural Radiance Field (NeRF) as an intermediate representation to synthesize novel views from the point of view of a ground vehicle.
arXiv Detail & Related papers (2024-10-18T05:09:07Z) - LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond [37.47964043913622]
We introduce LoLI-Street (Low-Light Images of Streets), a new dataset of 33k paired low-light and well-exposed images from street scenes in developed cities.
The LoLI-Street dataset also features 1,000 real low-light test images for testing LLIE models under real-life conditions.
arXiv Detail & Related papers (2024-10-13T13:11:56Z) - Querying Labeled Time Series Data with Scenario Programs [0.0]
We propose a formal definition of what constitutes a match between a real-world labeled time series data item and a simulated scenario.
We present a matching definition and algorithm that scale beyond the autonomous vehicle domain.
arXiv Detail & Related papers (2024-06-25T15:15:27Z) - NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking [65.24988062003096]
We present NAVSIM, a framework for benchmarking vision-based driving policies.
Our simulation is non-reactive, i.e., the evaluated policy and environment do not influence each other.
NAVSIM enabled a new competition held at CVPR 2024, where 143 teams submitted 463 entries, resulting in several new insights.
arXiv Detail & Related papers (2024-06-21T17:59:02Z) - Exploring Generative AI for Sim2Real in Driving Data Synthesis [6.769182994217369]
Driving simulators offer a solution by automatically generating various driving scenarios with corresponding annotations, but the simulation-to-reality (Sim2Real) domain gap remains a challenge.
This paper applies three different generative AI methods to leverage semantic label maps from a driving simulator as a bridge for creating realistic datasets.
Experiments show that although GAN-based methods are adept at generating high-quality images when provided with manually annotated labels, ControlNet produces synthetic datasets with fewer artefacts and more structural fidelity when using simulator-generated labels.
arXiv Detail & Related papers (2024-04-14T01:23:19Z) - Street-View Image Generation from a Bird's-Eye View Layout [95.36869800896335]
Bird's-Eye View (BEV) Perception has received increasing attention in recent years.
Data-driven simulation for autonomous driving has been a focal point of recent research.
We propose BEVGen, a conditional generative model that synthesizes realistic and spatially consistent surrounding images.
arXiv Detail & Related papers (2023-01-11T18:39:34Z) - Cross-Camera Trajectories Help Person Retrieval in a Camera Network [124.65912458467643]
Existing methods often rely on purely visual matching or consider temporal constraints but ignore the spatial information of the camera network.
We propose a pedestrian retrieval framework based on cross-camera generation, which integrates both temporal and spatial information.
To verify the effectiveness of our method, we construct the first cross-camera pedestrian trajectory dataset.
arXiv Detail & Related papers (2022-04-27T13:10:48Z) - A Multi-Layered Approach for Measuring the Simulation-to-Reality Gap of Radar Perception for Autonomous Driving [0.0]
In order to rely on virtual tests, the employed sensor models have to be validated.
There exists no sound method to measure this simulation-to-reality gap of radar perception.
We show the effectiveness of the proposed approach in providing an in-depth sensor model assessment.
arXiv Detail & Related papers (2021-06-15T18:51:39Z) - Online Clustering-based Multi-Camera Vehicle Tracking in Scenarios with overlapping FOVs [2.6365690297272617]
Multi-Target Multi-Camera (MTMC) vehicle tracking is an essential task in visual traffic monitoring.
We present a new low-latency online approach for MTMC tracking in scenarios with partially overlapping fields of view.
arXiv Detail & Related papers (2021-02-08T09:55:55Z) - Vehicle Position Estimation with Aerial Imagery from Unmanned Aerial Vehicles [4.555256739812733]
This work describes a process to estimate a precise vehicle position from aerial imagery.
The state-of-the-art deep neural network Mask R-CNN is applied for that purpose.
A mean accuracy of 20 cm can be achieved with flight altitudes up to 100 m, Full-HD resolution, and frame-by-frame detection.
arXiv Detail & Related papers (2020-04-17T12:29:40Z) - SimAug: Learning Robust Representations from Simulation for Trajectory Prediction [78.91518036949918]
We propose a novel approach to learning robust representations by augmenting the simulation training data.
We show that SimAug achieves promising results on three real-world benchmarks using zero real training data.
arXiv Detail & Related papers (2020-04-04T21:22:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.