3D Data Augmentation for Driving Scenes on Camera
- URL: http://arxiv.org/abs/2303.10340v1
- Date: Sat, 18 Mar 2023 05:51:05 GMT
- Title: 3D Data Augmentation for Driving Scenes on Camera
- Authors: Wenwen Tong, Jiangwei Xie, Tianyu Li, Hanming Deng, Xiangwei Geng,
Ruoyi Zhou, Dingchen Yang, Bo Dai, Lewei Lu, Hongyang Li
- Abstract summary: We propose a 3D data augmentation approach termed Drive-3DAug, aiming at augmenting the driving scenes on camera in the 3D space.
We first utilize Neural Radiance Field (NeRF) to reconstruct the 3D models of background and foreground objects.
Then, augmented driving scenes can be obtained by placing the 3D objects, with adapted locations and orientations, in pre-defined valid regions of the backgrounds.
- Score: 50.41413053812315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Driving scenes are so diverse and complicated that it is impossible
to collect all cases with human effort alone. While data augmentation is an
effective technique to enrich the training data, existing methods for camera
data in autonomous driving applications are confined to the 2D image plane,
which may not optimally increase data diversity in 3D real-world scenarios. To
this end, we propose a 3D data augmentation approach termed Drive-3DAug, aiming
to augment the driving scenes on camera in the 3D space. We first utilize
Neural Radiance Field (NeRF) to reconstruct the 3D models of background and
foreground objects. Then, augmented driving scenes can be obtained by placing
the 3D objects, with adapted locations and orientations, in pre-defined valid
regions of the backgrounds. As such, the training database can be effectively
scaled up. However, the 3D object modeling is constrained by image quality
and the limited viewpoints. To overcome these problems, we modify the original
NeRF by introducing a geometric rectified loss and a symmetric-aware training
strategy. We evaluate our method on the camera-only monocular 3D detection
task on the Waymo and nuScenes datasets. The proposed data augmentation
approach yields gains of 1.7% and 1.4% in detection accuracy on Waymo and
nuScenes, respectively. Furthermore, the constructed 3D models serve as
digital driving assets and can be recycled for different detectors or other
3D perception tasks.
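The placement step in the abstract can be made concrete with a short sketch. The following is a minimal illustration, not the authors' released code: the NeRF renderer interface (a render() method returning an (rgb, depth) pair), the valid_region format, and the object size attribute are all assumptions made for exposition, with occlusion handled by simple depth-aware compositing.

```python
import random
import numpy as np

def sample_pose_in_valid_region(valid_region, rng=random):
    """Sample an (x, y, yaw) placement inside a pre-defined valid
    region of the background (e.g., a patch of drivable road)."""
    (x_min, x_max), (y_min, y_max) = valid_region
    x = rng.uniform(x_min, x_max)      # adapted location
    y = rng.uniform(y_min, y_max)
    yaw = rng.uniform(-np.pi, np.pi)   # adapted orientation
    return x, y, yaw

def augment_scene(background_nerf, object_nerfs, camera, valid_region,
                  num_objects=2, rng=random):
    """Compose one augmented training frame from reconstructed 3D models.

    `background_nerf` and each entry of `object_nerfs` are assumed to
    expose a render() method returning an (rgb, depth) image pair for a
    given camera; these interfaces are placeholders, not a real API.
    """
    rgb, depth = background_nerf.render(camera)
    boxes = []  # new 3D box labels for the inserted objects
    for obj in rng.sample(object_nerfs, num_objects):
        x, y, yaw = sample_pose_in_valid_region(valid_region, rng)
        obj_rgb, obj_depth = obj.render(camera, position=(x, y), yaw=yaw)
        # Depth-aware compositing: the inserted object only overwrites
        # pixels where it is closer to the camera than the background,
        # so occlusions stay geometrically consistent.
        mask = obj_depth < depth
        rgb[mask] = obj_rgb[mask]
        depth[mask] = obj_depth[mask]
        boxes.append({"center": (x, y), "yaw": yaw,
                      "size": obj.size})  # size attribute assumed
    return rgb, boxes
```

Compositing by depth is one plausible way to realize the valid-region placement the abstract describes: an inserted car, for example, is correctly occluded by a foreground pole rather than pasted on top of it.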
Related papers
- HeightFormer: A Semantic Alignment Monocular 3D Object Detection Method from Roadside Perspective [11.841338298700421]
We propose a novel 3D object detection framework integrating Spatial Former and Voxel Pooling Former to enhance 2D-to-3D projection based on height estimation.
Experiments were conducted using the Rope3D and DAIR-V2X-I datasets, and the results demonstrated the superior performance of the proposed algorithm in detecting both vehicles and cyclists.
arXiv Detail & Related papers (2024-10-10T09:37:33Z)
- Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes [65.22070581594426]
"Implicit-Zoo" is a large-scale dataset, built with thousands of GPU training days, to facilitate research and development in this field.
We showcase two immediate benefits, as it enables us to: (1) learn token locations for transformer models; (2) directly regress 3D camera poses of 2D images with respect to NeRF models.
This in turn leads to improved performance in all three tasks of image classification, semantic segmentation, and 3D pose regression, thereby unlocking new avenues for research.
arXiv Detail & Related papers (2024-06-25T10:20:44Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- AutoDecoding Latent 3D Diffusion Models [95.7279510847827]
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
The 3D autodecoder framework embeds properties learned from the target dataset in the latent space.
We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations.
arXiv Detail & Related papers (2023-07-07T17:59:14Z)
- Aerial Monocular 3D Object Detection [46.26215100532241]
This work proposes a dual-view detection system named DVDET to achieve aerial monocular object detection in both the 2D image space and the 3D physical space.
To address the dataset challenge, we propose a new large-scale simulation dataset named AM3D-Sim, generated by the co-simulation of AirSim and CARLA, and a new real-world aerial dataset named AM3D-Real, collected by a DJI Matrice 300 RTK.
arXiv Detail & Related papers (2022-08-08T08:32:56Z)
- ONCE-3DLanes: Building Monocular 3D Lane Detection [41.46466150783367]
We present ONCE-3DLanes, a real-world autonomous driving dataset with lane layout annotation in 3D space.
By exploiting the explicit relationship between point clouds and image pixels, a dataset annotation pipeline is designed to automatically generate high-quality 3D lane locations.
arXiv Detail & Related papers (2022-04-30T16:35:25Z)
- Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data [80.14669385741202]
We propose a self-supervised pre-training method for 3D perception models tailored to autonomous driving data.
We leverage the availability of synchronized and calibrated image and Lidar sensors in autonomous driving setups.
Our method does not require any point cloud or image annotations.
arXiv Detail & Related papers (2022-03-30T12:40:30Z)
- CenterLoc3D: Monocular 3D Vehicle Localization Network for Roadside Surveillance Cameras [0.0]
We propose a 3D vehicle localization network CenterLoc3D for roadside monocular cameras.
The proposed method achieves high accuracy and real-time performance.
arXiv Detail & Related papers (2022-03-28T07:47:37Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.