VirtualPainting: Addressing Sparsity with Virtual Points and
Distance-Aware Data Augmentation for 3D Object Detection
- URL: http://arxiv.org/abs/2312.16141v1
- Date: Tue, 26 Dec 2023 18:03:05 GMT
- Title: VirtualPainting: Addressing Sparsity with Virtual Points and
Distance-Aware Data Augmentation for 3D Object Detection
- Authors: Sudip Dhakal, Dominic Carrillo, Deyuan Qu, Michael Nutt, Qing Yang,
Song Fu
- Abstract summary: We present an innovative approach that involves the generation of virtual LiDAR points using camera images.
We also enhance these virtual points with semantic labels obtained from image-based segmentation networks.
Our approach offers a versatile solution that can be seamlessly integrated into various 3D frameworks and 2D semantic segmentation methods.
- Score: 3.5259183508202976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent times, there has been a notable surge in multimodal approaches that
decorates raw LiDAR point clouds with camera-derived features to improve object
detection performance. However, we found that these methods still grapple with
the inherent sparsity of LiDAR point cloud data, primarily because fewer points
are enriched with camera-derived features for sparsely distributed objects. We
present an innovative approach that involves the generation of virtual LiDAR
points using camera images and enhancing these virtual points with semantic
labels obtained from image-based segmentation networks to tackle this issue and
facilitate the detection of sparsely distributed objects, particularly those
that are occluded or distant. Furthermore, we integrate a distance aware data
augmentation (DADA) technique to enhance the models capability to recognize
these sparsely distributed objects by generating specialized training samples.
Our approach offers a versatile solution that can be seamlessly integrated into
various 3D frameworks and 2D semantic segmentation methods, resulting in
significantly improved overall detection accuracy. Evaluation on the KITTI and
nuScenes datasets demonstrates substantial enhancements in both 3D and birds
eye view (BEV) detection benchmarks
Related papers
- Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments [67.83787474506073]
We tackle the limitations of current LiDAR-based 3D object detection systems.
We introduce a universal textscFind n' Propagate approach for 3D OV tasks.
We achieve up to a 3.97-fold increase in Average Precision (AP) for novel object classes.
arXiv Detail & Related papers (2024-03-20T12:51:30Z) - UniMODE: Unified Monocular 3D Object Detection [70.27631528933482]
We build a detector based on the bird's-eye-view (BEV) detection paradigm.
We propose an uneven BEV grid design to handle the convergence instability caused by the challenges.
A unified detector UniMODE is derived, which surpasses the previous state-of-the-art on the challenging Omni3D dataset.
arXiv Detail & Related papers (2024-02-28T18:59:31Z) - 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z) - Towards Generalizable Multi-Camera 3D Object Detection via Perspective
Debiasing [28.874014617259935]
Multi-Camera 3D Object Detection (MC3D-Det) has gained prominence with the advent of bird's-eye view (BEV) approaches.
We propose a novel method that aligns 3D detection with 2D camera plane results, ensuring consistent and accurate detections.
arXiv Detail & Related papers (2023-10-17T15:31:28Z) - Semantics-aware LiDAR-Only Pseudo Point Cloud Generation for 3D Object
Detection [0.7234862895932991]
Recent advances introduced pseudo-LiDAR, i.e., synthetic dense point clouds, using additional modalities such as cameras to enhance 3D object detection.
We present a novel LiDAR-only framework that augments raw scans with dense pseudo point clouds by relying on LiDAR sensors and scene semantics.
arXiv Detail & Related papers (2023-09-16T09:18:47Z) - Towards Domain Generalization for Multi-view 3D Object Detection in
Bird-Eye-View [11.958753088613637]
We first analyze the causes of the domain gap for the MV3D-Det task.
To acquire a robust depth prediction, we propose to decouple the depth estimation from intrinsic parameters of the camera.
We modify the focal length values to create multiple pseudo-domains and construct an adversarial training loss to encourage the feature representation to be more domain-agnostic.
arXiv Detail & Related papers (2023-03-03T02:59:13Z) - AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z) - Geometry-aware data augmentation for monocular 3D object detection [18.67567745336633]
This paper focuses on monocular 3D object detection, one of the essential modules in autonomous driving systems.
A key challenge is that the depth recovery problem is ill-posed in monocular data.
We conduct a thorough analysis to reveal how existing methods fail to robustly estimate depth when different geometry shifts occur.
We convert the aforementioned manipulations into four corresponding 3D-aware data augmentation techniques.
arXiv Detail & Related papers (2021-04-12T23:12:48Z) - Improving Point Cloud Semantic Segmentation by Learning 3D Object
Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform great for well represented classes.
We propose a novel Aware 3D Semantic Detection (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them, however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3d parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.