LidarAugment: Searching for Scalable 3D LiDAR Data Augmentations
- URL: http://arxiv.org/abs/2210.13488v1
- Date: Mon, 24 Oct 2022 18:00:04 GMT
- Title: LidarAugment: Searching for Scalable 3D LiDAR Data Augmentations
- Authors: Zhaoqi Leng, Guowang Li, Chenxi Liu, Ekin Dogus Cubuk, Pei Sun, Tong
He, Dragomir Anguelov, Mingxing Tan
- Abstract summary: LidarAugment is a search-based data augmentation strategy for 3D object detection.
We show LidarAugment can be customized for different model architectures.
It consistently improves convolution-based UPillars/StarNet/RSN and transformer-based SWFormer.
- Score: 55.45435708426761
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data augmentations are important in training high-performance 3D object
detectors for point clouds. Despite recent efforts on designing new data
augmentations, perhaps surprisingly, most state-of-the-art 3D detectors only
use a few simple data augmentations. In particular, different from 2D image
data augmentations, 3D data augmentations need to account for different
representations of input data and require being customized for different
models, which introduces significant overhead. In this paper, we resort to a
search-based approach, and propose LidarAugment, a practical and effective data
augmentation strategy for 3D object detection. Unlike previous approaches where
all augmentation policies are tuned in an exponentially large search space, we
propose to factorize and align the search space of each data augmentation,
which cuts down the 20+ hyperparameters to 2, and significantly reduces the
search complexity. We show LidarAugment can be customized for different model
architectures with different input representations by a simple 2D grid search,
and consistently improve both convolution-based UPillars/StarNet/RSN and
transformer-based SWFormer. Furthermore, LidarAugment mitigates overfitting and
allows us to scale up 3D detectors to a much larger capacity. In particular, combined
with the latest 3D detectors, our LidarAugment achieves a new state-of-the-art
74.8 mAPH L2 on the Waymo Open Dataset.
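The core idea of the abstract — factorizing and aligning each augmentation's search space so that only two shared hyperparameters remain, then tuning them by a 2D grid search — can be illustrated with a toy sketch. This is not the authors' implementation: the augmentation ops (`global_rotate_z`, `global_scale`), their parameter ranges, and the `score_fn` callback are all hypothetical stand-ins.

```python
import itertools
import math
import random

# Hypothetical augmentations standing in for LidarAugment's ops; the paper's
# actual policy set and parameter ranges are not reproduced here.
def global_rotate_z(points, angle):
    """Rotate every (x, y, z) point about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def global_scale(points, factor):
    """Scale the whole scene uniformly about the origin."""
    return [(x * factor, y * factor, z * factor) for x, y, z in points]

def augment(points, m, p, rng):
    """Apply each op with shared probability p; the shared magnitude
    m in [0, 1] scales every op's aligned parameter range."""
    if rng.random() < p:
        points = global_rotate_z(points, rng.uniform(-m, m) * math.pi / 4)
    if rng.random() < p:
        points = global_scale(points, 1.0 + rng.uniform(-m, m) * 0.1)
    return points

def grid_search(score_fn, magnitudes, probs):
    """The 2D grid search: evaluate every (m, p) pair, keep the best."""
    return max(itertools.product(magnitudes, probs),
               key=lambda mp: score_fn(*mp))
```

Because every augmentation shares the same two knobs, the search cost grows with the grid resolution rather than with the number of augmentation ops, which is what makes the 20+ → 2 hyperparameter reduction practical.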
Related papers
- What Matters in Range View 3D Object Detection [15.147558647138629]
Lidar-based perception pipelines rely on 3D object detection models to interpret complex scenes.
We achieve state-of-the-art amongst range-view 3D object detection models without using multiple techniques proposed in past range-view literature.
arXiv Detail & Related papers (2024-07-23T18:42:37Z) - Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes [65.22070581594426]
"Implicit-Zoo" is a large-scale dataset requiring thousands of GPU training days to facilitate research and development in this field.
We showcase two immediate benefits, as it enables us to: (1) learn token locations for transformer models; (2) directly regress the 3D camera poses of 2D images with respect to NeRF models.
This in turn leads to improved performance in all three tasks of image classification, semantic segmentation, and 3D pose regression, thereby unlocking new avenues for research.
arXiv Detail & Related papers (2024-06-25T10:20:44Z) - 3D Data Augmentation for Driving Scenes on Camera [50.41413053812315]
We propose a 3D data augmentation approach termed Drive-3DAug, aiming at augmenting the driving scenes on camera in the 3D space.
We first utilize Neural Radiance Field (NeRF) to reconstruct the 3D models of background and foreground objects.
Then, augmented driving scenes can be obtained by placing the 3D objects with adapted location and orientation at the pre-defined valid region of backgrounds.
arXiv Detail & Related papers (2023-03-18T05:51:05Z) - DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection [83.18142309597984]
Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving.
We develop a family of generic multi-modal 3D detection models named DeepFusion, which is more accurate than previous methods.
arXiv Detail & Related papers (2022-03-15T18:46:06Z) - Exploring 2D Data Augmentation for 3D Monocular Object Detection [0.2936007114555107]
Many standard 2D object detection data augmentation techniques do not extend to 3D boxes.
We propose two novel augmentations for monocular 3D detection without a requirement for novel view synthesis.
arXiv Detail & Related papers (2021-04-21T22:43:42Z) - Part-Aware Data Augmentation for 3D Object Detection in Point Cloud [33.59724834383291]
3D labels carry richer and more sophisticated structural information than 2D labels, enabling more diverse and effective data augmentation.
We propose part-aware data augmentation (PA-AUG) that better utilizes the rich information in 3D labels.
We show that PA-AUG not only increases performance for a given dataset but also is robust to corrupted data.
arXiv Detail & Related papers (2020-07-27T08:47:19Z) - Quantifying Data Augmentation for LiDAR based 3D Object Detection [139.64869289514525]
In this work, we shed light on different data augmentation techniques commonly used in Light Detection and Ranging (LiDAR) based 3D Object Detection.
We investigate a variety of global and local augmentation techniques, where global augmentation techniques are applied to the entire point cloud of a scene and local augmentation techniques are only applied to points belonging to individual objects in the scene.
Our findings show that both types of data augmentation can improve performance, but it also turns out that some techniques, such as individual object translation, can be counterproductive and hurt overall performance.
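The global/local distinction described above can be made concrete with a toy point cloud: a global augmentation transforms every point in the scene, while a local one touches only the points inside a single object's box. The function names and the axis-aligned box representation below are illustrative assumptions, not the paper's API.

```python
def global_flip_x(points):
    """Global augmentation: mirror the entire scene across the x-axis."""
    return [(-x, y, z) for x, y, z in points]

def local_translate(points, box, offset):
    """Local augmentation: shift only the points inside one object's
    axis-aligned box, given as (min_corner, max_corner)."""
    (x0, y0, z0), (x1, y1, z1) = box
    dx, dy, dz = offset
    out = []
    for x, y, z in points:
        if x0 <= x <= x1 and y0 <= y <= y1 and z0 <= z <= z1:
            out.append((x + dx, y + dy, z + dz))
        else:
            out.append((x, y, z))
    return out
```

A local op like `local_translate` is exactly the kind of per-object transform the study found can backfire: moving an object without moving its occlusion shadow or ground contact can produce physically implausible scenes.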
arXiv Detail & Related papers (2020-04-03T16:09:14Z) - Improving 3D Object Detection through Progressive Population Based Augmentation [91.56261177665762]
We present the first attempt to automate the design of data augmentation policies for 3D object detection.
We introduce the Progressive Population Based Augmentation (PPBA) algorithm, which learns to optimize augmentation strategies by narrowing down the search space and adopting the best parameters discovered in previous iterations.
We find that PPBA may be up to 10x more data efficient than baseline 3D detection models without augmentation, highlighting that 3D detection models may achieve competitive accuracy with far fewer labeled examples.
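The population-based idea behind PPBA — carry the best parameters discovered in previous iterations forward and perturb them, rather than re-searching the full space — can be sketched in a few lines. This is a highly simplified toy loop, not the authors' algorithm; `score_fn` and the mutation scheme are assumptions for illustration.

```python
import random

def population_search(score_fn, init_params, iterations=5, pop_size=4,
                      mutate_scale=0.2, rng=None):
    """Toy population-based search: each iteration perturbs the best
    parameter vector found so far and keeps the top scorer, echoing
    PPBA's reuse of previously discovered parameters."""
    rng = rng or random.Random(0)
    best, best_score = list(init_params), score_fn(init_params)
    for _ in range(iterations):
        population = [
            [p + rng.uniform(-mutate_scale, mutate_scale) for p in best]
            for _ in range(pop_size)
        ]
        for cand in population:
            s = score_fn(cand)
            if s > best_score:
                best, best_score = cand, s
    return best, best_score
```

Because each iteration searches only a small neighborhood of the current best, the effective search space narrows over time, which is the mechanism the abstract credits for PPBA's data efficiency.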
arXiv Detail & Related papers (2020-04-02T05:57:02Z) - Boundary-Aware Dense Feature Indicator for Single-Stage 3D Object Detection from Point Clouds [32.916690488130506]
We propose a universal module that helps 3D detectors focus on the densest region of the point clouds in a boundary-aware manner.
Experiments on KITTI dataset show that DENFI improves the performance of the baseline single-stage detector remarkably.
arXiv Detail & Related papers (2020-04-01T01:21:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.