Text3DAug -- Prompted Instance Augmentation for LiDAR Perception
- URL: http://arxiv.org/abs/2408.14253v2
- Date: Tue, 27 Aug 2024 10:50:13 GMT
- Title: Text3DAug -- Prompted Instance Augmentation for LiDAR Perception
- Authors: Laurenz Reichardt, Luca Uhr, Oliver Wasenmüller
- Abstract summary: LiDAR data from urban scenarios poses unique challenges, such as heterogeneous characteristics and inherent class imbalance.
We propose Text3DAug, a novel approach leveraging generative models for instance augmentation.
Text3DAug does not depend on labeled data and is the first of its kind to generate instances and annotations from text.
- Score: 1.1633929083694388
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: LiDAR data from urban scenarios poses unique challenges, such as heterogeneous characteristics and inherent class imbalance. Therefore, large-scale datasets are necessary to apply deep learning methods. Instance augmentation has emerged as an efficient method to increase dataset diversity. However, current methods require the time-consuming curation of 3D models or costly manual data annotation. To overcome these limitations, we propose Text3DAug, a novel approach leveraging generative models for instance augmentation. Text3DAug does not depend on labeled data and is the first of its kind to generate instances and annotations from text. This allows for a fully automated pipeline, eliminating the need for manual effort in practical applications. Additionally, Text3DAug is sensor agnostic and can be applied regardless of the LiDAR sensor used. Comprehensive experimental analysis on LiDAR segmentation, detection and novel class discovery demonstrates that Text3DAug is effective in supplementing existing methods or as a standalone method, performing on par with or better than established methods while overcoming their specific drawbacks. The code is publicly available.
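The abstract describes a pipeline in which instances and their annotations are produced from a text prompt and inserted into a scan without manual labeling. A minimal sketch of that structure, with the generative text-to-3D step replaced by a hypothetical stand-in stub (the real pipeline would call a generative model here):

```python
import numpy as np

def generate_instance_from_text(prompt, n_points=256, rng=None):
    """Stand-in for the text-to-3D generative step: emits a random blob of
    points. The actual pipeline would invoke a generative model instead."""
    rng = rng or np.random.default_rng(0)
    return rng.normal(scale=0.5, size=(n_points, 3))

def augment_scan(scan_xyz, scan_labels, prompt, class_id, position):
    """Insert a prompted instance into a LiDAR scan and label its points,
    so the annotation comes for free from the prompt's class."""
    instance = generate_instance_from_text(prompt) + np.asarray(position)
    aug_xyz = np.vstack([scan_xyz, instance])
    aug_labels = np.concatenate([scan_labels,
                                 np.full(len(instance), class_id)])
    return aug_xyz, aug_labels

scene = np.random.default_rng(1).uniform(-20, 20, size=(1000, 3))
labels = np.zeros(len(scene), dtype=int)
aug_scan, aug_labels = augment_scan(scene, labels, "a pedestrian",
                                    class_id=7, position=(5.0, 2.0, 0.0))
```

Because the label is derived from the prompt rather than human annotation, the loop is fully automated, matching the abstract's claim of eliminating manual effort.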
Related papers
- TeFF: Tracking-enhanced Forgetting-free Few-shot 3D LiDAR Semantic Segmentation [10.628870775939161]
This paper addresses the limitations of current few-shot semantic segmentation by exploiting the temporal continuity of LiDAR data.
We employ a tracking model to generate pseudo-ground-truths from a sequence of LiDAR frames, expanding the dataset so the model can learn novel classes.
We incorporate LoRA, a technique that reduces the number of trainable parameters, thereby preserving the model's performance on base classes while improving its adaptability to novel classes.
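LoRA reduces trainable parameters by freezing the base weight and learning only a low-rank update. A minimal numpy sketch of the idea (not the paper's implementation; rank, scaling, and initialization here are illustrative defaults):

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA-style layer: frozen weight W adapted by the low-rank
    update (alpha/r) * B @ A; only A and B would be trained."""
    def __init__(self, W, r=4, alpha=8, rng=None):
        rng = rng or np.random.default_rng(0)
        d_out, d_in = W.shape
        self.W = W                                        # frozen base weight
        self.A = rng.normal(scale=0.01, size=(r, d_in))   # trainable
        self.B = np.zeros((d_out, r))                     # trainable, zero-init
        self.scale = alpha / r

    def __call__(self, x):
        return x @ (self.W + self.scale * self.B @ self.A).T

W = np.random.default_rng(1).normal(size=(8, 16))
layer = LoRALinear(W)
x = np.ones((2, 16))
# With B zero-initialized, the adapted layer initially matches the frozen
# base exactly, which is what preserves performance on base classes.
```

Here only r*(d_in + d_out) = 96 parameters are trainable instead of the full 128, and the zero-initialized B makes the model start from the base behavior before adapting to novel classes.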
arXiv Detail & Related papers (2024-08-28T09:18:36Z) - Refining the ONCE Benchmark with Hyperparameter Tuning [45.55545585587993]
This work focuses on the evaluation of semi-supervised learning approaches for point cloud data.
Data annotation is of paramount importance in the context of LiDAR applications.
We show that improvements from previous semi-supervised methods may not be as profound as previously thought.
arXiv Detail & Related papers (2023-11-10T13:39:07Z) - DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection [72.25697820290502]
This work introduces a straightforward and efficient strategy to identify potential novel classes through zero-shot classification.
We refer to this approach as the self-training strategy, which enhances recall and accuracy for novel classes without requiring extra annotations, datasets, or re-training.
Empirical evaluations on three datasets, including LVIS, V3Det, and COCO, demonstrate significant improvements over the baseline performance.
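The self-training strategy described above amounts to keeping region proposals whose zero-shot classification is both confident and assigned to a novel class, then using them as pseudo-labels. A hedged sketch of that selection step (thresholds and array layout are assumptions, not the paper's exact scheme):

```python
import numpy as np

def select_pseudo_labels(proposal_scores, known_mask, tau=0.8):
    """proposal_scores: (n_proposals, n_classes) zero-shot scores over
    known + novel classes; known_mask: True where a class is known.
    Keep proposals whose top class is novel and confident enough."""
    top = proposal_scores.argmax(axis=1)
    conf = proposal_scores.max(axis=1)
    keep = (~known_mask[top]) & (conf >= tau)
    return top[keep], np.flatnonzero(keep)

scores = np.array([[0.10, 0.05, 0.85],   # confident, top class is novel
                   [0.90, 0.05, 0.05],   # confident, but a known class
                   [0.30, 0.30, 0.40]])  # too uncertain to trust
known = np.array([True, True, False])    # class 2 is the novel one
classes, idx = select_pseudo_labels(scores, known)
```

Only the first proposal survives: the second is a known class (already supervised) and the third falls below the confidence threshold, which is how the strategy adds novel-class supervision without extra annotation.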
arXiv Detail & Related papers (2023-10-02T17:52:24Z) - Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatially quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z) - SPOT: Scalable 3D Pre-training via Occupancy Prediction for Learning Transferable 3D Representations [76.45009891152178]
The pretraining-finetuning approach can alleviate the labeling burden by fine-tuning a pre-trained backbone across various downstream datasets and tasks.
We show, for the first time, that general representation learning can be achieved through the task of occupancy prediction.
Our findings will facilitate the understanding of LiDAR points and pave the way for future advancements in LiDAR pre-training.
arXiv Detail & Related papers (2023-09-19T11:13:01Z) - 360$^\circ$ from a Single Camera: A Few-Shot Approach for LiDAR Segmentation [0.0]
Deep learning applications on LiDAR data suffer from a strong domain gap when applied to different sensors or tasks.
In practical applications, labeled data is costly and time-consuming to obtain.
We propose ImageTo360, an effective and streamlined few-shot approach to label-efficient LiDAR segmentation.
arXiv Detail & Related papers (2023-09-12T13:04:41Z) - Hierarchical Supervision and Shuffle Data Augmentation for 3D Semi-Supervised Object Detection [90.32180043449263]
State-of-the-art 3D object detectors are usually trained on large-scale datasets with high-quality 3D annotations.
A natural remedy is to adopt semi-supervised learning (SSL) by leveraging a limited amount of labeled samples and abundant unlabeled samples.
This paper introduces a novel approach of Hierarchical Supervision and Shuffle Data Augmentation (HSSDA), which is a simple yet effective teacher-student framework.
arXiv Detail & Related papers (2023-04-04T02:09:32Z) - Context-Aware Data Augmentation for LIDAR 3D Object Detection [4.084927826063192]
GT-sample augmentation effectively improves detection performance by inserting ground-truth objects into the LiDAR frame during training.
However, these samples are often placed in unreasonable areas, which misleads the model into learning incorrect context between targets and backgrounds.
We propose a context-aware data augmentation method (CA-aug) which ensures the reasonable placement of inserted objects.
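The core of such GT-sample-style augmentation is pasting instance points into a scene while rejecting implausible placements. A simplified sketch with a naive occupancy check standing in for CA-aug's actual validity analysis (the clearance radius and 2D footprint test are assumptions for illustration):

```python
import numpy as np

def paste_instance(scene_xyz, instance_xyz, position, min_clearance=0.5):
    """Paste an instance point cloud at `position`, but reject placements
    whose 2D footprint would overlap existing scene points."""
    placed = instance_xyz + np.asarray(position)
    center = placed[:, :2].mean(axis=0)
    radius = np.linalg.norm(placed[:, :2] - center, axis=1).max() + min_clearance
    dists = np.linalg.norm(scene_xyz[:, :2] - center, axis=1)
    if (dists < radius).any():      # occupied region: unreasonable placement
        return scene_xyz, False
    return np.vstack([scene_xyz, placed]), True

scene = np.array([[10.0, 10.0, 0.0]])          # one occupied spot
car = np.random.default_rng(0).normal(scale=0.5, size=(50, 3))
_, ok_far = paste_instance(scene, car, (0.0, 0.0, 0.0))    # free space
_, ok_near = paste_instance(scene, car, (10.0, 10.0, 0.0)) # on top of a point
```

The first placement lands in free space and is accepted; the second collides with existing geometry and is rejected, which is the kind of context check that keeps inserted objects out of unreasonable areas.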
arXiv Detail & Related papers (2022-11-20T02:45:18Z) - Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding [62.17020485045456]
It is commonly assumed in semi-supervised learning (SSL) that the unlabeled data are drawn from the same distribution as that of the labeled ones.
We propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data would be prioritized.
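Sample weighting of this kind typically down-weights unlabeled samples that look out-of-distribution so only conducive ones drive the unsupervised loss. A simple stand-in for the paper's scheme (the linear weighting and threshold are assumptions, not the published formulation):

```python
import numpy as np

def weighted_unlabeled_loss(losses, ood_scores, tau=0.5):
    """Weight each unlabeled sample's loss by how in-distribution it looks:
    weight 1 at ood_score 0, falling linearly to 0 at ood_score >= tau."""
    w = np.clip(1.0 - ood_scores / tau, 0.0, 1.0)
    return float((w * losses).sum() / max(w.sum(), 1e-8))

losses = np.array([1.0, 2.0, 3.0])
ood = np.array([0.0, 0.25, 0.9])    # third sample looks out-of-distribution
loss = weighted_unlabeled_loss(losses, ood)
```

The third sample receives zero weight, so a likely open-set outlier contributes nothing to the SSL objective even though its raw loss is the largest.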
arXiv Detail & Related papers (2022-05-02T16:09:17Z) - Bridging the Reality Gap for Pose Estimation Networks using Sensor-Based Domain Randomization [1.4290119665435117]
Methods trained on synthetic data use 2D images, as domain randomization in 2D is more developed.
Our method integrates the 3D data into the network to increase the accuracy of the pose estimation.
Experiments on three large pose estimation benchmarks show that the presented method outperforms previous methods trained on synthetic data.
arXiv Detail & Related papers (2020-11-17T09:12:11Z) - SelfVoxeLO: Self-supervised LiDAR Odometry with Voxel-based Deep Neural Networks [81.64530401885476]
We propose a self-supervised LiDAR odometry method, dubbed SelfVoxeLO, to tackle these two difficulties.
Specifically, we propose a 3D convolution network to process the raw LiDAR data directly, which extracts features that better encode the 3D geometric patterns.
We evaluate our method's performance on two large-scale datasets, i.e., KITTI and Apollo-SouthBay.
arXiv Detail & Related papers (2020-10-19T09:23:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.