Less is More: Reducing Task and Model Complexity for 3D Point Cloud
Semantic Segmentation
- URL: http://arxiv.org/abs/2303.11203v2
- Date: Tue, 28 Mar 2023 14:56:30 GMT
- Title: Less is More: Reducing Task and Model Complexity for 3D Point Cloud
Semantic Segmentation
- Authors: Li Li, Hubert P. H. Shum, Toby P. Breckon
- Abstract summary: New pipeline requires fewer ground-truth annotations to achieve superior segmentation accuracy.
New Sparse Depthwise Separable Convolution module significantly reduces the network parameter count.
New Spatio-Temporal Redundant Frame Downsampling (ST-RFD) method extracts a more diverse subset of training data frame samples.
- Score: 26.94284739177754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Whilst the availability of 3D LiDAR point cloud data has significantly grown
in recent years, annotation remains expensive and time-consuming, leading to a
demand for semi-supervised semantic segmentation methods with application
domains such as autonomous driving. Existing work very often employs relatively
large segmentation backbone networks to improve segmentation accuracy, at the
expense of computational costs. In addition, many use uniform sampling to
reduce the ground-truth data required for learning, often resulting in
sub-optimal performance. To address these issues, we propose a new pipeline
that employs a smaller architecture, requiring fewer ground-truth annotations
to achieve superior segmentation accuracy compared to contemporary approaches.
This is facilitated via a novel Sparse Depthwise Separable Convolution module
that significantly reduces the network parameter count while retaining overall
task performance. To effectively sub-sample our training data, we propose a new
Spatio-Temporal Redundant Frame Downsampling (ST-RFD) method that leverages
knowledge of sensor motion within the environment to extract a more diverse
subset of training data frame samples. To leverage the use of limited annotated
data samples, we further propose a soft pseudo-label method informed by LiDAR
reflectivity. Our method outperforms contemporary semi-supervised work in terms
of mIoU, using less labeled data, on the SemanticKITTI (59.5@5%) and
ScribbleKITTI (58.1@5%) benchmark datasets, based on a 2.3x reduction in model
parameters and 641x fewer multiply-add operations whilst also demonstrating
significant performance improvement on limited training data (i.e., Less is
More).
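As a hedged illustration of how a depthwise separable design reduces parameter count, the minimal dense PyTorch comparison below contrasts a standard 3D convolution with a depthwise + pointwise pair. The channel and kernel sizes are assumptions, and the paper's actual Sparse Depthwise Separable Convolution operates on sparse voxelised LiDAR data, which is not reproduced here.

```python
# Minimal sketch (not the paper's sparse module): parameter savings of a
# depthwise separable 3D convolution versus a standard dense 3D convolution.
# Channel count and kernel size below are illustrative assumptions.
import torch.nn as nn

c_in, c_out, k = 64, 64, 3

standard = nn.Conv3d(c_in, c_out, kernel_size=k, padding=1, bias=False)

depthwise_separable = nn.Sequential(
    # Depthwise: one k*k*k filter per input channel (groups=c_in).
    nn.Conv3d(c_in, c_in, kernel_size=k, padding=1, groups=c_in, bias=False),
    # Pointwise: 1x1x1 convolution mixes information across channels.
    nn.Conv3d(c_in, c_out, kernel_size=1, bias=False),
)

def count(m):
    return sum(p.numel() for p in m.parameters())

print(f"standard 3D conv:    {count(standard):,} params")              # 110,592
print(f"depthwise separable: {count(depthwise_separable):,} params")   # 5,824
print(f"reduction factor:    {count(standard) / count(depthwise_separable):.1f}x")
```

With these assumed sizes the separable variant uses roughly 19x fewer weights, which is the effect the paper exploits, in sparse form, to shrink its backbone.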
Related papers
- Exploiting Local Features and Range Images for Small Data Real-Time Point Cloud Semantic Segmentation [4.02235104503587]
In this paper, we harness the information from the three-dimensional representation to proficiently capture local features.
A GPU-based KDTree allows for rapid building, querying, and enhancing projection with straightforward operations.
We show that a reduced version of our model not only demonstrates strong competitiveness against full-scale state-of-the-art models but also operates in real-time.
arXiv Detail & Related papers (2024-10-14T13:49:05Z)
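As a side note on the KD-tree querying mentioned above, the sketch below shows the neighbour-query pattern on a point cloud using scipy's CPU cKDTree; the cited paper builds a GPU-based KD-tree, and the point count and neighbour count here are illustrative assumptions.

```python
# Illustrative CPU sketch of KD-tree neighbour queries on a point cloud
# (the cited paper uses a GPU-based KD-tree; scipy's cKDTree is used here
# purely to show the build-once / query-many pattern).
import numpy as np
from scipy.spatial import cKDTree

points = np.random.rand(100_000, 3).astype(np.float32)  # synthetic x, y, z

tree = cKDTree(points)                 # build once
dists, idx = tree.query(points, k=16)  # 16 nearest neighbours per point

# Gather neighbourhoods for local-feature extraction, e.g. centred offsets.
neighbours = points[idx]                          # (N, 16, 3)
local_offsets = neighbours - points[:, None, :]   # (N, 16, 3)
print(local_offsets.shape)
```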
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find solutions accessible via our training procedure, including the gradient and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- Exploring Learning Complexity for Efficient Downstream Dataset Pruning [8.990878450631596]
Existing dataset pruning methods require training on the entire dataset.
We propose a straightforward, novel, and training-free hardness score named Distorting-based Learning Complexity (DLC).
Our method is motivated by the observation that easy samples, which are learned faster, can also be learned with fewer parameters.
arXiv Detail & Related papers (2024-02-08T02:29:33Z)
- KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training [2.8804804517897935]
We propose a method for hiding the least-important samples during the training of deep neural networks.
We adaptively find samples to exclude in a given epoch based on their contribution to the overall learning process.
Our method can reduce total training time by up to 22% while impacting accuracy by only 0.4% compared to the baseline.
arXiv Detail & Related papers (2023-10-16T06:19:29Z)
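A toy sketch of the sample-hiding idea summarised above: each epoch, skip a fraction of the samples whose last observed loss was smallest. The actual KAKURENBO selection rule and schedule are more involved; every name and number below is an illustrative assumption.

```python
# Rough sketch of per-epoch sample hiding: exclude a fraction of the samples
# whose last observed loss was smallest (a stand-in for "least important").
import numpy as np

rng = np.random.default_rng(0)
num_samples, hide_fraction = 10_000, 0.2
difficulty = rng.random(num_samples)      # fake per-sample hardness
last_loss = np.full(num_samples, np.inf)  # no sample has been seen yet

def train_step(i):
    # Placeholder for a real forward/backward pass on sample i;
    # returns a noisy fake loss proportional to the sample's difficulty.
    return difficulty[i] * (0.5 + 0.5 * rng.random())

for epoch in range(5):
    if epoch == 0:
        keep = rng.permutation(num_samples)  # warm-up: see every sample once
    else:
        # Hide the fraction of samples with the smallest loss when last seen.
        keep = rng.permutation(np.argsort(last_loss)[int(hide_fraction * num_samples):])
    for i in keep:
        last_loss[i] = train_step(i)
    print(f"epoch {epoch}: trained on {len(keep)} / {num_samples} samples")
```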
- Scaling Relationship on Learning Mathematical Reasoning with Large Language Models [75.29595679428105]
We investigate how the pre-training loss, supervised data amount, and augmented data amount influence the reasoning performances of a supervised LLM.
We find that rejection samples from multiple models push LLaMA-7B to an accuracy of 49.3% on GSM8K, which significantly outperforms the supervised fine-tuning (SFT) accuracy of 35.9%.
arXiv Detail & Related papers (2023-08-03T15:34:01Z)
- A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation [42.2398858786125]
Deep learning in computer vision has achieved great success with the price of large-scale labeled training data.
The uncontrollable data collection process produces non-IID training and test data, where undesired duplication may exist.
To circumvent them, an alternative is to generate synthetic data via 3D rendering with domain randomization.
arXiv Detail & Related papers (2023-03-16T09:03:52Z)
- Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding [62.17020485045456]
It is commonly assumed in semi-supervised learning (SSL) that the unlabeled data are drawn from the same distribution as that of the labeled ones.
We propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data would be prioritized.
arXiv Detail & Related papers (2022-05-02T16:09:17Z)
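The sketch below illustrates one generic form of such sample weighting: unlabeled points whose predictions are low-confidence (and therefore possibly out-of-distribution) are down-weighted in the loss. This is not the cited paper's actual weighting scheme; the threshold and weighting function are assumptions.

```python
# Toy sketch of confidence-based weighting of unlabeled samples for
# semi-supervised training; not the weighting scheme of the paper above.
import torch
import torch.nn.functional as F

def weighted_unlabeled_loss(logits, pseudo_labels, threshold=0.9):
    probs = F.softmax(logits, dim=1)
    confidence, _ = probs.max(dim=1)
    # Soft weights in [0, 1]; low-confidence samples contribute little.
    weights = torch.clamp(confidence / threshold, max=1.0) ** 2
    per_sample = F.cross_entropy(logits, pseudo_labels, reduction="none")
    return (weights * per_sample).mean()

logits = torch.randn(8, 20)        # 8 unlabeled points, 20 classes (assumed)
pseudo = logits.argmax(dim=1)      # hypothetical pseudo-labels
print(weighted_unlabeled_loss(logits, pseudo))
```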
- Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds.
The key to our approach is to use random point sampling instead of more complex point selection approaches.
Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z)
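To illustrate why random point sampling scales to very large clouds, the minimal sketch below performs progressive random downsampling; the surrounding RandLA-Net feature aggregation is omitted, and the sizes and ratios are assumptions.

```python
# Minimal sketch of progressive random point sampling (RandLA-Net's key
# design choice): O(N) downsampling with no distance computations.
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((1_000_000, 3), dtype=np.float32)  # ~1M synthetic points

def random_sample(pts, ratio=0.25):
    """Keep a random `ratio` of the points."""
    idx = rng.choice(len(pts), size=int(len(pts) * ratio), replace=False)
    return pts[idx]

sub = points
for _ in range(4):            # progressive downsampling across encoder layers
    sub = random_sample(sub)
print(len(points), "->", len(sub))   # 1,000,000 -> 3,906 points
```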
- Learning to Count in the Crowd from Limited Labeled Data [109.2954525909007]
We focus on reducing the annotation efforts by learning to count in the crowd from a limited number of labeled samples.
Specifically, we propose a Gaussian Process-based iterative learning mechanism that involves estimation of pseudo-ground truth for the unlabeled data.
arXiv Detail & Related papers (2020-07-07T04:17:01Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.