Pointly-supervised 3D Scene Parsing with Viewpoint Bottleneck
- URL: http://arxiv.org/abs/2109.08553v1
- Date: Fri, 17 Sep 2021 13:54:20 GMT
- Title: Pointly-supervised 3D Scene Parsing with Viewpoint Bottleneck
- Authors: Liyi Luo, Beiwen Tian, Hao Zhao and Guyue Zhou
- Abstract summary: Given that point-wise semantic annotation is expensive, in this paper, we address the challenge of learning models with extremely sparse labels.
We propose a self-supervised 3D representation learning framework named viewpoint bottleneck.
- Score: 3.2790748006553643
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Semantic understanding of 3D point clouds is important for various robotics
applications. Given that point-wise semantic annotation is expensive, in this
paper, we address the challenge of learning models with extremely sparse
labels. The core problem is how to leverage numerous unlabeled points. To this
end, we propose a self-supervised 3D representation learning framework named
viewpoint bottleneck. It optimizes a mutual-information based objective, which
is applied on point clouds under different viewpoints. A principled analysis
shows that viewpoint bottleneck leads to an elegant surrogate loss function
that is suitable for large-scale point cloud data. Compared with prior
methods based on contrastive learning, viewpoint bottleneck operates on the
feature dimension instead of the sample dimension. This paradigm shift has
several advantages: it is easy to implement and tune, requires no negative
samples, and performs better on the target downstream task. We evaluate our
method on
the public benchmark ScanNet, under the pointly-supervised setting. We achieve
the best quantitative results among comparable solutions. Meanwhile, we
provide extensive qualitative inspections of various challenging scenes. These
demonstrate that our models can produce fairly good scene parsing results for
robotics applications. Our code, data and models will be made public.
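
Since viewpoint bottleneck trades the sample dimension for the feature dimension, its surrogate loss can be pictured as a decorrelation objective on a D x D cross-correlation matrix between two viewpoint-augmented encodings of the same cloud. Below is a minimal sketch under that reading; the standardization details and the off-diagonal weight `lambd` are illustrative assumptions, not the authors' released implementation:

```python
import torch

def viewpoint_bottleneck_loss(z_a, z_b, lambd=0.005):
    """Decorrelation loss between two viewpoint-augmented encodings.

    z_a, z_b: (N, D) per-point features of the same point cloud under two
    random viewpoints. lambd weights the redundancy-reduction term
    (illustrative default, not taken from the paper).
    """
    n, d = z_a.shape
    # Standardize each feature channel across the N points.
    z_a = (z_a - z_a.mean(dim=0)) / (z_a.std(dim=0) + 1e-6)
    z_b = (z_b - z_b.mean(dim=0)) / (z_b.std(dim=0) + 1e-6)
    # D x D cross-correlation matrix: feature dimension, not sample dimension,
    # so no negative samples are needed.
    c = (z_a.T @ z_b) / n
    # Matched channels should correlate perfectly; distinct channels should not.
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()
    return on_diag + lambd * off_diag
```

Because the matrix is D x D rather than N x N, the cost grows with feature width rather than with the number of unlabeled points, which is consistent with the abstract's claim of suitability for large-scale point cloud data.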
Related papers
- Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding [58.924180772480504]
3D visual grounding involves finding a target object in a 3D scene that corresponds to a given sentence query.
We propose to leverage weakly supervised annotations to learn the 3D visual grounding model.
We design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.
arXiv Detail & Related papers (2023-07-18T13:49:49Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
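
To make the occupancy pretext above concrete, here is a hedged sketch of a reconstruction head of this general shape; the layer sizes, the name `OccupancyPretextHead`, and the ray-based labeling comment are illustrative assumptions rather than ALSO's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OccupancyPretextHead(nn.Module):
    """Hypothetical pretext head: from a per-point backbone feature and a 3D
    query offset, predict whether the query location lies inside the
    underlying surface. Sizes are placeholders, not ALSO's configuration."""

    def __init__(self, feat_dim=96, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # occupancy logit
        )

    def forward(self, point_feats, query_offsets):
        # point_feats: (N, feat_dim); query_offsets: (N, 3) sampled around each point.
        return self.mlp(torch.cat([point_feats, query_offsets], dim=-1)).squeeze(-1)

def occupancy_pretext_loss(head, point_feats, query_offsets, occ_labels):
    # Labels can come for free from lidar geometry, e.g. space a ray traverses
    # is empty while a thin shell behind the hit point is occupied.
    logits = head(point_feats, query_offsets)
    return F.binary_cross_entropy_with_logits(logits, occ_labels.float())
```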
- VIBUS: Data-efficient 3D Scene Parsing with VIewpoint Bottleneck and Uncertainty-Spectrum Modeling [2.0624279915507047]
Training 3D scene parsing models with sparse supervision is an intriguing alternative.
We term this task as data-efficient 3D scene parsing.
We propose an effective two-stage framework named VIBUS to resolve it.
arXiv Detail & Related papers (2022-10-20T17:59:57Z)
- Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis [43.13887916301742]
This paper introduces a simple but effective point cloud cross-modality training (PointCMT) strategy to boost point cloud analysis.
To effectively acquire auxiliary knowledge from view images, we develop a teacher-student framework and formulate the cross-modal learning as a knowledge distillation problem.
We verify significant gains on various datasets using appealing backbones, i.e., PointNet++ and PointMLP equipped with PointCMT.
arXiv Detail & Related papers (2022-10-09T09:35:22Z)
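
The teacher-student formulation above can be sketched as a standard knowledge-distillation objective; the temperature and mixing weight below are illustrative defaults, not PointCMT's settings:

```python
import torch
import torch.nn.functional as F

def cross_modal_distillation_loss(student_logits, teacher_logits, labels,
                                  temperature=4.0, alpha=0.5):
    """Generic teacher-student distillation of the kind the cross-modal
    formulation reduces to.

    student_logits: point cloud network outputs, shape (B, C).
    teacher_logits: outputs of a network fed the auxiliary view images.
    """
    # Soft targets: align the student's tempered distribution with the teacher's.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: ordinary supervision from ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```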
- Data Augmentation-free Unsupervised Learning for 3D Point Cloud Understanding [61.30276576646909]
We propose an augmentation-free unsupervised approach for point clouds to learn transferable point-level features via soft clustering, named SoftClu.
We exploit the affiliation of points to their clusters as a proxy to enable self-training through a pseudo-label prediction task.
arXiv Detail & Related papers (2022-10-06T10:18:16Z)
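
A minimal sketch of the soft-clustering proxy, assuming cosine-similarity assignments to learnable centroids; the temperature `tau` and the hardening step are assumptions, not SoftClu's exact recipe:

```python
import torch
import torch.nn.functional as F

def soft_cluster_assignments(point_feats, centroids, tau=0.1):
    """Soft affiliation of each point feature to K learnable centroids.
    point_feats: (N, D); centroids: (K, D)."""
    sim = F.normalize(point_feats, dim=-1) @ F.normalize(centroids, dim=-1).T
    soft = F.softmax(sim / tau, dim=-1)   # (N, K) cluster affiliations
    pseudo = soft.argmax(dim=-1)          # hardened pseudo-labels, (N,)
    return soft, pseudo

def self_training_loss(cluster_logits, pseudo_labels):
    # A point-wise head learns to predict its own cluster affiliation,
    # driving the backbone toward transferable point-level features.
    return F.cross_entropy(cluster_logits, pseudo_labels)
```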
- Unsupervised Representation Learning for 3D Point Cloud Data [66.92077180228634]
We propose a simple yet effective approach for unsupervised point cloud learning.
In particular, we identify a very useful transformation which generates a good contrastive version of an original point cloud.
We conduct experiments on three downstream tasks which are 3D object classification, shape part segmentation and scene segmentation.
arXiv Detail & Related papers (2021-10-13T10:52:45Z)
- Semi-supervised 3D Object Detection via Adaptive Pseudo-Labeling [18.209409027211404]
3D object detection is an important task in computer vision.
Most existing methods require a large number of high-quality 3D annotations, which are expensive to collect.
We propose a novel semi-supervised framework based on pseudo-labeling for outdoor 3D object detection tasks.
arXiv Detail & Related papers (2021-08-15T02:58:43Z)
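
The pseudo-labeling loop can be sketched as follows; a fixed confidence gate stands in for the paper's adaptive criterion, and all names here are hypothetical:

```python
import torch

def filter_pseudo_labels(boxes, scores, threshold=0.7):
    """Keep only confident detections on unlabeled scans as pseudo-labels.
    boxes: (M, 7) box parameters; scores: (M,) detection confidences.
    A fixed threshold is shown; the paper's adaptive scheme is more involved."""
    keep = scores >= threshold
    return boxes[keep], scores[keep]

# Hypothetical outer loop:
# 1. Train a detector on the small labeled split.
# 2. Run it on unlabeled point clouds and keep high-confidence boxes via
#    filter_pseudo_labels, treating them as ground truth.
# 3. Retrain on labeled + pseudo-labeled data and optionally iterate.
```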
- Point Discriminative Learning for Unsupervised Representation Learning on 3D Point Clouds [54.31515001741987]
We propose a point discriminative learning method for unsupervised representation learning on 3D point clouds.
We achieve this by imposing a novel point discrimination loss on the middle level and global level point features.
Our method learns powerful representations and achieves new state-of-the-art performance.
arXiv Detail & Related papers (2021-08-04T15:11:48Z)
- PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding [107.02479689909164]
In this work, we aim at facilitating research on 3D representation learning.
We measure the effect of unsupervised pre-training on a large source set of 3D scenes.
arXiv Detail & Related papers (2020-07-21T17:59:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.