Generalized 3D Self-supervised Learning Framework via Prompted
Foreground-Aware Feature Contrast
- URL: http://arxiv.org/abs/2303.06388v4
- Date: Fri, 1 Dec 2023 15:52:32 GMT
- Title: Generalized 3D Self-supervised Learning Framework via Prompted
Foreground-Aware Feature Contrast
- Authors: Kangcheng Liu, Xinhu Zheng, Chaoqun Wang, Kai Tang, Ming Liu, Baoquan
Chen
- Abstract summary: We propose a general foreground-aware feature contrast (FAC++) framework to learn more effective point cloud representations in pre-training.
We prevent over-discrimination between 3D segments/objects and encourage grouped foreground-to-background distinctions.
We show that our contrast pairs capture clear correspondences among foreground regions during pre-training.
- Score: 38.34558139249363
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Contrastive learning has recently demonstrated great potential for
unsupervised pre-training in 3D scene understanding tasks. However, most
existing work randomly selects point features as anchors while building
contrast, leading to a clear bias toward background points that often dominate
in 3D scenes. Also, object awareness and foreground-to-background
discrimination are neglected, making contrastive learning less effective. To
tackle these issues, we propose a general foreground-aware feature contrast
(FAC++) framework to learn more effective point cloud representations in
pre-training. FAC++ consists of two novel contrast designs to construct more
effective and informative contrast pairs. The first is building positive pairs
within the same foreground segment where points tend to have the same
semantics. The second is that we prevent over-discrimination between 3D
segments/objects and encourage grouped foreground-to-background distinctions at
the segment level with adaptive feature learning in a Siamese correspondence
network, which adaptively learns feature correlations within and across point
cloud views effectively. Moreover, we design foreground-prompted regional
sampling to encourage more balanced foreground-aware learning; the complete
framework is termed FAC++. Visualization with point activation maps shows that our contrast
pairs capture clear correspondences among foreground regions during
pre-training. Quantitative experiments also show that FAC++ achieves superior
knowledge transfer and data efficiency in various downstream tasks, including
3D semantic segmentation, instance segmentation, and object detection. All
code, data, and models are available at:
https://github.com/KangchengLiu/FAC_Foreground_Aware_Contrast
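The segment-level contrast and foreground-prompted sampling described in the abstract can be sketched roughly as follows. This is an illustrative NumPy sketch under stated assumptions, not the authors' implementation: the function names, the `fg_ratio` parameter, and the per-segment mean pooling are hypothetical stand-ins, and the Siamese correspondence network with adaptive feature learning is not modeled here.

```python
import numpy as np

def balanced_region_sampling(is_foreground, n_samples, rng, fg_ratio=0.5):
    """Sample point indices with a fixed foreground/background share.

    A rough stand-in for foreground-prompted regional sampling: instead of
    uniform sampling (biased toward dominant background points), draw a fixed
    fraction of anchors from foreground points.
    """
    fg_idx = np.flatnonzero(is_foreground)
    bg_idx = np.flatnonzero(~is_foreground)
    n_fg = min(int(n_samples * fg_ratio), len(fg_idx))
    n_bg = n_samples - n_fg
    return np.concatenate([
        rng.choice(fg_idx, size=n_fg, replace=False),
        rng.choice(bg_idx, size=n_bg, replace=False),
    ])

def segment_contrast_loss(feats_a, feats_b, segment_ids, temperature=0.1):
    """Segment-level InfoNCE: the same segment seen in two augmented views
    forms a positive pair; different segments act as negatives, so the
    contrast is grouped at the segment level rather than per point."""
    segs = np.unique(segment_ids)
    # Mean-pool point features per segment (a crude grouping stand-in).
    za = np.stack([feats_a[segment_ids == s].mean(0) for s in segs])
    zb = np.stack([feats_b[segment_ids == s].mean(0) for s in segs])
    za /= np.linalg.norm(za, axis=1, keepdims=True)
    zb /= np.linalg.norm(zb, axis=1, keepdims=True)
    logits = za @ zb.T / temperature             # (S, S) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # diagonal = matching segments
```

In a real pipeline the pooled segment features would come from a 3D backbone over two views of the same scene; here random features merely exercise the loss shape.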
Related papers
- Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration [107.61458720202984]
  This paper introduces a novel self-supervised learning framework for enhancing 3D perception in autonomous driving scenes.
  We propose the learnable transformation alignment to bridge the domain gap between image and point cloud data.
  We establish dense 2D-3D correspondences to estimate the rigid pose.
  arXiv Detail & Related papers (2024-01-23T02:41:06Z)
- Cross-Modal Information-Guided Network using Contrastive Learning for Point Cloud Registration [17.420425069785946]
  We present a novel Cross-Modal Information-Guided Network (CMIGNet) for point cloud registration.
  We first incorporate the projected images from the point clouds and fuse the cross-modal features using the attention mechanism.
  We employ two contrastive learning strategies, namely overlapping contrastive learning and cross-modal contrastive learning.
  arXiv Detail & Related papers (2023-11-02T12:56:47Z)
- Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast [9.14535402695962]
  Geometry and color information provided by point clouds are crucial for 3D scene understanding.
  We propose a universal 3D scene pre-training framework via Geometry-Color Contrast (Point-GCC).
  Point-GCC aligns geometry and color information using a Siamese network.
  arXiv Detail & Related papers (2023-05-31T07:44:03Z)
- Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning [37.155772047656114]
  The Masked Scene Contrast (MSC) framework is capable of extracting comprehensive 3D representations more efficiently and effectively.
  MSC also enables large-scale 3D pre-training across multiple datasets.
  arXiv Detail & Related papers (2023-03-24T17:59:58Z)
- CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP [55.864132158596206]
  Contrastive Language-Image Pre-training (CLIP) achieves promising results in 2D zero-shot and few-shot learning.
  We make the first attempt to investigate how CLIP knowledge benefits 3D scene understanding.
  We propose CLIP2Scene, a framework that transfers CLIP knowledge from 2D image-text pre-trained models to a 3D point cloud network.
  arXiv Detail & Related papers (2023-01-12T10:42:39Z)
- CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding [2.8661021832561757]
  CrossPoint is a simple cross-modal contrastive learning approach to learn transferable 3D point cloud representations.
  Our approach outperforms the previous unsupervised learning methods on a diverse range of downstream tasks including 3D object classification and segmentation.
  arXiv Detail & Related papers (2022-03-01T18:59:01Z)
- Unsupervised Representation Learning for 3D Point Cloud Data [66.92077180228634]
  We propose a simple yet effective approach for unsupervised point cloud learning.
  In particular, we identify a very useful transformation which generates a good contrastive version of an original point cloud.
  We conduct experiments on three downstream tasks: 3D object classification, shape part segmentation, and scene segmentation.
  arXiv Detail & Related papers (2021-10-13T10:52:45Z)
- Object-aware Contrastive Learning for Debiased Scene Representation [74.30741492814327]
  We develop a novel object-aware contrastive learning framework that localizes objects in a self-supervised manner.
  We also introduce two data augmentations based on ContraCAM, object-aware random crop and background mixup, which reduce contextual and background biases during contrastive self-supervised learning.
  arXiv Detail & Related papers (2021-07-30T19:24:07Z)
- PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding [107.02479689909164]
  In this work, we aim at facilitating research on 3D representation learning.
  We measure the effect of unsupervised pre-training on a large source set of 3D scenes.
  arXiv Detail & Related papers (2020-07-21T17:59:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.