Bootstrap Your Own Correspondences
- URL: http://arxiv.org/abs/2106.00677v1
- Date: Tue, 1 Jun 2021 17:59:08 GMT
- Title: Bootstrap Your Own Correspondences
- Authors: Mohamed El Banani, Justin Johnson
- Abstract summary: BYOC is a self-supervised approach that learns visual and geometric features from RGB-D video without relying on ground-truth pose or correspondence.
We evaluate our approach on indoor scene datasets and find that our method outperforms traditional and learned descriptors.
- Score: 15.715143016999695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Geometric feature extraction is a crucial component of point cloud
registration pipelines. Recent work has demonstrated how supervised learning
can be leveraged to learn better and more compact 3D features. However, those
approaches' reliance on ground-truth annotation limits their scalability. We
propose BYOC: a self-supervised approach that learns visual and geometric
features from RGB-D video without relying on ground-truth pose or
correspondence. Our key observation is that randomly-initialized CNNs readily
provide us with good correspondences, allowing us to bootstrap the learning of
both visual and geometric features. Our approach combines classic ideas from
point cloud registration with more recent representation learning approaches.
We evaluate our approach on indoor scene datasets and find that our method
outperforms traditional and learned descriptors, while being competitive with
current state-of-the-art supervised approaches.
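The key observation above can be illustrated with a hedged toy sketch (not the authors' implementation; `random_features` and `mutual_nn_matches` are names we invented, and a random linear projection stands in for a randomly initialized CNN): features from a fixed random embedding already yield mostly correct mutual nearest-neighbor matches between two views of the same points, which is the signal BYOC bootstraps from.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_features(points, dim=32, W=None):
    """Embed points through a fixed random projection and L2-normalize.

    A stand-in for a randomly initialized network: no training involved.
    """
    if W is None:
        W = rng.standard_normal((points.shape[1], dim))
    feats = np.tanh(points @ W)  # random nonlinear embedding
    return feats / np.linalg.norm(feats, axis=1, keepdims=True), W

def mutual_nn_matches(f_src, f_tgt):
    """Return index pairs that are each other's nearest neighbor."""
    sim = f_src @ f_tgt.T            # cosine similarity (unit features)
    nn_tgt = sim.argmax(axis=1)      # best target for each source point
    nn_src = sim.argmax(axis=0)      # best source for each target point
    return [(i, j) for i, j in enumerate(nn_tgt) if nn_src[j] == i]

# Two "views" of the same cloud: the target is a lightly perturbed copy.
src = rng.standard_normal((100, 3))
tgt = src + rng.normal(scale=0.01, size=src.shape)
f_src, W = random_features(src)
f_tgt, _ = random_features(tgt, W=W)   # same random weights for both views
matches = mutual_nn_matches(f_src, f_tgt)
```

Even with untrained weights, most surviving mutual matches pair each point with its own perturbed copy; BYOC uses such pseudo-correspondences as the supervision signal for learning better features.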
Related papers
- GPr-Net: Geometric Prototypical Network for Point Cloud Few-Shot Learning [2.4366811507669115]
GPr-Net is a lightweight and computationally efficient geometric network that captures the prototypical topology of point clouds.
We show that GPr-Net outperforms state-of-the-art methods in few-shot learning on point clouds.
arXiv Detail & Related papers (2023-04-12T17:32:18Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
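The pretext task described above can be sketched as a heavily simplified toy (a known analytic surface instead of lidar scans; all names are ours, not ALSO's code): given sparse points sampled on a surface, occupancy labels for query points come for free from the geometry, giving the network a self-supervised reconstruction target.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sphere(n):
    """n points sampled uniformly on the unit sphere surface."""
    v = rng.standard_normal((n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def occupancy_labels(queries):
    """Self-supervised target: 1 inside the surface, 0 outside."""
    return (np.linalg.norm(queries, axis=1) < 1.0).astype(int)

sparse_input = sample_sphere(64)                  # what the network would see
queries = rng.uniform(-1.5, 1.5, size=(256, 3))   # where occupancy is queried
labels = occupancy_labels(queries)                # supervision at no labeling cost
```

A backbone trained to predict `labels` from `sparse_input` must implicitly model the underlying surface, which is the intuition the paper builds on.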
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- PointCaM: Cut-and-Mix for Open-Set Point Cloud Learning [72.07350827773442]
We propose to solve open-set point cloud learning using a novel Point Cut-and-Mix mechanism.
We use the Unknown-Point Simulator to simulate out-of-distribution data in the training stage.
The Unknown-Point Estimator module learns to exploit the point cloud's feature context for discriminating the known and unknown data.
arXiv Detail & Related papers (2022-12-05T03:53:51Z)
- Self-Supervised Visual Place Recognition by Mining Temporal and Feature Neighborhoods [17.852415436033436]
We propose a novel framework named TF-VPR that uses temporal neighborhoods and learnable feature neighborhoods to discover unknown spatial neighborhoods.
Our method follows an iterative training paradigm which alternates between: (1) representation learning with data augmentation, (2) positive set expansion to include the current feature space neighbors, and (3) positive set contraction via geometric verification.
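The three-step loop can be sketched as below; this is a toy illustration, not TF-VPR's code: a temporal-window check stands in for geometric verification, random vectors stand in for learned features, and all function names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 50, 8
feats = rng.standard_normal((T, D))   # stand-in for learned frame features

def expand_positives(feats, k=5):
    """Step (2): k nearest feature-space neighbors for each frame."""
    d = np.linalg.norm(feats[:, None] - feats[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)       # a frame is not its own positive
    return [set(np.argsort(d[i])[:k]) for i in range(len(feats))]

def contract_positives(pos_sets, window=3):
    """Step (3): keep only candidates passing a (toy) geometric check."""
    return [{j for j in s if abs(j - i) <= window}
            for i, s in enumerate(pos_sets)]

for _ in range(3):                    # step (1) would retrain features here
    pos = contract_positives(expand_positives(feats))
    # a real system would now fine-tune `feats` on the verified positives
```

The alternation matters: expansion keeps the positive sets from collapsing to trivial temporal neighbors, while contraction filters out feature-space false positives.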
arXiv Detail & Related papers (2022-08-19T12:59:46Z)
- Semantic keypoint-based pose estimation from single RGB frames [64.80395521735463]
We present an approach to estimating the continuous 6-DoF pose of an object from a single RGB image.
The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model.
We show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios.
arXiv Detail & Related papers (2022-04-12T15:03:51Z)
- Unsupervised Representation Learning for 3D Point Cloud Data [66.92077180228634]
We propose a simple yet effective approach for unsupervised point cloud learning.
In particular, we identify a very useful transformation which generates a good contrastive version of an original point cloud.
We conduct experiments on three downstream tasks which are 3D object classification, shape part segmentation and scene segmentation.
arXiv Detail & Related papers (2021-10-13T10:52:45Z)
- UPDesc: Unsupervised Point Descriptor Learning for Robust Registration [54.95201961399334]
UPDesc is an unsupervised method to learn point descriptors for robust point cloud registration.
We show that our learned descriptors yield superior performance over existing unsupervised methods.
arXiv Detail & Related papers (2021-08-05T17:11:08Z)
- Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations [78.12377360145078]
Contrastive self-supervised learning has outperformed supervised pretraining on many downstream tasks like segmentation and object detection.
In this paper, we first study how biases in the dataset affect existing methods.
We show that current contrastive approaches work surprisingly well across: (i) object- versus scene-centric, (ii) uniform versus long-tailed and (iii) general versus domain-specific datasets.
arXiv Detail & Related papers (2021-06-10T17:59:13Z)
- Semantic Graph Based Place Recognition for 3D Point Clouds [22.608115489674653]
This paper presents a novel semantic graph based approach for place recognition.
First, we propose a novel semantic graph representation for the point cloud scenes.
We then design a fast and effective graph similarity network to compute the similarity.
arXiv Detail & Related papers (2020-08-26T09:27:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.