Points2Vec: Unsupervised Object-level Feature Learning from Point Clouds
- URL: http://arxiv.org/abs/2102.04136v1
- Date: Mon, 8 Feb 2021 11:29:57 GMT
- Title: Points2Vec: Unsupervised Object-level Feature Learning from Point Clouds
- Authors: Joël Bachmann, Kenneth Blomqvist, Julian Förster, Roland Siegwart
- Abstract summary: Similar representation learning techniques have not yet become commonplace in the context of 3D vision.
We learn these vector representations by mining a dataset of scanned 3D spaces using an unsupervised algorithm.
We show that using our method to include context increases the ability of a clustering algorithm to distinguish different semantic classes from each other.
- Score: 25.988556827312483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised representation learning techniques, such as learning word
embeddings, have had a significant impact on the field of natural language
processing. Similar representation learning techniques have not yet become
commonplace in the context of 3D vision. This is despite the fact that physical
3D spaces have a semantic structure similar to that of bodies of text: words
are surrounded by words that are semantically related, just like objects are
surrounded by other objects that are similar in concept and usage.
In this work, we exploit this structure in learning semantically meaningful
low dimensional vector representations of objects. We learn these vector
representations by mining a dataset of scanned 3D spaces using an unsupervised
algorithm. We represent objects as point clouds, a flexible and general
representation for 3D data, which we encode into a vector representation. We
show that using our method to include context increases the ability of a
clustering algorithm to distinguish different semantic classes from each other.
Furthermore, we show that our algorithm produces continuous and meaningful
object embeddings through interpolation experiments.
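The training signal described above is analogous to word2vec's skip-gram objective, with objects that co-occur in the same scanned scene playing the role of context words. The following is a minimal PyTorch sketch of that idea, assuming a PointNet-style point cloud encoder and a negative-sampling loss; the architecture, loss, and hyperparameters are illustrative assumptions rather than the exact design used by Points2Vec.

```python
# Sketch only: objects from the same scene act as "context" for one another,
# and a point cloud encoder is trained so that embeddings of co-occurring
# objects score higher than embeddings of randomly drawn negatives.
# Encoder, loss, and sizes below are assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PointCloudEncoder(nn.Module):
    """PointNet-style encoder: per-point MLP followed by max pooling."""

    def __init__(self, embed_dim: int = 64):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (batch, num_points, 3) -> (batch, embed_dim)
        features = self.point_mlp(points)        # per-point features
        embedding = features.max(dim=1).values   # order-invariant pooling
        return F.normalize(embedding, dim=-1)    # unit-length embeddings


def context_loss(target_emb, context_emb, negative_emb):
    """Skip-gram-with-negative-sampling analogue over object embeddings.

    target_emb:   (B, D) embedding of the object of interest
    context_emb:  (B, D) embedding of an object from the same scene
    negative_emb: (B, K, D) embeddings of objects sampled from other scenes
    """
    pos_score = (target_emb * context_emb).sum(dim=-1)                # (B,)
    neg_score = torch.einsum("bd,bkd->bk", target_emb, negative_emb)  # (B, K)
    return -(F.logsigmoid(pos_score) + F.logsigmoid(-neg_score).sum(dim=-1)).mean()


if __name__ == "__main__":
    encoder = PointCloudEncoder(embed_dim=64)
    # Toy batch: 8 target objects, 8 context objects, 8 x 5 negatives,
    # each represented by 256 random 3D points.
    target = encoder(torch.randn(8, 256, 3))
    context = encoder(torch.randn(8, 256, 3))
    negatives = encoder(torch.randn(8 * 5, 256, 3)).view(8, 5, -1)
    loss = context_loss(target, context, negatives)
    loss.backward()
    print(f"context loss: {loss.item():.4f}")
```

Clustering the resulting embeddings (for instance with k-means) and linearly interpolating between two object embeddings correspond to the clustering and interpolation evaluations mentioned in the abstract.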
Related papers
- SUGAR: Pre-training 3D Visual Representations for Robotics [85.55534363501131]
We introduce a novel 3D pre-training framework for robotics named SUGAR.
SUGAR captures semantic, geometric and affordance properties of objects through 3D point clouds.
We show that SUGAR's 3D representation outperforms state-of-the-art 2D and 3D representations.
arXiv Detail & Related papers (2024-04-01T21:23:03Z)
- Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding [58.924180772480504]
3D visual grounding involves finding a target object in a 3D scene that corresponds to a given sentence query.
We propose to leverage weakly supervised annotations to learn the 3D visual grounding model.
We design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.
arXiv Detail & Related papers (2023-07-18T13:49:49Z)
- Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models [17.606199768716532]
We study open-world 3D scene understanding, a family of tasks that require agents to reason about their 3D environment with an open-set vocabulary and out-of-domain visual inputs.
We propose Semantic Abstraction (SemAbs), a framework that equips 2D Vision-Language Models with new 3D spatial capabilities.
We demonstrate the usefulness of SemAbs on two open-world 3D scene understanding tasks.
arXiv Detail & Related papers (2022-07-23T13:10:25Z)
- 3D Concept Grounding on Neural Fields [99.33215488324238]
Existing visual reasoning approaches typically utilize supervised methods to extract 2D segmentation masks on which concepts are grounded.
Humans are capable of grounding concepts on the underlying 3D representation of images.
We propose to leverage the continuous, differentiable nature of neural fields to segment and learn concepts.
arXiv Detail & Related papers (2022-07-13T17:59:33Z)
- Self-Supervised Learning of Object Parts for Semantic Segmentation [7.99536002595393]
We argue that self-supervised learning of object parts is a solution to this issue.
Our method surpasses the state-of-the-art on three semantic segmentation benchmarks by margins ranging from 3% to 17%.
arXiv Detail & Related papers (2022-04-27T17:55:17Z)
- Learning to Reconstruct and Segment 3D Objects [4.709764624933227]
We aim to understand scenes and the objects within them by learning general and robust representations using deep neural networks.
This thesis makes three core contributions from object-level 3D shape estimation from single or multiple views to scene-level semantic understanding.
arXiv Detail & Related papers (2020-10-19T15:09:04Z)
- Semantic Correspondence via 2D-3D-2D Cycle [58.023058561837686]
We propose a new method for predicting semantic correspondences by lifting the problem to the 3D domain.
We show that our method gives comparative and even superior results on standard semantic benchmarks.
arXiv Detail & Related papers (2020-04-20T05:27:45Z)
- Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds [109.0016923028653]
We learn point cloud representation by bidirectional reasoning between the local structures and the global shape without human supervision.
We show that our unsupervised model surpasses the state-of-the-art supervised methods on both synthetic and real-world 3D object classification datasets.
arXiv Detail & Related papers (2020-03-29T08:26:08Z)
- Human Correspondence Consensus for 3D Object Semantic Understanding [56.34297279246823]
In this paper, we introduce a new dataset named CorresPondenceNet.
Based on this dataset, we are able to learn dense semantic embeddings with a novel geodesic consistency loss.
We show that CorresPondenceNet could not only boost fine-grained understanding of heterogeneous objects but also cross-object registration and partial object matching.
arXiv Detail & Related papers (2019-12-29T04:24:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.