Points2Vec: Unsupervised Object-level Feature Learning from Point Clouds
- URL: http://arxiv.org/abs/2102.04136v1
- Date: Mon, 8 Feb 2021 11:29:57 GMT
- Title: Points2Vec: Unsupervised Object-level Feature Learning from Point Clouds
- Authors: Joël Bachmann, Kenneth Blomqvist, Julian Förster, Roland Siegwart
- Abstract summary: Similar representation learning techniques have not yet become commonplace in the context of 3D vision.
We learn these vector representations by mining a dataset of scanned 3D spaces using an unsupervised algorithm.
We show that using our method to include context increases the ability of a clustering algorithm to distinguish different semantic classes from each other.
- Score: 25.988556827312483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised representation learning techniques, such as learning word
embeddings, have had a significant impact on the field of natural language
processing. Similar representation learning techniques have not yet become
commonplace in the context of 3D vision. This is despite the fact that physical
3D spaces have a semantic structure similar to that of bodies of text: words
are surrounded by words that are semantically related, just like objects are
surrounded by other objects that are similar in concept and usage.
In this work, we exploit this structure in learning semantically meaningful
low dimensional vector representations of objects. We learn these vector
representations by mining a dataset of scanned 3D spaces using an unsupervised
algorithm. We represent objects as point clouds, a flexible and general
representation for 3D data, which we encode into a vector representation. We
show that using our method to include context increases the ability of a
clustering algorithm to distinguish different semantic classes from each other.
Furthermore, we show that our algorithm produces continuous and meaningful
object embeddings through interpolation experiments.
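The training signal described above is analogous to word2vec's skip-gram objective, with objects that co-occur in the same scanned scene playing the role of context words. The following is a minimal PyTorch sketch of that idea, assuming a PointNet-style point cloud encoder and a negative-sampling loss; the architecture, loss, and hyperparameters are illustrative assumptions rather than the exact design used by Points2Vec.

```python
# Sketch only: objects from the same scene act as "context" for one another,
# and a point cloud encoder is trained so that embeddings of co-occurring
# objects score higher than embeddings of randomly drawn negatives.
# Encoder, loss, and sizes below are assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PointCloudEncoder(nn.Module):
    """PointNet-style encoder: per-point MLP followed by max pooling."""

    def __init__(self, embed_dim: int = 64):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (batch, num_points, 3) -> (batch, embed_dim)
        features = self.point_mlp(points)        # per-point features
        embedding = features.max(dim=1).values   # order-invariant pooling
        return F.normalize(embedding, dim=-1)    # unit-length embeddings


def context_loss(target_emb, context_emb, negative_emb):
    """Skip-gram-with-negative-sampling analogue over object embeddings.

    target_emb:   (B, D) embedding of the object of interest
    context_emb:  (B, D) embedding of an object from the same scene
    negative_emb: (B, K, D) embeddings of objects sampled from other scenes
    """
    pos_score = (target_emb * context_emb).sum(dim=-1)                # (B,)
    neg_score = torch.einsum("bd,bkd->bk", target_emb, negative_emb)  # (B, K)
    return -(F.logsigmoid(pos_score) + F.logsigmoid(-neg_score).sum(dim=-1)).mean()


if __name__ == "__main__":
    encoder = PointCloudEncoder(embed_dim=64)
    # Toy batch: 8 target objects, 8 context objects, 8 x 5 negatives,
    # each represented by 256 random 3D points.
    target = encoder(torch.randn(8, 256, 3))
    context = encoder(torch.randn(8, 256, 3))
    negatives = encoder(torch.randn(8 * 5, 256, 3)).view(8, 5, -1)
    loss = context_loss(target, context, negatives)
    loss.backward()
    print(f"context loss: {loss.item():.4f}")
```

Clustering the resulting embeddings (for instance with k-means) and linearly interpolating between two object embeddings correspond to the clustering and interpolation evaluations mentioned in the abstract.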
Related papers
- SUGAR: Pre-training 3D Visual Representations for Robotics [85.55534363501131]
We introduce a novel 3D pre-training framework for robotics named SUGAR.
SUGAR captures semantic, geometric and affordance properties of objects through 3D point clouds.
We show that SUGAR's 3D representation outperforms state-of-the-art 2D and 3D representations.
arXiv Detail & Related papers (2024-04-01T21:23:03Z)
- Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding [58.924180772480504]
3D visual grounding involves finding a target object in a 3D scene that corresponds to a given sentence query.
We propose to leverage weakly supervised annotations to learn the 3D visual grounding model.
We design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.
arXiv Detail & Related papers (2023-07-18T13:49:49Z)
- Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models [17.606199768716532]
We study open-world 3D scene understanding, a family of tasks that require agents to reason about their 3D environment with an open-set vocabulary and out-of-domain visual inputs.
We propose Semantic Abstraction (SemAbs), a framework that equips 2D Vision-Language Models with new 3D spatial capabilities.
We demonstrate the usefulness of SemAbs on two open-world 3D scene understanding tasks.
arXiv Detail & Related papers (2022-07-23T13:10:25Z)
- 3D Concept Grounding on Neural Fields [99.33215488324238]
Existing visual reasoning approaches typically utilize supervised methods to extract 2D segmentation masks on which concepts are grounded.
Humans are capable of grounding concepts on the underlying 3D representation of images.
We propose to leverage the continuous, differentiable nature of neural fields to segment and learn concepts.
arXiv Detail & Related papers (2022-07-13T17:59:33Z)
- Self-Supervised Learning of Object Parts for Semantic Segmentation [7.99536002595393]
We argue that self-supervised learning of object parts is a solution to this issue.
Our method surpasses the state-of-the-art on three semantic segmentation benchmarks by margins ranging from 3% to 17%.
arXiv Detail & Related papers (2022-04-27T17:55:17Z)
- Learning to Reconstruct and Segment 3D Objects [4.709764624933227]
We aim to understand scenes and the objects within them by learning general and robust representations using deep neural networks.
This thesis makes three core contributions from object-level 3D shape estimation from single or multiple views to scene-level semantic understanding.
arXiv Detail & Related papers (2020-10-19T15:09:04Z)
- Semantic Correspondence via 2D-3D-2D Cycle [58.023058561837686]
We propose a new method for predicting semantic correspondences by lifting the problem to the 3D domain.
We show that our method gives comparative and even superior results on standard semantic benchmarks.
arXiv Detail & Related papers (2020-04-20T05:27:45Z)
- Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds [109.0016923028653]
We learn point cloud representation by bidirectional reasoning between the local structures and the global shape without human supervision.
We show that our unsupervised model surpasses the state-of-the-art supervised methods on both synthetic and real-world 3D object classification datasets.
arXiv Detail & Related papers (2020-03-29T08:26:08Z)
- Human Correspondence Consensus for 3D Object Semantic Understanding [56.34297279246823]
In this paper, we introduce a new dataset named CorresPondenceNet.
Based on this dataset, we are able to learn dense semantic embeddings with a novel geodesic consistency loss.
We show that CorresPondenceNet could not only boost fine-grained understanding of heterogeneous objects but also cross-object registration and partial object matching.
arXiv Detail & Related papers (2019-12-29T04:24:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.