Zero-shot point cloud segmentation by transferring geometric primitives
- URL: http://arxiv.org/abs/2210.09923v3
- Date: Fri, 29 Sep 2023 04:03:28 GMT
- Title: Zero-shot point cloud segmentation by transferring geometric primitives
- Authors: Runnan Chen, Xinge Zhu, Nenglun Chen, Wei Li, Yuexin Ma, Ruigang Yang,
Wenping Wang
- Abstract summary: We investigate zero-shot point cloud semantic segmentation, where the network is trained on seen objects and is able to segment unseen objects.
We propose a novel framework that learns the geometric primitives shared by the objects of seen and unseen categories, and performs a fine-grained alignment between language and the learned geometric primitives.
- Score: 68.18710039217336
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We investigate transductive zero-shot point cloud semantic segmentation, where the network is trained on seen objects and is able to segment unseen objects. 3D geometric elements are essential cues for inferring a novel 3D object type. However, previous methods neglect the fine-grained relationship between language and these 3D geometric elements. To this end, we propose a novel framework that learns the geometric primitives shared by the objects of seen and unseen categories, and performs a fine-grained alignment between language and the learned geometric primitives. Guided by language, the network can therefore recognize novel objects represented with geometric primitives. Specifically, we formulate a novel point visual representation, the similarity vector of a point's feature to learnable prototypes, where the prototypes automatically encode geometric primitives via back-propagation. In addition, we propose a novel Unknown-aware InfoNCE loss that aligns the visual representation with language at a fine-grained level. Extensive experiments show that our method significantly outperforms other state-of-the-art methods in harmonic mean intersection-over-union (hIoU), with improvements of 17.8%, 30.4%, 9.2% and 7.9% on the S3DIS, ScanNet, SemanticKITTI and nuScenes datasets, respectively. Code is available at https://github.com/runnanchen/Zero-Shot-Point-Cloud-Segmentation.
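For concreteness, the two ingredients of the method described above can be pictured in a short PyTorch-style sketch. Everything below is illustrative: the prototype count, feature sizes, temperature, and the exact "unknown-aware" treatment are assumptions rather than the paper's precise formulation.

```python
import torch
import torch.nn.functional as F

# Minimal sketch: point features are described by their similarity to
# learnable prototypes, and the resulting representation is aligned with
# language embeddings via an InfoNCE-style loss. All sizes, the temperature,
# and the handling of unseen classes are illustrative assumptions.

num_prototypes, feat_dim, num_classes = 64, 128, 20
prototypes = torch.nn.Parameter(torch.randn(num_prototypes, feat_dim))

def visual_representation(point_feats):
    """Similarity vector of each point's feature to the learnable prototypes."""
    p = F.normalize(point_feats, dim=-1)          # (N, feat_dim)
    c = F.normalize(prototypes, dim=-1)           # (K, feat_dim)
    return p @ c.t()                              # (N, K) similarity vectors

def unknown_aware_infonce(sim_vectors, text_embeds, labels, seen_mask, tau=0.07):
    """InfoNCE over class text embeddings. As one plausible reading of
    'unknown-aware', unseen-class logits stay in the denominator so unknown
    points are pushed away from seen-class language anchors."""
    z = F.normalize(sim_vectors, dim=-1)          # (N, K)
    t = F.normalize(text_embeds, dim=-1)          # (C, K) text projected to K dims
    logits = z @ t.t() / tau                      # (N, C)
    return F.cross_entropy(logits[seen_mask], labels[seen_mask])

# Toy usage
feats = torch.randn(1024, feat_dim)
text = torch.randn(num_classes, num_prototypes)
labels = torch.randint(0, num_classes, (1024,))
seen = labels < 15                                # pretend classes 15+ are unseen
loss = unknown_aware_infonce(visual_representation(feats), text, labels, seen)
loss.backward()                                   # prototypes receive gradients
```

The key point of the sketch is that the prototypes are ordinary learnable parameters, so back-propagating the alignment loss is what shapes them into geometric primitives.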
Related papers
- SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z)
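As a rough illustration of SAI3D's progressive-merging idea (not its actual algorithm), geometric primitives can be merged greedily whenever a pairwise affinity exceeds a threshold, e.g. via union-find:

```python
import numpy as np

# Toy sketch of progressive region merging: primitives are merged greedily
# when a pairwise affinity score exceeds a threshold. The affinity
# definition and schedule here are illustrative assumptions.

def merge_primitives(affinity, threshold=0.5):
    """affinity: (P, P) symmetric scores between geometric primitives.
    Returns an instance label per primitive via union-find."""
    parent = list(range(len(affinity)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Visit pairs from strongest to weakest affinity.
    pairs = [(affinity[i, j], i, j)
             for i in range(len(affinity)) for j in range(i + 1, len(affinity))]
    for score, i, j in sorted(pairs, reverse=True):
        if score >= threshold:
            parent[find(i)] = find(j)
    return np.array([find(i) for i in range(len(affinity))])

aff = np.array([[1.0, 0.9, 0.1],
                [0.9, 1.0, 0.2],
                [0.1, 0.2, 1.0]])
print(merge_primitives(aff))  # primitives 0 and 1 merge; 2 stays separate
```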
- Geometrically-driven Aggregation for Zero-shot 3D Point Cloud Understanding [11.416392706435415]
Zero-shot 3D point cloud understanding can be achieved via 2D Vision-Language Models (VLMs).
Existing strategies directly map Vision-Language Model features from the 2D pixels of rendered or captured views to 3D points, overlooking the inherent and expressible geometric structure of the point cloud.
We introduce the first training-free aggregation technique that leverages the point cloud's 3D geometric structure to improve the quality of the transferred Vision-Language Model features.
arXiv Detail & Related papers (2023-12-04T12:30:07Z)
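A minimal sketch of this training-free idea, assuming a simple k-nearest-neighbor scheme as a stand-in for the paper's geometrically-driven aggregation: per-point VLM features lifted from 2D views are smoothed over each point's geometric neighborhood.

```python
import numpy as np

# Sketch: training-free smoothing of per-point VLM features using the point
# cloud's own geometry. k-NN averaging is an illustrative stand-in for the
# paper's aggregation scheme.

def aggregate_vlm_features(points, feats, k=8):
    """points: (N, 3), feats: (N, D) features lifted from 2D VLM predictions.
    Returns features averaged over each point's k nearest neighbors."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # (N, N)
    knn = np.argsort(d2, axis=1)[:, :k]                            # includes self
    return feats[knn].mean(axis=1)                                 # (N, D)

pts = np.random.rand(100, 3)
f = np.random.rand(100, 512)
smoothed = aggregate_vlm_features(pts, f)
```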
- QuadricsNet: Learning Concise Representation for Geometric Primitives in Point Clouds [39.600071233251704]
This paper presents a novel framework to learn a concise geometric primitive representation for 3D point clouds.
We employ quadrics to represent diverse primitives with only 10 parameters.
We propose the first end-to-end learning-based framework, namely QuadricsNet, to parse quadrics in point clouds.
arXiv Detail & Related papers (2023-09-25T15:18:08Z)
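The 10-parameter count follows from the quadric representation itself: a quadric is the zero set of x_h^T Q x_h = 0 for a symmetric 4x4 matrix Q, which has 10 independent entries. A small sketch of that representation (independent of QuadricsNet's architecture):

```python
import numpy as np

# A quadric surface is {x : x_h^T Q x_h = 0} for a symmetric 4x4 matrix Q,
# where x_h = (x, y, z, 1). Symmetry leaves 10 independent parameters,
# matching the 10-parameter representation mentioned above.

def quadric_from_params(q10):
    """Build the symmetric 4x4 quadric matrix from its 10 upper-triangular
    entries (an illustrative parameterization)."""
    Q = np.zeros((4, 4))
    Q[np.triu_indices(4)] = q10
    return Q + Q.T - np.diag(np.diag(Q))

def quadric_residual(points, Q):
    """Algebraic residual |x_h^T Q x_h| per point; 0 means on-surface."""
    xh = np.hstack([points, np.ones((len(points), 1))])  # (N, 4)
    return np.abs(np.einsum('ni,ij,nj->n', xh, Q, xh))

# Unit sphere: x^2 + y^2 + z^2 - 1 = 0
Q = quadric_from_params([1, 0, 0, 0, 1, 0, 0, 1, 0, -1])
print(quadric_residual(np.array([[1.0, 0, 0], [0, 0, 0]]), Q))  # [0., 1.]
```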
- Generalized Few-Shot Point Cloud Segmentation Via Geometric Words [54.32239996417363]
Few-shot point cloud segmentation algorithms learn to adapt to new classes at the cost of segmentation accuracy on the base classes.
We present the first attempt at a more practical paradigm: generalized few-shot point cloud segmentation.
We propose geometric words to represent the geometric components shared by the base and novel classes, and incorporate them into a novel geometric-aware semantic representation.
arXiv Detail & Related papers (2023-09-20T11:24:33Z)
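A minimal sketch of the "geometric words" idea, with an assumed fusion by concatenation (the paper's geometric-aware semantic representation is likely richer):

```python
import numpy as np

# Sketch: per-point features are described by similarities to a shared
# geometric vocabulary, then combined with class semantics. The fusion
# below (concatenation) is an illustrative assumption.

def geometric_aware_representation(point_feats, geo_words, class_embed):
    """point_feats: (N, D); geo_words: (W, D) shared vocabulary;
    class_embed: (E,) semantic embedding broadcast to every point."""
    norm = lambda a: a / (np.linalg.norm(a, axis=-1, keepdims=True) + 1e-8)
    sims = norm(point_feats) @ norm(geo_words).T            # (N, W)
    sem = np.broadcast_to(class_embed, (len(point_feats), len(class_embed)))
    return np.concatenate([sims, sem], axis=1)              # (N, W + E)
```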
- PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking? [62.997667081978825]
We encode 3D detections as nodes in a graph, where spatial and temporal pairwise relations among objects are encoded via localized polar coordinates on graph edges.
This allows our graph neural network to learn to effectively encode temporal and spatial interactions.
We establish a new state-of-the-art on the nuScenes dataset and, more importantly, show that our method, PolarMOT, generalizes remarkably well across different locations.
arXiv Detail & Related papers (2022-08-03T10:06:56Z)
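The edge encoding can be pictured as follows; the feature choice (range plus relative bearing in the source object's local frame) is an illustrative assumption, and the paper's exact parameterization may include additional terms.

```python
import numpy as np

# Sketch: encode the pairwise relation between two 3D detections as localized
# polar coordinates (range, bearing relative to the source object's heading).

def polar_edge_feature(src_xy, src_yaw, dst_xy):
    """Relative position of dst in src's local polar frame."""
    delta = np.asarray(dst_xy) - np.asarray(src_xy)
    rng = np.linalg.norm(delta)
    bearing = np.arctan2(delta[1], delta[0]) - src_yaw  # relative angle
    bearing = (bearing + np.pi) % (2 * np.pi) - np.pi   # wrap to [-pi, pi)
    return np.array([rng, np.cos(bearing), np.sin(bearing)])

print(polar_edge_feature([0, 0], np.pi / 2, [0, 5]))  # range 5, bearing 0
```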
- Fitting and recognition of geometric primitives in segmented 3D point clouds using a localized voting procedure [1.8352113484137629]
We introduce a novel technique for processing point clouds that, through a voting procedure, provides an initial estimate of the parameters of each primitive type.
Using these estimates, we localize the search for the optimal solution in a dimensionally-reduced space, making it efficient to extend the Hough transform (HT) to more primitives than those generally found in the literature.
arXiv Detail & Related papers (2022-05-30T20:47:43Z)
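A toy illustration of a Hough-style voting step (far simpler than the paper's localized procedure): given a candidate sphere center, each point votes for a radius bin, and the histogram peak gives the initial parameter estimate that a localized search would then refine.

```python
import numpy as np

# Toy Hough-transform voting: each point votes for a radius bin given a
# candidate sphere center; the peak bin is the initial estimate.
# Bin width and range are illustrative assumptions.

def vote_sphere_radius(points, center, bin_width=0.05, r_max=2.0):
    radii = np.linalg.norm(points - center, axis=1)
    bins = np.arange(0.0, r_max + bin_width, bin_width)
    hist, edges = np.histogram(radii, bins=bins)
    peak = np.argmax(hist)
    return 0.5 * (edges[peak] + edges[peak + 1])  # center of winning bin

# Noisy points on a unit sphere
rng = np.random.default_rng(0)
dirs = rng.normal(size=(500, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
pts = dirs * (1.0 + rng.normal(scale=0.01, size=(500, 1)))
print(vote_sphere_radius(pts, np.zeros(3)))  # approximately 1.0
```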
- Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud [50.56461318879761]
We propose the Geometry-Disentangled Attention Network (GDANet) for 3D point cloud processing.
GDANet disentangles point clouds into the contour and flat parts of 3D objects, denoted by sharp and gentle variation components respectively.
Experiments on 3D object classification and segmentation benchmarks demonstrate that GDANet achieves state-of-the-art results with fewer parameters.
arXiv Detail & Related papers (2020-12-20T13:35:00Z)
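One simple way to picture the sharp/gentle split, using a hand-crafted local variation score as a stand-in for GDANet's learned disentanglement:

```python
import numpy as np

# Sketch: split a point cloud into 'sharp variation' (contour-like) and
# 'gentle variation' (flat) subsets via a local geometric variation score.
# The score and split ratio are illustrative assumptions.

def variation_score(points, k=8):
    """Surface variation: ratio of the smallest local PCA eigenvalue to the
    eigenvalue sum; high values indicate sharp, contour-like regions."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    knn = np.argsort(d2, axis=1)[:, :k]
    scores = []
    for idx in knn:
        nb = points[idx] - points[idx].mean(0)
        ev = np.linalg.eigvalsh(nb.T @ nb)        # ascending eigenvalues
        scores.append(ev[0] / (ev.sum() + 1e-8))
    return np.array(scores)

def disentangle(points, ratio=0.25, k=8):
    order = np.argsort(variation_score(points, k))
    m = int(len(points) * ratio)
    return points[order[-m:]], points[order[:-m]]  # (sharp, gentle)
```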
- Deep Geometric Texture Synthesis [83.9404865744028]
We propose a novel framework for synthesizing geometric textures.
It learns texture statistics from local neighborhoods of a single reference 3D model.
Our network displaces mesh vertices in any direction, enabling synthesis of geometric textures.
arXiv Detail & Related papers (2020-06-30T19:36:38Z)
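A minimal sketch of the displacement mechanism (not the paper's network): a per-vertex MLP maps local neighborhood features to unconstrained 3D offsets, so vertices can move in any direction.

```python
import torch

# Sketch: a per-vertex MLP predicting free-direction displacements, the
# basic mechanism behind synthesizing geometric texture by moving vertices.
# Feature dimensions and architecture are illustrative assumptions.

displace = torch.nn.Sequential(
    torch.nn.Linear(64, 128), torch.nn.ReLU(),
    torch.nn.Linear(128, 3),          # unconstrained (dx, dy, dz) per vertex
)

verts = torch.rand(1000, 3)           # mesh vertex positions
local_feats = torch.rand(1000, 64)    # per-vertex neighborhood descriptors
new_verts = verts + displace(local_feats)
```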