TetraSphere: A Neural Descriptor for O(3)-Invariant Point Cloud Analysis
- URL: http://arxiv.org/abs/2211.14456v6
- Date: Mon, 25 Mar 2024 17:58:59 GMT
- Title: TetraSphere: A Neural Descriptor for O(3)-Invariant Point Cloud Analysis
- Authors: Pavlo Melnyk, Andreas Robinson, Michael Felsberg, Mårten Wadenbäck
- Abstract summary: We present a learnable descriptor invariant under 3D rotations and reflections, i.e., the O(3) actions.
We propose an embedding of the 3D spherical neurons into 4D vector neurons, which leverages end-to-end training of the model.
Our results reveal the practical value of steerable 3D spherical neurons for learning in 3D Euclidean space.
- Score: 19.322295753674844
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many practical applications, 3D point cloud analysis requires rotation invariance. In this paper, we present a learnable descriptor invariant under 3D rotations and reflections, i.e., the O(3) actions, utilizing the recently introduced steerable 3D spherical neurons and vector neurons. Specifically, we propose an embedding of the 3D spherical neurons into 4D vector neurons, which leverages end-to-end training of the model. In our approach, we perform TetraTransform, an equivariant embedding of the 3D input into 4D constructed from the steerable neurons, and extract deeper O(3)-equivariant features using vector neurons. This integration of the TetraTransform into the VN-DGCNN framework, termed TetraSphere, increases the number of parameters by less than 0.0002%. TetraSphere sets a new state of the art in classifying randomly rotated real-world object scans from the challenging subsets of ScanObjectNN. Additionally, TetraSphere outperforms all equivariant methods on randomly rotated synthetic data: classifying objects from ModelNet40 and segmenting parts of the ShapeNet shapes. Thus, our results reveal the practical value of steerable 3D spherical neurons for learning in 3D Euclidean space. The code is available at https://github.com/pavlo-melnyk/tetrasphere.
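The abstract relies on two properties: features that are O(3)-equivariant (they rotate or reflect together with the input) and a final descriptor that is O(3)-invariant. The NumPy sketch below illustrates both on toy data. It is not the TetraSphere implementation: it omits the TetraTransform and the 4D embedding, and the helper names (`random_orthogonal`, `vn_linear`, `invariant_descriptor`) are hypothetical. It assumes a vector-neuron-style linear layer that mixes channels only, and it uses a Gram matrix as one generic way to obtain an invariant.

```python
import numpy as np


def random_orthogonal(rng, reflect=False):
    """Sample a 3x3 orthogonal matrix via QR; det is -1 if reflect, else +1."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))  # fix column signs of the QR factor
    if (np.linalg.det(q) < 0) != reflect:
        q[:, 0] = -q[:, 0]  # flipping one column flips the determinant sign
    return q


def vn_linear(weights, feats):
    """Channel-mixing linear map (assumed vector-neuron style).

    feats: (C, 3) array of 3D vector features, weights: (C_out, C).
    The 3D axis is never mixed, so the map commutes with any orthogonal Q.
    """
    return weights @ feats


def invariant_descriptor(feats):
    """Gram matrix of the channel vectors, shape (C, C).

    (F Q^T)(F Q^T)^T = F F^T for orthogonal Q, so it is O(3)-invariant.
    """
    return feats @ feats.T


rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 3))    # toy input: 4 channels of 3D vectors
weights = rng.standard_normal((5, 4))  # toy weights: 4 -> 5 channels
Q = random_orthogonal(rng, reflect=True)  # rotation combined with a reflection

out = vn_linear(weights, feats)
out_transformed = vn_linear(weights, feats @ Q.T)  # transform the input first

# Equivariance: transforming the input transforms the output the same way.
equivariant = np.allclose(out_transformed, out @ Q.T)
# Invariance: the Gram descriptor ignores the transformation entirely.
invariant = np.allclose(
    invariant_descriptor(out_transformed), invariant_descriptor(out)
)
print(equivariant, invariant)
```

Because `Q` here has determinant -1, the check covers reflections as well as rotations, which is the difference between O(3) and SO(3) invariance.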
Related papers
- 3D Geometry-aware Deformable Gaussian Splatting for Dynamic View Synthesis [49.352765055181436]
We propose a 3D geometry-aware deformable Gaussian Splatting method for dynamic view synthesis.
Our solution achieves 3D geometry-aware deformation modeling, which enables improved dynamic view synthesis and 3D dynamic reconstruction.
arXiv Detail & Related papers (2024-04-09T12:47:30Z)
- Parameterization-driven Neural Surface Reconstruction for Object-oriented Editing in Neural Rendering [35.69582529609475]
This paper introduces a novel neural algorithm for parameterizing neural implicit surfaces to simple parametric domains like spheres and polycubes.
It computes bi-directional deformation between the object and the domain using a forward mapping from the object's zero level set and an inverse deformation for backward mapping.
We demonstrate the method's effectiveness on images of human heads and man-made objects.
arXiv Detail & Related papers (2023-10-09T08:42:40Z)
- Self-supervised Learning of Rotation-invariant 3D Point Set Features using Transformer and its Self-distillation [3.1652399282742536]
This paper proposes a novel self-supervised learning framework for acquiring accurate and rotation-invariant 3D point set features at the object level.
We employ a self-attention mechanism to refine the tokens and aggregate them into an expressive rotation-invariant feature per 3D point set.
Our proposed algorithm learns rotation-invariant 3D point set features that are more accurate than those learned by existing algorithms.
arXiv Detail & Related papers (2023-08-09T06:03:07Z)
- ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes [55.689763519293464]
ConDor is a self-supervised method that learns to canonicalize the 3D orientation and position for full and partial 3D point clouds.
During inference, our method takes an unseen full or partial 3D point cloud at an arbitrary pose and outputs an equivariant canonical pose.
arXiv Detail & Related papers (2022-01-19T18:57:21Z)
- Spatially Invariant Unsupervised 3D Object Segmentation with Graph Neural Networks [23.729853358582506]
We propose a framework, SPAIR3D, to model a point cloud as a spatial mixture model.
We jointly learn the multiple-object representation and segmentation in 3D via Variational Autoencoders (VAE).
Experimental results demonstrate that SPAIR3D is capable of detecting and segmenting a variable number of objects without appearance information.
arXiv Detail & Related papers (2021-06-10T09:20:16Z)
- Fully Steerable 3D Spherical Neurons [14.86655504533083]
We propose a steerable feed-forward learning-based approach that consists of spherical decision surfaces and operates on point clouds.
Due to the inherent geometric 3D structure of our theory, we derive a 3D steerability constraint for its atomic parts.
We show how the model parameters are fully steerable at inference time.
arXiv Detail & Related papers (2021-06-02T16:30:02Z)
- Rotation-Invariant Autoencoders for Signals on Spheres [10.406659081400354]
We study the problem of unsupervised learning of rotation-invariant representations for spherical images.
In particular, we design an autoencoder architecture consisting of $S^2$ and $SO(3)$ convolutional layers.
Experiments on multiple datasets demonstrate the usefulness of the learned representations on clustering, retrieval and classification applications.
arXiv Detail & Related papers (2020-12-08T15:15:03Z)
- Spin-Weighted Spherical CNNs [58.013031812072356]
We present a new type of spherical CNN that allows anisotropic filters in an efficient way, without ever leaving the sphere domain.
The key idea is to consider spin-weighted spherical functions, which were introduced in physics in the study of gravitational waves.
Our method outperforms previous methods on tasks like classification of spherical images, classification of 3D shapes and semantic segmentation of spherical panoramas.
arXiv Detail & Related papers (2020-06-18T17:57:21Z)
- Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation [76.21696417873311]
We introduce a learnable module, cylindrical convolutional networks (CCNs), which exploits a cylindrical representation of a convolutional kernel defined in 3D space.
CCNs extract a view-specific feature through a view-specific convolutional kernel to predict object category scores at each viewpoint.
Our experiments demonstrate the effectiveness of the cylindrical convolutional networks on joint object detection and viewpoint estimation.
arXiv Detail & Related papers (2020-03-25T10:24:58Z)
- V4D: 4D Convolutional Neural Networks for Video-level Representation Learning [58.548331848942865]
Most 3D CNNs for video representation learning are clip-based, and thus do not consider video-temporal evolution of features.
We propose Video-level 4D Convolutional Neural Networks, or V4D, to model long-range representation with 4D convolutions.
V4D achieves excellent results, surpassing recent 3D CNNs by a large margin.
arXiv Detail & Related papers (2020-02-18T09:27:41Z)
- Quaternion Equivariant Capsule Networks for 3D Point Clouds [58.566467950463306]
We present a 3D capsule module for processing point clouds that is equivariant to 3D rotations and translations.
We connect dynamic routing between capsules to the well-known Weiszfeld algorithm.
Based on our operator, we build a capsule network that disentangles geometry from pose.
arXiv Detail & Related papers (2019-12-27T13:51:17Z)