Spherical Transformer: Adapting Spherical Signal to CNNs
- URL: http://arxiv.org/abs/2101.03848v2
- Date: Sun, 24 Jan 2021 10:57:25 GMT
- Title: Spherical Transformer: Adapting Spherical Signal to CNNs
- Authors: Haikuan Du and Hui Cao and Shen Cai and Junchi Yan and Siyu Zhang
- Abstract summary: Spherical Transformer can transform spherical signals into vectors that can be directly processed by standard CNNs.
We evaluate our approach on the tasks of spherical MNIST recognition, 3D object classification and omnidirectional image semantic segmentation.
- Score: 53.18482213611481
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural networks (CNNs) have been widely used in various vision
tasks, e.g. image classification, semantic segmentation, etc. Unfortunately,
standard 2D CNNs are not well suited for spherical signals such as panorama
images or spherical projections, as the sphere is an unstructured grid. In this
paper, we present Spherical Transformer which can transform spherical signals
into vectors that can be directly processed by standard CNNs such that many
well-designed CNNs architectures can be reused across tasks and datasets by
pretraining. To this end, the proposed method first uses locally structured
sampling methods such as HEALPix to construct a transformer grid by using the
information of spherical points and its adjacent points, and then transforms
the spherical signals to the vectors through the grid. By building the
Spherical Transformer module, we can use multiple CNN architectures directly.
We evaluate our approach on the tasks of spherical MNIST recognition, 3D object
classification and omnidirectional image semantic segmentation. For 3D object
classification, we further propose a rendering-based projection method to
improve the performance and a rotational-equivariant model to improve the
anti-rotation ability. Experimental results on three tasks show that our
approach achieves superior performance over state-of-the-art methods.
Related papers
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Priors Distillation (RPD) method to extract priors from the well-trained transformers on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z) - MeT: A Graph Transformer for Semantic Segmentation of 3D Meshes [10.667492516216887]
We propose a transformer-based method for semantic segmentation of 3D mesh.
We perform positional encoding by means of the Laplacian eigenvectors of the adjacency matrix.
We show how the proposed approach yields state-of-the-art performance on semantic segmentation of 3D meshes.
arXiv Detail & Related papers (2023-07-03T15:45:14Z) - SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and
Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences.
It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping.
Our system achieves state-of-the-art semantic mapping quality within 2D-3D networks-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z) - Implicit Ray-Transformers for Multi-view Remote Sensing Image
Segmentation [26.726658200149544]
We propose ''Implicit Ray-Transformer (IRT)'' based on Implicit Neural Representation (INR) for RS scene semantic segmentation with sparse labels.
The proposed method includes a two-stage learning process. In the first stage, we optimize a neural field to encode the color and 3D structure of the remote sensing scene.
In the second stage, we design a Ray Transformer to leverage the relations between the neural field 3D features and 2D texture features for learning better semantic representations.
arXiv Detail & Related papers (2023-03-15T07:05:07Z) - CpT: Convolutional Point Transformer for 3D Point Cloud Processing [10.389972581905]
We present CpT: Convolutional point Transformer - a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data.
CpT is an improvement over existing attention-based Convolutions Neural Networks as well as previous 3D point cloud processing transformers.
Our model can serve as an effective backbone for various point cloud processing tasks when compared to the existing state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-21T17:45:55Z) - Concentric Spherical GNN for 3D Representation Learning [53.45704095146161]
We propose a novel multi-resolution convolutional architecture for learning over concentric spherical feature maps.
Our hierarchical architecture is based on alternatively learning to incorporate both intra-sphere and inter-sphere information.
We demonstrate the effectiveness of our approach in improving state-of-the-art performance on 3D classification tasks with rotated data.
arXiv Detail & Related papers (2021-03-18T19:05:04Z) - Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications.
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv)
Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z) - Cylindrical Convolutional Networks for Joint Object Detection and
Viewpoint Estimation [76.21696417873311]
We introduce a learnable module, cylindrical convolutional networks (CCNs), that exploit cylindrical representation of a convolutional kernel defined in the 3D space.
CCNs extract a view-specific feature through a view-specific convolutional kernel to predict object category scores at each viewpoint.
Our experiments demonstrate the effectiveness of the cylindrical convolutional networks on joint object detection and viewpoint estimation.
arXiv Detail & Related papers (2020-03-25T10:24:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.