Related papers: Spherical Transformer: Adapting Spherical Signal to CNNs

URL: http://arxiv.org/abs/2101.03848v2
Date: Sun, 24 Jan 2021 10:57:25 GMT
Title: Spherical Transformer: Adapting Spherical Signal to CNNs
Authors: Haikuan Du and Hui Cao and Shen Cai and Junchi Yan and Siyu Zhang
Abstract summary: Spherical Transformer can transform spherical signals into vectors that can be directly processed by standard CNNs. We evaluate our approach on the tasks of spherical MNIST recognition, 3D object classification and omnidirectional image semantic segmentation.
Score: 53.18482213611481
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Convolutional neural networks (CNNs) have been widely used in various vision tasks, e.g., image classification and semantic segmentation. Unfortunately, standard 2D CNNs are not well suited for spherical signals such as panorama images or spherical projections, as the sphere is an unstructured grid. In this paper, we present Spherical Transformer, which can transform spherical signals into vectors that can be directly processed by standard CNNs, such that many well-designed CNN architectures can be reused across tasks and datasets by pretraining. To this end, the proposed method first uses locally structured sampling methods such as HEALPix to construct a transformer grid by using the information of each spherical point and its adjacent points, and then transforms the spherical signals to the vectors through the grid. By building the Spherical Transformer module, we can use multiple CNN architectures directly. We evaluate our approach on the tasks of spherical MNIST recognition, 3D object classification and omnidirectional image semantic segmentation. For 3D object classification, we further propose a rendering-based projection method to improve the performance and a rotational-equivariant model to improve the anti-rotation ability. Experimental results on three tasks show that our approach achieves superior performance over state-of-the-art methods.
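Illustrative sketch (not from the paper): the abstract describes building a grid from each spherical point and its adjacent points, then passing the signal through that grid to obtain vectors a standard CNN can consume. The Python below shows that general idea only. It is not the authors' implementation: it swaps HEALPix for a Fibonacci lattice so that it runs with the standard library alone, and the names fibonacci_sphere, build_gather_grid and apply_grid, as well as the choice of 8 neighbors, are hypothetical.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly uniform unit vectors (a stand-in for HEALPix pixel centers)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points

def build_gather_grid(points, k):
    """For each point, list its own index followed by its k nearest neighbors."""
    grid = []
    for i, p in enumerate(points):
        # A larger dot product means a smaller angular distance on the unit sphere.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: -sum(a * b for a, b in zip(p, points[j])),
        )
        grid.append([i] + others[:k])
    return grid

def apply_grid(signal, grid):
    """Gather the signal through the grid into one vector of length N * (k + 1)."""
    return [signal[j] for row in grid for j in row]

points = fibonacci_sphere(48)
grid = build_gather_grid(points, k=8)
signal = [p[2] for p in points]  # toy signal: the z coordinate of each point
flat = apply_grid(signal, grid)
print(len(flat))  # 48 * 9 = 432
```

With this layout, an ordinary 1D convolution whose kernel size and stride both equal k + 1 would see exactly one neighborhood per step, which is why an index table of this kind lets unmodified CNN layers operate on a spherical signal.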

Related papers

Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries. We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers that are well trained on massive image collections. Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
MeT: A Graph Transformer for Semantic Segmentation of 3D Meshes [10.667492516216887]
We propose a transformer-based method for semantic segmentation of 3D meshes. We perform positional encoding by means of the Laplacian eigenvectors of the adjacency matrix. We show how the proposed approach yields state-of-the-art performance on semantic segmentation of 3D meshes.
arXiv Detail & Related papers (2023-07-03T15:45:14Z)
SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences. It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping. Our system achieves state-of-the-art semantic mapping quality within 2D-3D networks-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z)
Implicit Ray-Transformers for Multi-view Remote Sensing Image Segmentation [26.726658200149544]
We propose "Implicit Ray-Transformer (IRT)" based on Implicit Neural Representation (INR) for RS scene semantic segmentation with sparse labels. The proposed method includes a two-stage learning process. In the first stage, we optimize a neural field to encode the color and 3D structure of the remote sensing scene. In the second stage, we design a Ray Transformer to leverage the relations between the neural field 3D features and 2D texture features for learning better semantic representations.
arXiv Detail & Related papers (2023-03-15T07:05:07Z)
CpT: Convolutional Point Transformer for 3D Point Cloud Processing [10.389972581905]
We present CpT (Convolutional point Transformer), a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data. CpT is an improvement over existing attention-based Convolutional Neural Networks as well as previous 3D point cloud processing transformers. Our model can serve as an effective backbone for various point cloud processing tasks when compared to the existing state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-21T17:45:55Z)
Concentric Spherical GNN for 3D Representation Learning [53.45704095146161]
We propose a novel multi-resolution convolutional architecture for learning over concentric spherical feature maps. Our hierarchical architecture is based on alternately learning to incorporate both intra-sphere and inter-sphere information. We demonstrate the effectiveness of our approach in improving state-of-the-art performance on 3D classification tasks with rotated data.
arXiv Detail & Related papers (2021-03-18T19:05:04Z)
Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications. We propose a local structure-aware anisotropic convolutional operation (LSA-Conv). Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z)
Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation [76.21696417873311]
We introduce a learnable module, cylindrical convolutional networks (CCNs), that exploits a cylindrical representation of a convolutional kernel defined in 3D space. CCNs extract a view-specific feature through a view-specific convolutional kernel to predict object category scores at each viewpoint. Our experiments demonstrate the effectiveness of the cylindrical convolutional networks on joint object detection and viewpoint estimation.
arXiv Detail & Related papers (2020-03-25T10:24:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.