Full Transformer Framework for Robust Point Cloud Registration with Deep
Information Interaction
- URL: http://arxiv.org/abs/2112.09385v1
- Date: Fri, 17 Dec 2021 08:40:52 GMT
- Title: Full Transformer Framework for Robust Point Cloud Registration with Deep
Information Interaction
- Authors: Guangyan Chen, Meiling Wang, Yufeng Yue, Qingxiang Zhang, Li Yuan
- Abstract summary: Recent Transformer-based methods have achieved advanced performance in point cloud registration.
However, the CNNs they adopt fail to model global relations due to their local receptive fields.
The shallow-wide architecture of Transformers and the lack of positional encoding lead to indistinct feature extraction.
- Score: 9.431484068349903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent Transformer-based methods have achieved advanced performance in point
cloud registration by utilizing advantages of the Transformer in
order-invariance and modeling dependency to aggregate information. However,
they still suffer from indistinct feature extraction, sensitivity to noise, and
outliers. The reasons are: (1) the adoption of CNNs fails to model global
relations due to their local receptive fields, resulting in extracted features
susceptible to noise; (2) the shallow-wide architecture of Transformers and
lack of positional encoding lead to indistinct feature extraction due to
inefficient information interaction; (3) the omission of geometrical
compatibility leads to inaccurate classification between inliers and outliers.
To address the above limitations, a novel full Transformer network for point cloud
registration is proposed, named the Deep Interaction Transformer (DIT), which
incorporates: (1) a Point Cloud Structure Extractor (PSE) to model global
relations and retrieve structural information with Transformer encoders; (2) a
deep-narrow Point Feature Transformer (PFT) to facilitate deep information
interaction across two point clouds with positional encoding, such that
Transformers can establish comprehensive associations and directly learn
relative position between points; (3) a Geometric Matching-based Correspondence
Confidence Evaluation (GMCCE) method to measure spatial consistency and
estimate inlier confidence by designing the triangulated descriptor. Extensive
experiments on clean, noisy, and partially overlapping point cloud registration
demonstrate that our method outperforms state-of-the-art methods.
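To make component (3) concrete: rigid motions preserve pairwise distances, so triangles built from inlier correspondences stay nearly congruent across the two point clouds, while triangles touching an outlier deform. The sketch below scores putative correspondences on that principle; it is a minimal interpretation of the GMCCE idea, not the paper's implementation, and the function name `triangle_consistency_confidence`, the random triangle sampling, and the kernel width `sigma` are all assumptions.

```python
import numpy as np

def triangle_consistency_confidence(src_pts, tgt_pts, n_triangles=64, sigma=0.1, rng=None):
    """Score putative correspondences by triangle congruence.

    src_pts, tgt_pts: (N, 3) arrays; row i of each is one putative correspondence.
    Returns an (N,) array of confidence scores in (0, 1].
    """
    rng = np.random.default_rng() if rng is None else rng
    n = src_pts.shape[0]
    conf = np.zeros(n)
    for i in range(n):
        # Close random triangles (i, j, k) through correspondence i.
        j = rng.integers(0, n, size=n_triangles)
        k = rng.integers(0, n, size=n_triangles)
        # Side lengths of each triangle in the source and in the target cloud.
        src_sides = np.stack([
            np.linalg.norm(src_pts[i] - src_pts[j], axis=1),
            np.linalg.norm(src_pts[i] - src_pts[k], axis=1),
            np.linalg.norm(src_pts[j] - src_pts[k], axis=1),
        ], axis=1)
        tgt_sides = np.stack([
            np.linalg.norm(tgt_pts[i] - tgt_pts[j], axis=1),
            np.linalg.norm(tgt_pts[i] - tgt_pts[k], axis=1),
            np.linalg.norm(tgt_pts[j] - tgt_pts[k], axis=1),
        ], axis=1)
        # Average congruence error per triangle, mapped to a soft compatibility score.
        err = np.abs(src_sides - tgt_sides).mean(axis=1)
        conf[i] = np.exp(-(err ** 2) / (sigma ** 2)).mean()
    return conf
```

In a full registration pipeline such scores would prune or down-weight correspondences before estimating the rigid transform (e.g., with a weighted SVD), which is how geometric compatibility helps separate inliers from outliers.

Component (2) can be sketched in the same spirit: cross-attention between the two clouds with a positional encoding computed from raw coordinates, so attention can reason about relative position. Again this is a generic sketch under the assumption that the PFT uses standard residual attention blocks; the class name `CrossAttentionBlock` and all hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """One interaction step of a deep-narrow stack: features of cloud X attend
    to features of cloud Y, with coordinates injected as positional encodings."""

    def __init__(self, d_model=128, nhead=4):
        super().__init__()
        self.pos_enc = nn.Sequential(nn.Linear(3, d_model), nn.ReLU(), nn.Linear(d_model, d_model))
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(), nn.Linear(d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, feat_x, xyz_x, feat_y, xyz_y):
        # (B, N, d_model) features and (B, N, 3) coordinates for each cloud.
        q = feat_x + self.pos_enc(xyz_x)
        kv = feat_y + self.pos_enc(xyz_y)
        attended, _ = self.attn(q, kv, kv)             # cloud X queries cloud Y
        feat_x = self.norm1(feat_x + attended)         # residual cross-attention
        return self.norm2(feat_x + self.ffn(feat_x))   # residual feed-forward
```

Stacking several such narrow blocks and alternating the roles of the two clouds is one plausible reading of "deep information interaction"; the paper's exact block design may differ.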
Related papers
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers well-trained on massive images.
Experiments on the PointDA-10 and Sim-to-Real datasets verify that the proposed method consistently achieves state-of-the-art performance in UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- TransPose: 6D Object Pose Estimation with Geometry-Aware Transformer [16.674933679692728]
TransPose is a novel 6D pose estimation framework that exploits a Transformer with a geometry-aware module to learn better point cloud feature representations.
TransPose achieves competitive results on three benchmark datasets.
arXiv Detail & Related papers (2023-10-25T01:24:12Z)
- Fourier Test-time Adaptation with Multi-level Consistency for Robust Classification [10.291631977766672]
We propose a novel approach called Fourier Test-time Adaptation (FTTA) to integrate input and model tuning.
FTTA builds a reliable multi-level consistency measurement of paired inputs to achieve self-supervision of predictions.
It was extensively validated on three large classification datasets with different modalities and organs.
arXiv Detail & Related papers (2023-06-05T02:29:38Z)
- RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration [73.69415797389195]
We propose an end-to-end transformer network (RegFormer) for large-scale point cloud alignment.
Specifically, a projection-aware hierarchical transformer is proposed to capture long-range dependencies and filter outliers.
Our transformer has linear complexity, which guarantees high efficiency even for large-scale scenes.
arXiv Detail & Related papers (2023-03-22T08:47:37Z)
- AdaPoinTr: Diverse Point Cloud Completion with Adaptive Geometry-Aware Transformers [94.11915008006483]
We present a new method that reformulates point cloud completion as a set-to-set translation problem.
We design a new model, called PoinTr, which adopts a Transformer encoder-decoder architecture for point cloud completion.
Our method attains 6.53 CD on PCN, 0.81 CD on ShapeNet-55 and 0.392 MMD on real-world KITTI.
arXiv Detail & Related papers (2023-01-11T16:14:12Z)
- Defect Transformer: An Efficient Hybrid Transformer Architecture for Surface Defect Detection [2.0999222360659604]
We propose an efficient hybrid transformer architecture, termed Defect Transformer (DefT), for surface defect detection.
DefT incorporates CNN and transformer into a unified model to capture local and non-local relationships collaboratively.
Experiments on three datasets demonstrate the superiority and efficiency of our method compared with other CNN- and transformer-based networks.
arXiv Detail & Related papers (2022-07-17T23:37:48Z)
- 3DCTN: 3D Convolution-Transformer Network for Point Cloud Classification [23.0009969537045]
This paper presents a novel hierarchical framework that incorporates convolution with Transformer for point cloud classification.
Our method achieves state-of-the-art classification performance, in terms of both accuracy and efficiency.
arXiv Detail & Related papers (2022-03-02T02:42:14Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that combines the detailed spatial information captured by CNNs with the global context provided by Transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers [81.71904691925428]
We present a new method that reformulates point cloud completion as a set-to-set translation problem.
We also design a new model, called PoinTr, that adopts a transformer encoder-decoder architecture for point cloud completion.
Our method outperforms state-of-the-art methods by a large margin on both the new benchmarks and the existing ones.
arXiv Detail & Related papers (2021-08-19T17:58:56Z)
- SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation [71.2856098776959]
Estimating 3D motions for point clouds is challenging, since a point cloud is unordered and its density is significantly non-uniform.
We propose a novel architecture named Sparse Convolution-Transformer Network (SCTN) that equips the sparse convolution with the transformer.
We show that the learned relation-based contextual information is rich and helpful for matching corresponding points, benefiting scene flow estimation.
arXiv Detail & Related papers (2021-05-10T15:16:14Z)
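The PoinTr and AdaPoinTr entries above both cast point cloud completion as set-to-set translation with a Transformer encoder-decoder. The sketch below shows that formulation in generic PyTorch; it is a minimal reading of those summaries, not the authors' code, and the class name `SetToSetCompletion`, the per-point embedding (the papers use local-geometry point proxies), and every hyperparameter are assumptions.

```python
import torch
import torch.nn as nn

class SetToSetCompletion(nn.Module):
    """Encode tokens of the partial cloud, then let a fixed set of learnable
    queries attend to them and regress coarse centers of the missing region."""

    def __init__(self, d_model=256, nhead=8, num_layers=4, num_queries=128):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(3, d_model), nn.ReLU(), nn.Linear(d_model, d_model))
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.queries = nn.Parameter(torch.randn(num_queries, d_model))  # one query per predicted point
        self.to_xyz = nn.Linear(d_model, 3)

    def forward(self, partial_xyz):
        # partial_xyz: (B, N, 3) coordinates of the observed (partial) cloud.
        tokens = self.embed(partial_xyz)     # (B, N, d_model) input set
        memory = self.encoder(tokens)        # global context over the partial cloud
        q = self.queries.unsqueeze(0).expand(partial_xyz.size(0), -1, -1)
        decoded = self.decoder(q, memory)    # output set attends to the input set
        return self.to_xyz(decoded)          # (B, num_queries, 3) predicted centers

# Usage: predict 128 coarse centers for the missing region of a partial scan.
model = SetToSetCompletion()
pred = model(torch.randn(2, 512, 3))
print(pred.shape)  # torch.Size([2, 128, 3])
```

A coarse-to-fine refinement stage (e.g., generating local patches around each predicted center) would normally follow, but the encoder-decoder above is the set-to-set translation core the two summaries refer to.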