DPCN++: Differentiable Phase Correlation Network for Versatile Pose
Registration
- URL: http://arxiv.org/abs/2206.05707v1
- Date: Sun, 12 Jun 2022 10:00:34 GMT
- Title: DPCN++: Differentiable Phase Correlation Network for Versatile Pose
Registration
- Authors: Zexi Chen, Yiyi Liao, Haozhe Du, Haodong Zhang, Xuecheng Xu, Haojian
Lu, Rong Xiong, Yue Wang
- Abstract summary: We present a differentiable phase correlation solver that is globally convergent and correspondence-free.
We evaluate DPCN++ on a wide range of registration tasks taking different input modalities, including 2D bird's-eye view images, 3D object and scene measurements, and medical images.
- Score: 18.60311260250232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pose registration is critical in vision and robotics. This paper focuses on
the challenging task of initialization-free pose registration up to 7DoF for
homogeneous and heterogeneous measurements. While recent learning-based methods
show promise using differentiable solvers, they either rely on heuristically
defined correspondences or are prone to local minima. We present a
differentiable phase correlation (DPC) solver that is globally convergent and
correspondence-free. When combined with simple feature extraction networks, our
general framework DPCN++ allows for versatile pose registration with arbitrary
initialization. Specifically, the feature extraction networks first learn dense
feature grids from a pair of homogeneous/heterogeneous measurements. These
feature grids are then transformed into a translation and scale invariant
spectrum representation based on Fourier transform and spherical radial
aggregation, decoupling translation and scale from rotation. Next, the
rotation, scale, and translation are independently and efficiently estimated in
the spectrum step-by-step using the DPC solver. The entire pipeline is
differentiable and trained end-to-end. We evaluate DPCN++ on a wide range of
registration tasks taking different input modalities, including 2D bird's-eye
view images, 3D object and scene measurements, and medical images. Experimental
results demonstrate that DPCN++ outperforms both classical and learning-based
baselines, especially on partially observed and heterogeneous measurements.
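
The phase correlation step named in the abstract can be illustrated with a minimal NumPy sketch for the 2D translation case. This is not the authors' implementation: the function names (`phase_correlation`, `hard_shift`, `soft_shift`) and the `temperature` parameter are hypothetical, and the soft-argmax readout is only an assumed, commonly used way to replace the non-differentiable peak pick with a smooth expectation. The learned feature extractors and the spherical radial aggregation used for rotation and scale are not reproduced, and NumPy itself provides no automatic differentiation.

```python
import numpy as np


def phase_correlation(ref, moved, eps=1e-8):
    """Correlation surface whose peak marks the circular shift from ref to moved."""
    cross = np.fft.fft2(moved) * np.conj(np.fft.fft2(ref))
    cross /= np.abs(cross) + eps  # keep only the phase, discard the magnitude
    return np.real(np.fft.ifft2(cross))


def signed_coords(n):
    """Signed shift represented by each FFT index: 0, 1, ..., -2, -1."""
    return np.fft.fftfreq(n) * n


def hard_shift(corr):
    """Non-differentiable estimate: location of the correlation maximum."""
    iy, ix = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    return int(round(signed_coords(h)[iy])), int(round(signed_coords(w)[ix]))


def soft_shift(corr, temperature=0.01):
    """Assumed relaxation: softmax-weighted mean of the shift coordinates."""
    weights = np.exp((corr - corr.max()) / temperature)
    weights /= weights.sum()
    h, w = corr.shape
    dy = float(np.sum(weights.sum(axis=1) * signed_coords(h)))
    dx = float(np.sum(weights.sum(axis=0) * signed_coords(w)))
    return dy, dx


# Synthetic check: shift a random image circularly by (5, -7) pixels.
rng = np.random.default_rng(0)
ref = rng.standard_normal((64, 64))
moved = np.roll(ref, shift=(5, -7), axis=(0, 1))
corr = phase_correlation(ref, moved)
print(hard_shift(corr))  # recovers the applied shift of 5 rows and -7 columns
print(soft_shift(corr))  # close to the same values, as floats
```

Because the estimate is read from the whole correlation surface rather than refined from a starting guess, no initial pose is needed, which matches the initialization-free claim in the abstract. One caveat of this sketch is that the soft readout averages over wrapped coordinates, so it becomes biased when the true shift lies near half the image size.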
Related papers
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers that are well trained on massive image datasets.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- Double-Shot 3D Shape Measurement with a Dual-Branch Network [14.749887303860717]
We propose a dual-branch Convolutional Neural Network (CNN)-Transformer network (PDCNet) to process different structured light (SL) modalities.
Within PDCNet, a Transformer branch is used to capture global perception in the fringe images, while a CNN branch is designed to collect local details in the speckle images.
We show that our method can reduce fringe order ambiguity while producing high-accuracy results on a self-made dataset.
arXiv Detail & Related papers (2024-07-19T10:49:26Z)
- Multiway Point Cloud Mosaicking with Diffusion and Global Optimization [74.3802812773891]
We introduce a novel framework for multiway point cloud mosaicking (named Wednesday).
At the core of our approach is ODIN, a learned pairwise registration algorithm that identifies overlaps and refines attention scores.
Tested on four diverse, large-scale datasets, our method achieves state-of-the-art pairwise and rotation registration results by a large margin on all benchmarks.
arXiv Detail & Related papers (2024-03-30T17:29:13Z)
- Fully Differentiable Correlation-driven 2D/3D Registration for X-ray to CT Image Fusion [3.868072865207522]
Image-based rigid 2D/3D registration is a critical technique for fluoroscopic guided surgical interventions.
We propose a novel fully differentiable correlation-driven network using a dual-branch CNN-transformer encoder.
A correlation-driven loss is proposed for low-frequency feature and high-frequency feature decomposition based on embedded information.
arXiv Detail & Related papers (2024-02-04T14:12:51Z)
- Gappy local conformal auto-encoders for heterogeneous data fusion: in praise of rigidity [6.1152340690876095]
We propose an end-to-end computational pipeline in the form of a multiple-auto-encoder neural network architecture for this task.
The inputs to the pipeline are several sets of partial observations, and the result is a globally consistent latent space, harmonizing (rigidifying, fusing) all measurements.
We demonstrate the approach in a sequence of examples, starting with simple two-dimensional data sets and proceeding to a Wi-Fi localization problem.
arXiv Detail & Related papers (2023-12-20T16:18:51Z)
- Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations.
In the WASPSYN challenge at ISBI 2023, our method ranked first.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
- Explicit Correspondence Matching for Generalizable Neural Radiance Fields [49.49773108695526]
We present a new NeRF method that is able to generalize to new unseen scenarios and perform novel view synthesis with as few as two source views.
The explicit correspondence matching is quantified with the cosine similarity between image features sampled at the 2D projections of a 3D point on different views.
Our method achieves state-of-the-art results on different evaluation settings, with the experiments showing a strong correlation between our learned cosine feature similarity and volume density.
arXiv Detail & Related papers (2023-04-24T17:46:01Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration [67.69257782645789]
We propose piecewise transformation fields that learn 3D translation vectors to map any query point in posed space to its corresponding position in rest-pose space.
We show that fitting parametric models with poses by our network results in much better registration quality, especially for extreme poses.
arXiv Detail & Related papers (2021-04-16T15:16:09Z)
- Deep Phase Correlation for End-to-End Heterogeneous Sensor Measurements Matching [12.93459392278491]
We present an end-to-end deep phase correlation network (DPCN) to match heterogeneous sensor measurements.
The primary component is a differentiable correlation-based estimator that back-propagates the pose error to learnable feature extractors.
With its interpretable modeling, the network is lightweight and promising for better generalization.
arXiv Detail & Related papers (2020-08-21T13:42:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.