Recurrent Multi-view Alignment Network for Unsupervised Surface
Registration
- URL: http://arxiv.org/abs/2011.12104v2
- Date: Tue, 13 Apr 2021 08:07:17 GMT
- Title: Recurrent Multi-view Alignment Network for Unsupervised Surface
Registration
- Authors: Wanquan Feng, Juyong Zhang, Hongrui Cai, Haofei Xu, Junhui Hou and
Hujun Bao
- Abstract summary: Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data.
We propose to represent the non-rigid transformation with a point-wise combination of several rigid transformations.
We also introduce a differentiable loss function that measures the 3D shape similarity on the projected multi-view 2D depth images.
- Score: 79.72086524370819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning non-rigid registration in an end-to-end manner is challenging due to
the inherent high degrees of freedom and the lack of labeled training data. In
this paper, we resolve these two challenges simultaneously. First, we propose
to represent the non-rigid transformation with a point-wise combination of
several rigid transformations. This representation not only makes the solution
space well-constrained but also enables our method to be solved iteratively
with a recurrent framework, which greatly reduces the difficulty of learning.
Second, we introduce a differentiable loss function that measures the 3D shape
similarity on the projected multi-view 2D depth images so that our full
framework can be trained end-to-end without ground truth supervision. Extensive
experiments on several different datasets demonstrate that our proposed method
outperforms the previous state-of-the-art by a large margin. The source codes
are available at https://github.com/WanquanF/RMA-Net.
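The two ideas in the abstract, a point-wise combination of several rigid transformations and a loss computed on projected multi-view depth images, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the orthographic projection, and the L1 depth loss are assumptions made for clarity.

```python
# Sketch (assumed, not the authors' code): per-point blending of K rigid
# transformations, plus a simple multi-view depth comparison.
import numpy as np

def blend_rigid(points, rotations, translations, weights):
    """Point-wise combination of K rigid transformations.

    points:       (N, 3) source points
    rotations:    (K, 3, 3) rotation matrices
    translations: (K, 3) translation vectors
    weights:      (N, K) per-point blending weights (rows sum to 1)
    """
    # (K, N, 3): every point under every rigid transform
    transformed = np.einsum('kij,nj->kni', rotations, points) \
        + translations[:, None, :]
    # weighted sum over the K transforms -> (N, 3)
    return np.einsum('nk,kni->ni', weights, transformed)

def depth_image(points, res=32, lo=-1.0, hi=1.0):
    """Orthographic depth rendering along +z (nearest point per pixel)."""
    img = np.full((res, res), np.inf)
    u = ((points[:, 0] - lo) / (hi - lo) * (res - 1)).astype(int)
    v = ((points[:, 1] - lo) / (hi - lo) * (res - 1)).astype(int)
    ok = (u >= 0) & (u < res) & (v >= 0) & (v < res)
    for x, y, z in zip(u[ok], v[ok], points[ok, 2]):
        img[y, x] = min(img[y, x], z)
    return img

def multiview_depth_loss(src, tgt, views):
    """Mean L1 difference between depth images under several view rotations."""
    loss = 0.0
    for R in views:
        d1, d2 = depth_image(src @ R.T), depth_image(tgt @ R.T)
        mask = np.isfinite(d1) & np.isfinite(d2)  # compare covered pixels only
        loss += np.abs(d1[mask] - d2[mask]).mean()
    return loss / len(views)
```

Because every operation above is differentiable except the rasterization step, a practical implementation would replace `depth_image` with a soft, differentiable renderer; the blending itself keeps the solution space constrained to mixtures of rigid motions, which is what makes the iterative recurrent refinement tractable.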
Related papers
- Cross-domain and Cross-dimension Learning for Image-to-Graph
Transformers [50.576354045312115]
Direct image-to-graph transformation is a challenging task that solves object detection and relationship prediction in a single model.
We introduce a set of methods enabling cross-domain and cross-dimension transfer learning for image-to-graph transformers.
We demonstrate our method's utility in cross-domain and cross-dimension experiments, where we pretrain our models on 2D satellite images before applying them to vastly different target domains in 2D and 3D.
arXiv Detail & Related papers (2024-03-11T10:48:56Z)
- 2S-UDF: A Novel Two-stage UDF Learning Method for Robust Non-watertight
Model Reconstruction from Multi-view Images [12.076881343401329]
We present a novel two-stage algorithm, 2S-UDF, for learning a high-quality UDF from multi-view images.
The results indicate superior performance over other UDF learning techniques in both quantitative metrics and visual quality.
arXiv Detail & Related papers (2023-03-27T16:35:28Z)
- RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline
Model and DoF-based Curriculum Learning [62.86400614141706]
We propose a new learning model, the Rectangling Rectification Network (RecRecNet).
Our model can flexibly warp the source structure to the target domain and achieves an end-to-end unsupervised deformation.
Experiments show the superiority of our solution over the compared methods on both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2023-01-04T15:12:57Z)
- PatchMVSNet: Patch-wise Unsupervised Multi-View Stereo for
Weakly-Textured Surface Reconstruction [2.9896482273918434]
This paper proposes robust loss functions leveraging constraints beneath multi-view images to alleviate matching ambiguity.
Our strategy can be implemented with arbitrary depth estimation frameworks and can be trained with arbitrary large-scale MVS datasets.
Our method matches the performance of state-of-the-art methods on popular benchmarks such as DTU, Tanks and Temples, and ETH3D.
arXiv Detail & Related papers (2022-03-04T07:05:23Z)
- Multi-Objective Dual Simplex-Mesh Based Deformable Image Registration
for 3D Medical Images -- Proof of Concept [0.7734726150561088]
This work introduces the first method for multi-objective 3D deformable image registration, using a 3D dual-dynamic grid transformation model based on simplex meshes.
Our proof-of-concept prototype shows promising results on synthetic and clinical 3D registration problems.
arXiv Detail & Related papers (2022-02-22T16:07:29Z)
- Multi-initialization Optimization Network for Accurate 3D Human Pose and
Shape Estimation [75.44912541912252]
We propose a three-stage framework named Multi-Initialization Optimization Network (MION).
In the first stage, we strategically select different coarse 3D reconstruction candidates which are compatible with the 2D keypoints of input sample.
In the second stage, we design a mesh refinement transformer (MRT) to respectively refine each coarse reconstruction result via a self-attention mechanism.
Finally, a Consistency Estimation Network (CEN) is proposed to find the best result among multiple candidates by evaluating whether the visual evidence in the RGB image matches a given 3D reconstruction.
arXiv Detail & Related papers (2021-12-24T02:43:58Z)
- Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes [86.2129580231191]
Adjoint Rigid Transform (ART) Network is a neural module which can be integrated with a variety of 3D networks.
ART learns to rotate input shapes to a learned canonical orientation, which is crucial for many downstream tasks.
We will release our code and pre-trained models for further research.
arXiv Detail & Related papers (2021-02-01T20:58:45Z)
- Gram Regularization for Multi-view 3D Shape Retrieval [3.655021726150368]
We propose a novel regularization term called Gram regularization.
By forcing the variance between weight kernels to be large, the regularizer can help to extract discriminative features.
The proposed Gram regularization is data independent and can converge stably and quickly without bells and whistles.
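The idea of forcing weight kernels apart can be sketched as a penalty on their pairwise similarities. This is a hypothetical formulation, not taken from the paper: the function name, the cosine normalization, and the off-diagonal mean-square penalty are all assumptions.

```python
# Hypothetical sketch of a Gram-style regularizer: penalize the pairwise
# cosine similarity of flattened weight kernels so they stay dissimilar.
import numpy as np

def gram_regularizer(kernels):
    """kernels: (K, ...) weight tensor, one kernel per output channel.

    Returns the mean squared off-diagonal entry of the Gram matrix of the
    normalized, flattened kernels; minimizing it pushes kernels apart.
    """
    flat = kernels.reshape(kernels.shape[0], -1)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    gram = flat @ flat.T                  # (K, K) cosine similarities
    off = gram - np.eye(gram.shape[0])    # ignore self-similarity
    return np.mean(off ** 2)
```

The penalty is zero for mutually orthogonal kernels and grows as kernels become correlated, which is one plausible way to make the features extracted by different kernels more discriminative.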
arXiv Detail & Related papers (2020-11-16T05:37:24Z)
- Monocular, One-stage, Regression of Multiple 3D People [105.3143785498094]
We propose to Regress all meshes in a One-stage fashion for Multiple 3D People (termed ROMP).
Our method simultaneously predicts a Body Center heatmap and a Mesh map, which jointly describe the 3D body mesh at the pixel level.
Compared with state-of-the-art methods, ROMP achieves superior performance on the challenging multi-person benchmarks.
arXiv Detail & Related papers (2020-08-27T17:21:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.