SRIF: Semantic Shape Registration Empowered by Diffusion-based Image Morphing and Flow Estimation
- URL: http://arxiv.org/abs/2409.11682v2
- Date: Thu, 3 Oct 2024 13:03:39 GMT
- Title: SRIF: Semantic Shape Registration Empowered by Diffusion-based Image Morphing and Flow Estimation
- Authors: Mingze Sun, Chen Guo, Puhua Jiang, Shiwei Mao, Yurun Chen, Ruqi Huang
- Abstract summary: We propose SRIF, a novel Semantic shape Registration framework based on diffusion-based Image morphing and Flow estimation.
SRIF not only achieves high-quality dense correspondences on challenging shape pairs, but also delivers smooth, semantically meaningful interpolations in between.
- Score: 2.336821026049481
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we propose SRIF, a novel Semantic shape Registration framework based on diffusion-based Image morphing and Flow estimation. More concretely, given a pair of extrinsically aligned shapes, we first render them from multiple views, and then utilize an image interpolation framework based on diffusion models to generate sequences of intermediate images between them. The images are later fed into a dynamic 3D Gaussian splatting framework, with which we reconstruct and post-process intermediate point clouds that respect the image morphing process. In the end, tailored for the above, we propose a novel registration module that estimates a continuous normalizing flow, which deforms the source shape consistently towards the target, with the intermediate point clouds as weak guidance. Our key insight is to leverage large vision models (LVMs) to associate shapes, and therefore obtain much richer semantic information on the relationship between shapes than ad-hoc feature extraction and alignment can provide. As a consequence, SRIF not only achieves high-quality dense correspondences on challenging shape pairs, but also delivers smooth, semantically meaningful interpolation in between. Empirical evidence justifies the effectiveness and superiority of our method as well as specific design choices. The code is released at https://github.com/rqhuang88/SRIF.
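The final registration stage described above — deforming the source shape toward the target while using intermediate point clouds as weak guidance — can be illustrated with a toy sketch. This is an assumption-laden simplification, not the paper's implementation: the real method trains a continuous normalizing flow (a neural ODE), whereas here the flow is approximated by Euler steps of a simple nearest-neighbor velocity field; the function names and step sizes are invented for illustration.

```python
# Toy sketch (NOT the paper's actual method): flow a source point cloud
# through a sequence of intermediate point clouds toward the target.
# The continuous flow is approximated by explicit Euler steps of a
# nearest-neighbor velocity field; SRIF instead learns a neural ODE
# with the intermediates acting only as weak guidance.
import numpy as np

def nn_velocity(x, guide):
    """Velocity pointing each point of x at its nearest neighbor in guide."""
    d = np.linalg.norm(x[:, None, :] - guide[None, :, :], axis=-1)  # (N, M)
    return guide[d.argmin(axis=1)] - x

def flow_registration(source, intermediates, steps_per_stage=20, dt=0.1):
    """Deform `source` stage by stage toward each intermediate cloud.

    `intermediates` is ordered from closest-to-source to the target shape,
    mimicking the point clouds recovered from the image morphing sequence.
    """
    x = source.copy()
    for guide in intermediates:          # last entry is the target shape
        for _ in range(steps_per_stage):
            x = x + dt * nn_velocity(x, guide)  # Euler integration step
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(size=(64, 3))
    tgt = src + np.array([2.0, 0.0, 0.0])             # translated copy
    mids = [src + t * (tgt - src) for t in (0.33, 0.66, 1.0)]
    out = flow_registration(src, mids)
    err = np.linalg.norm(out[:, None] - tgt[None], axis=-1).min(axis=1)
    print("mean distance to target:", err.mean())
```

Stepping through several nearby intermediate clouds, rather than jumping straight to the target, is what keeps the deformation smooth and helps preserve semantic correspondence when source and target differ substantially.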
Related papers
- DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion [35.60459492849359]
We study the problem of generating intermediate images from image pairs with large motion.
Due to the large motion, the intermediate semantic information may be absent in input images.
We propose DreamMover, a novel image framework with three main components.
arXiv Detail & Related papers (2024-09-15T04:09:12Z) - RecDiffusion: Rectangling for Image Stitching with Diffusion Models [53.824503710254206]
We introduce a novel diffusion-based learning framework, RecDiffusion, for image stitching rectangling.
This framework combines Motion Diffusion Models (MDM) to generate motion fields, effectively transitioning from the stitched image's irregular borders to a geometrically corrected intermediary.
arXiv Detail & Related papers (2024-03-28T06:22:45Z) - Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS).
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
arXiv Detail & Related papers (2024-02-28T06:07:07Z) - Non-Rigid Shape Registration via Deep Functional Maps Prior [1.9249120068573227]
We propose a learning-based framework for non-rigid shape registration without correspondence supervision.
We deform source mesh towards the target point cloud, guided by correspondences induced by high-dimensional embeddings.
Our pipeline achieves state-of-the-art results on several benchmarks of non-rigid point cloud matching.
arXiv Detail & Related papers (2023-11-08T06:52:57Z) - FreeReg: Image-to-Point Cloud Registration Leveraging Pretrained Diffusion Models and Monocular Depth Estimators [37.39693977657165]
Matching cross-modality features between images and point clouds is a fundamental problem for image-to-point cloud registration.
We propose to unify the modality between images and point clouds by pretrained large-scale models first.
We show that the intermediate features, called diffusion features, extracted by depth-to-image diffusion models are semantically consistent between images and point clouds.
arXiv Detail & Related papers (2023-10-05T09:57:23Z) - Improving Misaligned Multi-modality Image Fusion with One-stage Progressive Dense Registration [67.23451452670282]
Misalignments between multi-modality images pose challenges in image fusion.
We propose a Cross-modality Multi-scale Progressive Dense Registration scheme.
This scheme accomplishes the coarse-to-fine registration exclusively using a one-stage optimization.
arXiv Detail & Related papers (2023-08-22T03:46:24Z) - Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z) - RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline Model and DoF-based Curriculum Learning [62.86400614141706]
We propose a new learning model, i.e., the Rectangling Rectification Network (RecRecNet).
Our model can flexibly warp the source structure to the target domain and achieves an end-to-end unsupervised deformation.
Experiments show the superiority of our solution over the compared methods on both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2023-01-04T15:12:57Z) - Image Morphing with Perceptual Constraints and STN Alignment [70.38273150435928]
We propose a conditional GAN morphing framework operating on a pair of input images.
A special training protocol produces sequences of frames that, combined with a perceptual similarity loss, promote smooth transformations over time.
We provide comparisons to classic as well as latent space morphing techniques, and demonstrate that, given a set of images for self-supervision, our network learns to generate visually pleasing morphing effects.
arXiv Detail & Related papers (2020-04-29T10:49:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.