Related papers: EDFFDNet: Towards Accurate and Efficient Unsupervised Multi-Grid Image Registration

EDFFDNet: Towards Accurate and Efficient Unsupervised Multi-Grid Image Registration

URL: http://arxiv.org/abs/2509.07662v1
Date: Tue, 09 Sep 2025 12:30:51 GMT
Title: EDFFDNet: Towards Accurate and Efficient Unsupervised Multi-Grid Image Registration
Authors: Haokai Zhu, Bo Qu, Si-Yuan Cao, Runmin Zhang, Shujie Chen, Bailin Yang, Hui-Liang Shen,
Abstract summary: We propose an Exponential-Decay Free-Form Deformation Network (EDFFDNet), which employs free-form deformation with an exponential-decay basis function.<n>By transforming dense interactions into sparse ones, ASMA reduces parameters and improves accuracy.<n>Experiments demonstrate that EDFFDNet reduces parameters, memory, and total runtime by 70.5%, 32.6%, and 33.7%, respectively.<n>EDFFDNet-2 further improves PSNR by 1.06 dB while maintaining lower computational costs.
Score: 17.190325630307097
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Previous deep image registration methods that employ single homography, multi-grid homography, or thin-plate spline often struggle with real scenes containing depth disparities due to their inherent limitations. To address this, we propose an Exponential-Decay Free-Form Deformation Network (EDFFDNet), which employs free-form deformation with an exponential-decay basis function. This design achieves higher efficiency and performs well in scenes with depth disparities, benefiting from its inherent locality. We also introduce an Adaptive Sparse Motion Aggregator (ASMA), which replaces the MLP motion aggregator used in previous methods. By transforming dense interactions into sparse ones, ASMA reduces parameters and improves accuracy. Additionally, we propose a progressive correlation refinement strategy that leverages global-local correlation patterns for coarse-to-fine motion estimation, further enhancing efficiency and accuracy. Experiments demonstrate that EDFFDNet reduces parameters, memory, and total runtime by 70.5%, 32.6%, and 33.7%, respectively, while achieving a 0.5 dB PSNR gain over the state-of-the-art method. With an additional local refinement stage,EDFFDNet-2 further improves PSNR by 1.06 dB while maintaining lower computational costs. Our method also demonstrates strong generalization ability across datasets, outperforming previous deep learning methods.

Related papers

Event-based Visual Deformation Measurement [76.25283405575108]
Visual Deformation Measurement aims to recover dense deformation fields by tracking surface motion from camera observations.<n>Traditional image-based methods rely on minimal inter-frame motion to constrain the correspondence search space.<n>We propose an event-frame fusion framework that exploits events for temporally dense motion cues and frames for spatially dense precise estimation.
arXiv Detail & Related papers (2026-02-16T01:04:48Z)
Motion-Aware Adaptive Pixel Pruning for Efficient Local Motion Deblurring [87.56382172827526]
We propose a trainable mask predictor that identifies blurred regions in the image.<n>We also develop an intra-frame motion analyzer that translates relative pixel displacements into motion trajectories.<n>Our method is trained end-to-end using a combination of reconstruction loss, reblur loss, and mask loss guided by annotated blur masks.
arXiv Detail & Related papers (2025-07-10T12:38:27Z)
Progressive Alignment Degradation Learning for Pansharpening [3.7939736380306552]
Deep learning-based pansharpening has been shown to effectively generate high-resolution multispectral (HRMS) images.<n>The Wald protocol assumes that networks trained on artificial low-resolution data will perform equally well on high-resolution data.<n>We proposePADM, which uses mutual iteration between two sub-networks, PAlignNet and PDegradeNet, to adaptively learn accurate degradation processes.
arXiv Detail & Related papers (2025-06-25T07:07:32Z)
Flow-GRPO: Training Flow Matching Models via Online RL [75.70017261794422]
We propose Flow-GRPO, the first method integrating online reinforcement learning (RL) into flow matching models.<n>Our approach uses two key strategies: (1) an ODE-to-SDE conversion that transforms a deterministic Ordinary Equation (ODE) into an equivalent Differential Equation (SDE) that matches the original model's marginal distribution at all timesteps; and (2) a Denoising Reduction strategy that reduces training denoising steps while retaining the original inference timestep number.
arXiv Detail & Related papers (2025-05-08T17:58:45Z)
ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts.<n>Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z)
Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization [14.23697277904244]
We present Reweighted Gradient Descent (RGD), a novel optimization technique that improves the performance of deep neural networks through dynamic sample re-weighting. We demonstrate the effectiveness of RGD on various learning tasks, including supervised learning, meta-learning, and out-of-domain generalization.
arXiv Detail & Related papers (2023-06-15T15:58:04Z)
Learned Image Compression with Generalized Octave Convolution and Cross-Resolution Parameter Estimation [5.238765582868391]
We propose a learned multi-resolution image compression framework, which exploits octave convolutions to factorize the latent representations into the high-resolution (HR) and low-resolution (LR) parts. Experimental results show that our method separately reduces the decoding time by approximately 73.35 % and 93.44 % compared with that of state-of-the-art learned image compression methods.
arXiv Detail & Related papers (2022-09-07T08:21:52Z)
FasterPose: A Faster Simple Baseline for Human Pose Estimation [65.8413964785972]
We propose a design paradigm for cost-effective network with LR representation for efficient pose estimation, named FasterPose. We study the training behavior of FasterPose, and formulate a novel regressive cross-entropy (RCE) loss function for accelerating the convergence. Compared with the previously dominant network of pose estimation, our method reduces 58% of the FLOPs and simultaneously gains 1.3% improvement of accuracy.
arXiv Detail & Related papers (2021-07-07T13:39:08Z)
Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks. The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose. We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.