Scale and Rotation Estimation of Similarity-Transformed Images via Cross-Correlation Maximization Based on Auxiliary Function Method
- URL: http://arxiv.org/abs/2509.22686v1
- Date: Thu, 18 Sep 2025 03:22:13 GMT
- Title: Scale and Rotation Estimation of Similarity-Transformed Images via Cross-Correlation Maximization Based on Auxiliary Function Method
- Authors: Shinji Yamashita, Yuma Kinoshita, Hitoshi Kiya
- Abstract summary: This paper introduces a highly efficient algorithm capable of jointly estimating scale and rotation between two images with sub-pixel precision. Image alignment serves as a critical process for spatially registering images captured from different viewpoints.
- Score: 2.192057999494216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a highly efficient algorithm capable of jointly estimating scale and rotation between two images with sub-pixel precision. Image alignment serves as a critical process for spatially registering images captured from different viewpoints, and finds extensive use in domains such as medical imaging and computer vision. Traditional phase-correlation techniques are effective in determining translational shifts; however, they are inadequate when addressing scale and rotation changes, which often arise due to camera zooming or rotational movements. In this paper, we propose a novel algorithm that integrates scale and rotation estimation based on the Fourier transform in log-polar coordinates with a cross-correlation maximization strategy, leveraging the auxiliary function method. By incorporating sub-pixel-level cross-correlation, our method enables precise estimation of both scale and rotation. Experimental results demonstrate that the proposed method achieves lower mean estimation errors for scale and rotation than conventional Fourier transform-based techniques that rely on discrete cross-correlation.
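To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of the conventional discrete approach, not the authors' auxiliary-function method. It assumes the magnitude spectra of both images have already been resampled onto a log-polar grid, so that scale and rotation appear as cyclic shifts along the two axes. The function names, the grid layout (rows sample log-radius with base `log_base`, columns sample angle uniformly over 180 degrees), and the sign convention are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def phase_correlation_shift(a, b):
    """Estimate the integer cyclic shift (rows, cols) that maps `a` onto `b`."""
    A = np.fft.fft2(a)
    B = np.fft.fft2(b)
    # Normalized cross-power spectrum: keeps only the phase difference.
    cross = B * np.conj(A)
    cross /= np.maximum(np.abs(cross), 1e-12)
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices in [0, n) to signed shifts around zero.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))


def shift_to_scale_rotation(d_rho, d_theta, log_base, n_theta):
    """Convert log-polar grid shifts to (scale, angle in degrees).

    Assumed (hypothetical) grid: radius sample i lies at log_base**i and the
    n_theta angle samples cover 180 degrees. Whether the result is the scale
    or its inverse depends on the chosen convention.
    """
    scale = log_base ** d_rho
    angle_deg = d_theta * 180.0 / n_theta
    return scale, angle_deg


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    log_polar = rng.random((64, 90))          # stand-in for a log-polar spectrum
    moved = np.roll(log_polar, (5, -7), axis=(0, 1))
    d_rho, d_theta = phase_correlation_shift(log_polar, moved)
    print(shift_to_scale_rotation(d_rho, d_theta, log_base=1.02, n_theta=90))
```

Because the peak is located on the sampling grid, this sketch resolves scale and rotation only in steps of one grid cell. According to the abstract, the proposed method addresses this limitation by maximizing a sub-pixel-level cross-correlation instead.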
Related papers
- Resolution-Independent Neural Operators for Multi-Rate Sparse-View CT [67.14700058302016]
Deep learning methods achieve high-fidelity reconstructions but often overfit to a fixed acquisition setup. We propose Computed Tomography neural Operator (CTO), a unified CT reconstruction framework that extends to continuous function space. CTO enables consistent multi-sampling-rate and cross-resolution performance, with on average >4dB PSNR gain over CNNs.
arXiv Detail & Related papers (2025-12-13T08:31:46Z) - Rotated Mean-Field Variational Inference and Iterative Gaussianization [11.954133194037858]
We propose to perform mean-field variational inference (MFVI) in a rotated coordinate system. MFVI in a rotated coordinate system defines a rotation and a coordinatewise map that together move the target closer to Gaussian. Iterating this procedure yields a sequence of transformations that progressively transforms the target toward Gaussian.
arXiv Detail & Related papers (2025-10-09T03:13:44Z) - Exploring Kernel Transformations for Implicit Neural Representations [57.2225355625268]
Implicit neural representations (INRs) leverage neural networks to represent signals by mapping coordinates to their corresponding attributes. This work pioneers the exploration of the effect of kernel transformation of input/output while keeping the model itself unchanged. A byproduct of our findings is a simple yet effective method that combines scale and shift to significantly boost INR with negligible overhead.
arXiv Detail & Related papers (2025-04-07T04:43:50Z) - Verification of Geometric Robustness of Neural Networks via Piecewise Linear Approximation and Lipschitz Optimisation [57.10353686244835]
We address the problem of verifying neural networks against geometric transformations of the input image, including rotation, scaling, shearing, and translation.
The proposed method computes provably sound piecewise linear constraints for the pixel values by using sampling and linear approximations in combination with branch-and-bound Lipschitz optimisation.
We show that our proposed implementation resolves up to 32% more verification cases than present approaches.
arXiv Detail & Related papers (2024-08-23T15:02:09Z) - Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}(\ln(T) / T^{1 - \frac{1}{\alpha}})$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Estimating Extreme 3D Image Rotation with Transformer Cross-Attention [13.82735766201496]
We propose a cross-attention-based approach that utilizes CNN feature maps and a Transformer-Encoder to compute the cross-attention between the activation maps of the image pairs.
It is experimentally shown to outperform contemporary state-of-the-art schemes when applied to commonly used image rotation datasets and benchmarks.
arXiv Detail & Related papers (2023-03-05T09:07:26Z) - Motion Estimation for Large Displacements and Deformations [7.99536002595393]
Variational optical flow techniques based on a coarse-to-fine scheme interpolate sparse matches and locally optimize an energy model conditioned on colour, gradient and smoothness.
This paper addresses this problem and presents HybridFlow, a variational motion estimation framework for large displacements and deformations.
arXiv Detail & Related papers (2022-06-24T18:53:22Z) - Data-Driven Interpolation for Super-Scarce X-Ray Computed Tomography [1.3535770763481902]
We train shallow neural networks to combine two neighbouring acquisitions into an estimated measurement at an intermediate angle.
This yields an enhanced sequence of measurements that can be reconstructed using standard methods.
Results are obtained for 2D and 3D imaging, on large biomedical datasets.
arXiv Detail & Related papers (2022-05-16T15:42:41Z) - Dual-Flow Transformation Network for Deformable Image Registration with Region Consistency Constraint [95.30864269428808]
Current deep learning (DL)-based image registration approaches learn the spatial transformation from one image to another by leveraging a convolutional neural network.
We present a novel dual-flow transformation network with region consistency constraint which maximizes the similarity of ROIs within a pair of images.
Experiments on four public 3D MRI datasets show that the proposed method achieves the best registration performance in accuracy and generalization.
arXiv Detail & Related papers (2021-12-04T05:30:44Z) - Learning Discriminative Shrinkage Deep Networks for Image Deconvolution [122.79108159874426]
We propose an effective non-blind deconvolution approach by learning discriminative shrinkage functions to implicitly model these terms.
Experimental results show that the proposed method performs favorably against the state-of-the-art ones in terms of efficiency and accuracy.
arXiv Detail & Related papers (2021-11-27T12:12:57Z) - LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation [52.63874513999119]
Cross-resolution image alignment is a key problem in multiscale giga photography.
Existing deep homography methods neglect the explicit formulation of correspondences between them, which leads to degraded accuracy in cross-resolution challenges.
We propose a local transformer network embedded within a multiscale structure to explicitly learn correspondences between the multimodal inputs.
arXiv Detail & Related papers (2021-06-08T02:51:45Z)