Depth-Supervised Fusion Network for Seamless-Free Image Stitching
- URL: http://arxiv.org/abs/2510.21396v1
- Date: Fri, 24 Oct 2025 12:36:08 GMT
- Title: Depth-Supervised Fusion Network for Seamless-Free Image Stitching
- Authors: Zhiying Jiang, Ruhao Yan, Zengxi Zhang, Bowei Zhang, Jinyuan Liu,
- Abstract summary: Image stitching synthesizes images captured from multiple perspectives into a single image with a broader field of view. The significant variations in object depth often lead to large parallax, resulting in ghosting and misalignment in the stitched results. To address this, we propose a depth-consistency-constrained seamless-free image stitching method.
- Score: 17.758594808190264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image stitching synthesizes images captured from multiple perspectives into a single image with a broader field of view. The significant variations in object depth often lead to large parallax, resulting in ghosting and misalignment in the stitched results. To address this, we propose a depth-consistency-constrained seamless-free image stitching method. First, to tackle the multi-view alignment difficulties caused by parallax, a multi-stage mechanism combined with global depth regularization constraints is developed to enhance the alignment accuracy of the same apparent target across different depth ranges. Second, during the multi-view image fusion process, an optimal stitching seam is determined through graph-based low-cost computation, and a soft-seam region is diffused to precisely locate transition areas, thereby effectively mitigating alignment errors induced by parallax and achieving natural and seamless stitching results. Furthermore, considering the computational overhead in the shift regression process, a reparameterization strategy is incorporated to optimize the structural design, significantly improving algorithm efficiency while maintaining optimal performance. Extensive experiments demonstrate the superior performance of the proposed method against the existing methods. Code is available at https://github.com/DLUT-YRH/DSFN.
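The seam-and-soft-transition step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see their repository for that): it swaps in a plain dynamic-programming shortest path over the pixel grid as the "graph-based low-cost" seam search, and a linear ramp around the seam as the "soft-seam region". The function names, the absolute-difference cost, and the `band` parameter are all assumptions made for illustration, and the inputs are assumed to be already-aligned 2D grayscale arrays of equal shape.

```python
import numpy as np

def min_cost_seam(cost):
    """Column index per row of a minimum-cost top-to-bottom seam (DP on a grid graph)."""
    h, w = cost.shape
    acc = cost.astype(np.float64).copy()
    for y in range(1, h):
        prev = acc[y - 1]
        # Cheapest predecessor among up-left, up, and up-right neighbours.
        left = np.concatenate(([np.inf], prev[:-1]))
        right = np.concatenate((prev[1:], [np.inf]))
        acc[y] += np.minimum(np.minimum(left, prev), right)
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(acc[-1]))
    # Backtrack from the bottom row, staying within one column of the row below.
    for y in range(h - 2, -1, -1):
        x = seam[y + 1]
        lo, hi = max(x - 1, 0), min(x + 2, w)
        seam[y] = lo + int(np.argmin(acc[y, lo:hi]))
    return seam

def soft_seam_mask(seam, width, band):
    """Weight of the left image: 1 far left of the seam, 0 far right, linear ramp in between."""
    cols = np.arange(width)[None, :]
    dist = cols - seam[:, None]  # signed horizontal distance to the seam
    return np.clip(0.5 - dist / (2.0 * band), 0.0, 1.0)

def blend(left, right, band=2):
    """Blend two aligned grayscale images across a soft band around the seam."""
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    cost = np.abs(left - right)  # hypothetical cost: per-pixel disagreement
    seam = min_cost_seam(cost)
    mask = soft_seam_mask(seam, cost.shape[1], band)
    return mask * left + (1.0 - mask) * right, seam
```

Calling `blend(left, right, band=8)` returns the blended image and the seam columns. A larger `band` widens the transition region, which hides small residual misalignment at the price of more visible ghosting where the two views disagree strongly.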
Related papers
- AngularFuse: A Closer Look at Angle-based Perception for Spatial-Sensitive Multi-Modality Image Fusion [54.84069863008752]
This paper proposes an angle-based perception framework for spatial-sensitive image fusion (AngularFuse). By combining Laplacian edge enhancement with adaptive histogram, reference images with richer details and more balanced brightness are generated. Experiments on the MSRS, RoadScene, and M3FD public datasets show that AngularFuse outperforms existing mainstream methods by a clear margin.
arXiv Detail & Related papers (2025-10-14T08:13:15Z)
- Auto-regressive transformation for image alignment [46.12916700236777]
Existing methods for image alignment struggle in cases involving feature-sparse regions, extreme scale and field-of-view differences, and large deformations. We propose Auto-Regressive Transformation (ART), a novel method that iteratively estimates the coarse-to-fine transformations within an auto-regressive framework. Our network refines the transformations using randomly sampled points at each scale. By incorporating guidance from the cross-attention layer, the model focuses on critical regions, ensuring accurate alignment even in challenging, feature-limited conditions.
arXiv Detail & Related papers (2025-05-08T00:28:31Z)
- Feature Alignment with Equivariant Convolutions for Burst Image Super-Resolution [52.55429225242423]
We propose a novel framework for Burst Image Super-Resolution (BISR), featuring an equivariant convolution-based alignment. This enables the alignment transformation to be learned via explicit supervision in the image domain and easily applied in the feature domain. Experiments on BISR benchmarks show the superior performance of our approach in both quantitative metrics and visual quality.
arXiv Detail & Related papers (2025-03-11T11:13:10Z)
- Self-Supervised Multi-Scale Network for Blind Image Deblurring via Alternating Optimization [12.082424048578753]
We present a self-supervised multi-scale blind image deblurring method to jointly estimate the latent image and the blur kernel.
Thanks to the collaborative estimation across multiple scales, our method avoids the computationally intensive coarse-to-fine propagation and additional image deblurring processes.
arXiv Detail & Related papers (2024-09-02T07:08:17Z)
- Fine Dense Alignment of Image Bursts through Camera Pose and Depth Estimation [45.11207941777178]
This paper introduces a novel approach to the fine alignment of images in a burst captured by a handheld camera.
The proposed algorithm establishes dense correspondences by optimizing both the camera motion and surface depth and orientation at every pixel.
arXiv Detail & Related papers (2023-12-08T17:22:04Z)
- Parallax-Tolerant Unsupervised Deep Image Stitching [57.76737888499145]
We propose UDIS++, a parallax-tolerant unsupervised deep image stitching technique.
First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion.
To further eliminate the parallax artifacts, we propose to composite the stitched image seamlessly by unsupervised learning for seam-driven composition masks.
arXiv Detail & Related papers (2023-02-16T10:40:55Z)
- Unsupervised Light Field Depth Estimation via Multi-view Feature Matching with Occlusion Prediction [15.421219881815956]
It is costly to obtain sufficient depth labels for supervised training.
In this paper, we propose an unsupervised framework to estimate depth from LF images.
arXiv Detail & Related papers (2023-01-20T06:11:17Z)
- Robust Consistent Video Depth Estimation [65.53308117778361]
We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.
Our algorithm combines two complementary techniques: (1) flexible deformation-splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details.
In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations.
arXiv Detail & Related papers (2020-12-10T18:59:48Z)
- Depth image denoising using nuclear norm and learning graph model [107.51199787840066]
Group-based image restoration methods are more effective in gathering the similarity among patches.
For each patch, we find and group the most similar patches within a searching window.
The proposed method is superior to other current state-of-the-art denoising methods in both subjective and objective criteria.
arXiv Detail & Related papers (2020-08-09T15:12:16Z)
- Light Field Spatial Super-resolution via Deep Combinatorial Geometry Embedding and Structural Consistency Regularization [99.96632216070718]
Light field (LF) images acquired by hand-held devices usually suffer from low spatial resolution.
The high-dimensional spatiality characteristic and complex geometrical structure of LF images make the problem more challenging than traditional single-image SR.
We propose a novel learning-based LF framework, in which each view of an LF image is first individually super-resolved.
arXiv Detail & Related papers (2020-04-05T14:39:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.