Seamlessly Natural: Image Stitching with Natural Appearance Preservation
- URL: http://arxiv.org/abs/2601.01257v1
- Date: Sat, 03 Jan 2026 18:40:35 GMT
- Title: Seamlessly Natural: Image Stitching with Natural Appearance Preservation
- Authors: Gaetane Lorna N. Tchana, Damaris Belle M. Fotso, Antonio Hendricks, Christophe Bobda
- Abstract summary: SENA prioritizes structural fidelity in challenging real-world scenes characterized by parallax and depth variation. SENA addresses the fundamental limitations of homography-based alignment through three key contributions. Experiments conducted on challenging datasets demonstrate that SENA achieves alignment accuracy comparable to leading homography-based methods.
- Score: 0.6089774484591287
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces SENA (SEamlessly NAtural), a geometry-driven image stitching approach that prioritizes structural fidelity in challenging real-world scenes characterized by parallax and depth variation. Conventional image stitching relies on homographic alignment, but this rigid planar assumption often fails in dual-camera setups with significant scene depth, leading to distortions such as visible warps and spherical bulging. SENA addresses these fundamental limitations through three key contributions. First, we propose a hierarchical affine-based warping strategy, combining global affine initialization with local affine refinement and smooth free-form deformation. This design preserves local shape, parallelism, and aspect ratios, thereby avoiding the hallucinated structural distortions commonly introduced by homography-based models. Second, we introduce a geometry-driven adequate zone detection mechanism that identifies parallax-minimized regions directly from the disparity consistency of RANSAC-filtered feature correspondences, without relying on semantic segmentation. Third, building upon this adequate zone, we perform anchor-based seamline cutting and segmentation, enforcing a one-to-one geometric correspondence across image pairs by construction, which effectively eliminates ghosting, duplication, and smearing artifacts in the final panorama. Extensive experiments conducted on challenging datasets demonstrate that SENA achieves alignment accuracy comparable to leading homography-based methods, while significantly outperforming them in critical visual metrics such as shape preservation, texture integrity, and overall visual realism.
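The second contribution, detecting a parallax-minimized "adequate zone" from the disparity consistency of RANSAC-filtered correspondences, can be illustrated with a small numpy sketch. This is illustrative only, not SENA's actual implementation: it fits a global 2x3 affine with a toy RANSAC loop and flags correspondences whose residual disparity stays under a hypothetical `tol` threshold, standing in for the paper's disparity-consistency test.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine mapping src -> dst (both (N, 2) arrays)."""
    X = np.hstack([src, np.ones((len(src), 1))])
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return A.T

def ransac_affine(src, dst, iters=200, tol=1.0, seed=0):
    """Toy RANSAC: fit affines to random 3-point samples and keep the one
    with the most correspondences whose residual disparity is below tol px.
    The surviving inlier mask is a candidate set for a parallax-minimized
    ("adequate") zone."""
    rng = np.random.default_rng(seed)
    best_A, best_mask = None, None
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)
        A = fit_affine(src[idx], dst[idx])
        resid = np.linalg.norm(src @ A[:, :2].T + A[:, 2] - dst, axis=1)
        mask = resid < tol
        if best_mask is None or mask.sum() > best_mask.sum():
            best_A, best_mask = A, mask
    return best_A, best_mask
```

On synthetic matches generated by a known affine, correspondences given an extra 25 px offset (simulating strong parallax) fall outside the returned mask, mimicking how high-parallax regions would be excluded from the adequate zone.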
Related papers
- LiftProj: Space Lifting and Projection-Based Panorama Stitching [11.757651376730509]
This study introduces a spatially lifted panoramic stitching framework. A unified projection center is established in three-dimensional space, and an equidistant cylindrical projection is employed to map the fused data onto a single panoramic manifold. Hole filling is conducted within the canvas domain to address unknown regions revealed by viewpoint transitions.
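The equidistant cylindrical mapping mentioned above follows the standard textbook form; the sketch below uses an assumed parameterization (center-relative pixel coordinates, focal length in pixels), which is not necessarily LiftProj's exact convention:

```python
import math

def to_cylinder(x, y, f):
    """Map an image-plane point (x, y), measured from the principal point,
    onto an equidistant cylindrical canvas with focal length f (pixels)."""
    theta = math.atan2(x, f)      # azimuth of the viewing ray
    h = y / math.hypot(x, f)      # normalized height on the cylinder
    return f * theta, f * h       # canvas coordinates
```

The principal point maps to the canvas origin, and a point at x = f (a 45-degree ray) lands at horizontal canvas coordinate f * pi / 4.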
arXiv Detail & Related papers (2025-12-30T15:03:38Z)
- Coarse-to-Fine Non-Rigid Registration for Side-Scan Sonar Mosaicking [0.1631115063641726]
Side-scan sonar mosaicking plays a crucial role in large-scale seabed mapping. Existing rigid or affine registration methods fail to model complex deformations. We propose a coarse-to-fine hierarchical non-rigid registration framework tailored for large-scale side-scan sonar images.
arXiv Detail & Related papers (2025-11-19T12:44:31Z)
- G4Splat: Geometry-Guided Gaussian Splatting with Generative Prior [53.762256749551284]
We identify accurate geometry as the fundamental prerequisite for effectively exploiting generative models to enhance 3D scene reconstruction. We incorporate this geometry guidance throughout the generative pipeline to improve visibility mask estimation, guide novel view selection, and enhance multi-view consistency when inpainting with video diffusion models. Our method naturally supports single-view inputs and unposed videos, with strong generalizability in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2025-10-14T03:06:28Z)
- Dense Semantic Matching with VGGT Prior [49.42199006453071]
We propose an approach that retains VGGT's intrinsic strengths by reusing early feature stages, fine-tuning later ones, and adding a semantic head for bidirectional correspondences. Our approach achieves superior geometry awareness, matching reliability, and manifold preservation, outperforming previous baselines.
arXiv Detail & Related papers (2025-09-25T14:56:11Z)
- Gaussian Alignment for Relative Camera Pose Estimation via Single-View Reconstruction [18.936573991468926]
GARPS is a training-free framework that casts relative camera pose estimation as the direct alignment of two independently reconstructed 3D scenes. It refines an initial pose from a feed-forward two-view pose estimator by optimising a differentiable GMM alignment objective. Experiments on the Real-Estate10K dataset demonstrate that GARPS outperforms both classical and state-of-the-art learning-based methods.
arXiv Detail & Related papers (2025-09-17T02:57:34Z)
- DVP-MVS++: Synergize Depth-Normal-Edge and Harmonized Visibility Prior for Multi-View Stereo [7.544716770845737]
We propose DVP-MVS++, an innovative approach that synergizes both depth-normal-edge aligned and harmonized cross-view priors for robust and visibility-aware patch deformation. Evaluation results on the ETH3D, Tanks & Temples, and Strecha datasets exhibit the state-of-the-art performance and robust generalization capability of our proposed method.
arXiv Detail & Related papers (2025-06-16T08:15:22Z)
- Geometry-Editable and Appearance-Preserving Object Composition [67.98806888489385]
General object composition (GOC) aims to seamlessly integrate a target object into a background scene with desired geometric properties. Recent approaches derive semantic embeddings and integrate them into advanced diffusion models to enable geometry-editable generation. We introduce a Disentangled Geometry-editable and Appearance-preserving Diffusion model that first leverages semantic embeddings to implicitly capture desired geometric transformations.
arXiv Detail & Related papers (2025-05-27T09:05:28Z)
- AlignDiff: Learning Physically-Grounded Camera Alignment via Diffusion [0.5277756703318045]
We introduce a novel framework that addresses camera intrinsic and extrinsic parameters using a generic ray camera model. Unlike previous approaches, AlignDiff shifts focus from semantic to geometric features, enabling more accurate modeling of local distortions. Our experiments demonstrate that the proposed method significantly reduces the angular error of estimated ray bundles by 8.2 degrees and improves overall calibration accuracy, outperforming existing approaches on challenging, real-world datasets.
arXiv Detail & Related papers (2025-03-27T14:59:59Z)
- ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction [50.07671826433922]
It is non-trivial to simultaneously recover meticulous geometry and preserve smoothness across regions with differing characteristics. We propose ND-SDF, which learns a Normal Deflection field to represent the angular deviation between the scene normal and the prior normal. Our method not only obtains smooth weakly textured regions such as walls and floors but also preserves the geometric details of complex structures.
arXiv Detail & Related papers (2024-08-22T17:59:01Z)
- Parallax-Tolerant Unsupervised Deep Image Stitching [57.76737888499145]
We propose UDIS++, a parallax-tolerant unsupervised deep image stitching technique.
First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion.
To further eliminate the parallax artifacts, we propose to composite the stitched image seamlessly by unsupervised learning for seam-driven composition masks.
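Seam-driven composition of the kind UDIS++ learns can be contrasted with a much simpler classical baseline: a dynamic-programming seam through a per-pixel error map of the overlap region. The sketch below is that hand-rolled classical baseline, not the paper's unsupervised mask network:

```python
import numpy as np

def min_cost_seam(err):
    """Dynamic-programming vertical seam through an (H, W) error map:
    one column index per row, moving at most one column between rows.
    Compositing each output row from the left image up to the seam and
    the right image after it yields a low-visibility cut."""
    H, W = err.shape
    cost = err.astype(float).copy()
    back = np.zeros((H, W), dtype=int)
    for r in range(1, H):
        for c in range(W):
            lo, hi = max(c - 1, 0), min(c + 2, W)
            k = lo + int(np.argmin(cost[r - 1, lo:hi]))
            back[r, c] = k                 # best predecessor column
            cost[r, c] += cost[r - 1, k]   # accumulate path cost
    seam = [int(np.argmin(cost[-1]))]      # cheapest endpoint, then backtrack
    for r in range(H - 1, 0, -1):
        seam.append(int(back[r, seam[-1]]))
    return seam[::-1]
```

On an error map with a single zero-cost column, the seam locks onto that column, just as a seam through low-difference overlap pixels avoids cutting through misaligned content.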
arXiv Detail & Related papers (2023-02-16T10:40:55Z)
- Light Field Spatial Super-resolution via Deep Combinatorial Geometry Embedding and Structural Consistency Regularization [99.96632216070718]
Light field (LF) images acquired by hand-held devices usually suffer from low spatial resolution.
The high-dimensional spatial characteristics and complex geometrical structure of LF images make the problem more challenging than traditional single-image SR.
We propose a novel learning-based LF framework, in which each view of an LF image is first individually super-resolved.
arXiv Detail & Related papers (2020-04-05T14:39:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.