SemanticStitch: Enhancing Image Coherence through Foreground-Aware Seam Carving
- URL: http://arxiv.org/abs/2511.12084v2
- Date: Fri, 21 Nov 2025 04:13:45 GMT
- Title: SemanticStitch: Enhancing Image Coherence through Foreground-Aware Seam Carving
- Authors: Ji-Ping Jin, Chen-Bin Feng, Rui Fan, Chi-Man Vong
- Abstract summary: Traditional seam carving methods neglect semantic information, causing disruptions in foreground continuity. We introduce SemanticStitch, a deep learning-based framework that incorporates semantic priors of foreground objects to preserve their integrity and enhance visual coherence. Our approach includes a novel loss function that emphasizes the semantic integrity of salient objects, significantly improving stitching quality.
- Score: 16.875629105210695
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Image stitching often faces challenges due to varying capture angles, positional differences, and object movements, leading to misalignments and visual discrepancies. Traditional seam carving methods neglect semantic information, causing disruptions in foreground continuity. We introduce SemanticStitch, a deep learning-based framework that incorporates semantic priors of foreground objects to preserve their integrity and enhance visual coherence. Our approach includes a novel loss function that emphasizes the semantic integrity of salient objects, significantly improving stitching quality. We also present two specialized real-world datasets to evaluate our method's effectiveness. Experimental results demonstrate substantial improvements over traditional techniques, providing robust support for practical applications.
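The abstract contrasts semantics-blind seam selection with a foreground-aware one. The sketch below illustrates that general idea only, and is not the authors' learned method: it uses a classic dynamic-programming seam search over an energy map and assumes a binary foreground mask is already available. The names `find_seam`, `fg_mask`, and `fg_penalty` are hypothetical and chosen for illustration.

```python
import numpy as np

def find_seam(energy, fg_mask=None, fg_penalty=1e3):
    """Return one column index per row for a minimum-cost vertical seam.

    energy:     (H, W) array, e.g. per-pixel difference in the overlap region.
    fg_mask:    optional (H, W) array, nonzero where a foreground object lies
                (assumed to come from some external segmentation step).
    fg_penalty: hypothetical extra cost added to every foreground pixel.
    """
    cost = energy.astype(np.float64)
    if fg_mask is not None:
        # Make cutting through foreground expensive so the seam goes around it.
        cost = cost + fg_penalty * (fg_mask != 0)
    h, w = cost.shape

    # Forward pass: accumulate the cheapest cost of reaching each pixel
    # from the row above via the left, center, or right neighbor.
    acc = cost.copy()
    for y in range(1, h):
        prev = acc[y - 1]
        left = np.concatenate(([np.inf], prev[:-1]))
        right = np.concatenate((prev[1:], [np.inf]))
        acc[y] += np.minimum(prev, np.minimum(left, right))

    # Backward pass: trace the cheapest path from the bottom row upward.
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(acc[-1]))
    for y in range(h - 2, -1, -1):
        x = int(seam[y + 1])
        lo, hi = max(x - 1, 0), min(x + 2, w)
        seam[y] = lo + int(np.argmin(acc[y, lo:hi]))
    return seam
```

Without a mask, the seam simply follows the lowest-energy path, even when that path cuts through an object. With a mask, the penalty steers the seam around foreground pixels whenever a path around them exists.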
Related papers
- Comparison Reveals Commonality: Customized Image Generation through Contrastive Inversion [22.481176245267328]
We propose Contrastive Inversion, a novel approach that identifies the common concept by comparing the input images without relying on additional information. We train the target token along with the image-wise auxiliary text tokens via contrastive learning, which extracts the well-disentangled true semantics of the target.
arXiv Detail & Related papers (2025-08-11T08:36:29Z)
- Style Transfer: From Stitching to Neural Networks [5.539031975261105]
This article compares two style transfer methods in image processing.
The traditional method synthesizes new images by stitching together small patches from existing images, whereas the modern machine learning-based approach uses a segmentation network to isolate foreground objects and applies style transfer solely to the background.
arXiv Detail & Related papers (2024-09-01T04:07:03Z)
- TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization [59.412236435627094]
TALE is a training-free framework harnessing the generative capabilities of text-to-image diffusion models.
We equip TALE with two mechanisms dubbed Adaptive Latent Manipulation and Energy-guided Latent Optimization.
Our experiments demonstrate that TALE surpasses prior baselines and attains state-of-the-art performance in image-guided composition.
arXiv Detail & Related papers (2024-08-07T08:52:21Z)
- DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z)
- Take a Prior from Other Tasks for Severe Blur Removal [52.380201909782684]
The method uses a cross-level feature learning strategy based on knowledge distillation to learn the priors.
A semantic prior embedding layer with multi-level aggregation and semantic attention transformation integrates the priors effectively.
Experiments on natural image deblurring benchmarks and real-world images, such as GoPro and RealBlur datasets, demonstrate our method's effectiveness and ability.
arXiv Detail & Related papers (2023-02-14T08:30:51Z)
- Image Inpainting Guided by Coherence Priors of Semantics and Textures [62.92586889409379]
We introduce coherence priors between the semantics and textures which make it possible to concentrate on completing separate textures in a semantic-wise manner.
We also propose two coherence losses to constrain the consistency between the semantics and the inpainted image in terms of the overall structure and detailed textures.
arXiv Detail & Related papers (2020-12-15T02:59:37Z)
- Learning Edge-Preserved Image Stitching from Large-Baseline Deep Homography [32.28310831466225]
We propose an image stitching learning framework, which consists of a large-baseline deep homography module and an edge-preserved deformation module.
Our method is superior to the existing learning method and shows competitive performance with state-of-the-art traditional methods.
arXiv Detail & Related papers (2020-12-11T08:43:30Z)
- Rethinking of the Image Salient Object Detection: Object-level Semantic Saliency Re-ranking First, Pixel-wise Saliency Refinement Latter [62.26677215668959]
We propose a lightweight, weakly supervised deep network to coarsely locate semantically salient regions.
We then fuse multiple off-the-shelf deep models on these semantically salient regions as the pixel-wise saliency refinement.
Our method is simple yet effective, and it is the first attempt to treat salient object detection mainly as an object-level semantic re-ranking problem.
arXiv Detail & Related papers (2020-08-10T07:12:43Z)
- Unsupervised Learning of Landmarks based on Inter-Intra Subject Consistencies [72.67344725725961]
We present a novel unsupervised learning approach to image landmark discovery by incorporating the inter-subject landmark consistencies on facial images.
This is achieved via an inter-subject mapping module that transforms original subject landmarks based on an auxiliary subject-related structure.
To recover the original subject from the transformed images, the landmark detector is forced to learn spatial locations that carry consistent semantic meaning both across the paired intra-subject images and between the paired inter-subject images.
arXiv Detail & Related papers (2020-04-16T20:38:16Z)