Related papers: TwinDiffusion: Enhancing Coherence and Efficiency in Panoramic Image Generation with Diffusion Models

URL: http://arxiv.org/abs/2404.19475v4
Date: Sat, 6 Jul 2024 11:16:46 GMT
Title: TwinDiffusion: Enhancing Coherence and Efficiency in Panoramic Image Generation with Diffusion Models
Authors: Teng Zhou, Yongchuan Tang
Abstract summary: Diffusion models have emerged as effective tools for generating diverse and high-quality content. However, high-resolution panoramic image generation still faces challenges such as visible seams and incoherent transitions. We propose TwinDiffusion, an optimized framework designed to address these challenges.
Score: 3.167554518801207
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion models have emerged as effective tools for generating diverse and high-quality content. However, their capability in high-resolution image generation, particularly for panoramic images, still faces challenges such as visible seams and incoherent transitions. In this paper, we propose TwinDiffusion, an optimized framework designed to address these challenges through two key innovations: the Crop Fusion for quality enhancement and the Cross Sampling for efficiency optimization. We introduce a training-free optimizing stage to refine the similarity of adjacent image areas, as well as an interleaving sampling strategy to yield dynamic patches during the cropping process. A comprehensive evaluation is conducted to compare TwinDiffusion with the prior works, considering factors including coherence, fidelity, compatibility, and efficiency. The results demonstrate the superior performance of our approach in generating seamless and coherent panoramas, setting a new standard in quality and efficiency for panoramic image generation.
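
Illustrative sketch (not the paper's algorithm): the abstract mentions an interleaving sampling strategy that yields dynamic patches during cropping, but gives no further detail here. The Python below only illustrates the general idea of alternating crop-window offsets between denoising steps so that window borders do not land on the same columns every time. The function names, the half-stride shift, and the border handling are all assumptions made for this example.

```python
def crop_starts(width, window, stride, offset=0):
    """Left edges of fixed-size windows covering [0, width), shifted by `offset`."""
    if window > width:
        raise ValueError("window must not exceed width")
    if stride > window:
        raise ValueError("stride must not exceed window, or gaps appear")
    last = width - window
    starts = set(range(offset % stride, last + 1, stride))
    # Always include both borders so every column stays covered.
    starts.update((0, last))
    return sorted(starts)


def interleaved_schedule(width, window, stride, steps):
    """Assumed scheme: even steps use the plain grid, odd steps shift it by half a stride."""
    return [
        crop_starts(width, window, stride, offset=(stride // 2) * (step % 2))
        for step in range(steps)
    ]


def coverage(width, window, starts):
    """Count how many windows cover each column."""
    counts = [0] * width
    for s in starts:
        for x in range(s, s + window):
            counts[x] += 1
    return counts


if __name__ == "__main__":
    for step, starts in enumerate(interleaved_schedule(16, 8, 4, steps=4)):
        print(step, starts, min(coverage(16, 8, starts)))
```

In a real pipeline each window would be denoised separately and the overlapping regions merged; how TwinDiffusion performs that merge (its Crop Fusion stage) is not specified in this listing, so it is left out of the sketch.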

Related papers

FusionFM: All-in-One Multi-Modal Image Fusion with Flow Matching [42.22268167379098]
We formulate image fusion as a direct probabilistic transport from source modalities to the fused image distribution. We employ a task-aware selection function to select the most reliable pseudo-labels for each task. For multi-task scenarios, we integrate elastic weight consolidation and experience replay mechanisms to preserve cross-task performance.
arXiv Detail & Related papers (2025-11-17T02:56:48Z)
VMDiff: Visual Mixing Diffusion for Limitless Cross-Object Synthesis [23.50866105623598]
We propose a diffusion-based framework that synthesizes a single, coherent object by integrating two input images at both noise and latent levels. Our method outperforms strong baselines in visual quality, semantic consistency, and human-rated creativity.
arXiv Detail & Related papers (2025-09-28T03:17:58Z)
Two-Stage Random Alternation Framework for One-Shot Pansharpening [12.385955231193675]
We introduce a two-stage random alternating framework (TRA-PAN) that performs instance-specific optimization for any given Multispectral (MS)/Panchromatic (PAN) pair. TRA-PAN effectively integrates strong supervision constraints from reduced-resolution images with the physical characteristics of the full-resolution images. Experimental results demonstrate that TRA-PAN outperforms state-of-the-art (SOTA) methods in quantitative metrics and visual quality in real-world scenarios.
arXiv Detail & Related papers (2025-05-10T09:26:22Z)
Efficient Training-Free High-Resolution Synthesis with Energy Rectification in Diffusion Models [29.69501919628436]
Diffusion models have achieved remarkable progress across various visual generation tasks. However, their performance significantly declines when generating content at resolutions higher than those used during training. We propose RectifiedHR, a solution for training-free high-resolution synthesis.
arXiv Detail & Related papers (2025-03-04T12:03:26Z)
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step [77.86514804787622]
Chain-of-Thought (CoT) reasoning has been extensively explored in large models to tackle complex understanding tasks. We provide the first comprehensive investigation of the potential of CoT reasoning to enhance autoregressive image generation. We propose the Potential Assessment Reward Model (PARM) and PARM++, specialized for autoregressive image generation.
arXiv Detail & Related papers (2025-01-23T18:59:43Z)
SpotActor: Training-Free Layout-Controlled Consistent Image Generation [43.2870588035256]
We present a new formalization of dual energy guidance with optimization in a dual semantic-latent space. We propose a training-free pipeline, SpotActor, which features a layout-conditioned backward update stage and a consistent forward sampling stage. The results prove that SpotActor fulfills the expectations of this task and showcases the potential for practical applications.
arXiv Detail & Related papers (2024-09-07T11:52:48Z)
ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps. We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
SpotDiffusion: A Fast Approach For Seamless Panorama Generation Over Time [7.532695984765271]
We present a novel approach to generate high-resolution images with generative models. Our method shifts non-overlapping denoising windows over time, ensuring that seams in one timestep are corrected in the next. Our method offers several key benefits, including improved computational efficiency and faster inference times.
arXiv Detail & Related papers (2024-07-22T09:44:35Z)
Coherent and Multi-modality Image Inpainting via Latent Space Optimization [61.99406669027195]
PILOT (inPainting vIa Latent OpTimization) is an optimization approach grounded on a novel semantic centralization and background preservation loss. Our method searches latent spaces capable of generating inpainted regions that exhibit high fidelity to user-provided prompts while maintaining coherence with the background.
arXiv Detail & Related papers (2024-07-10T19:58:04Z)
HySim: An Efficient Hybrid Similarity Measure for Patch Matching in Image Inpainting [0.0]
Inpainting, for filling missing image regions, is a crucial task in various applications, such as medical imaging and remote sensing. This paper proposes an improved model-driven approach relying on patch-based techniques. Our approach deviates from the standard Sum of Squared Differences (SSD) similarity measure by introducing a Hybrid Similarity (HySim).
arXiv Detail & Related papers (2024-03-21T10:59:44Z)
PASTA: Towards Flexible and Efficient HDR Imaging Via Progressively Aggregated Spatio-Temporal Alignment [91.38256332633544]
PASTA is a Progressively Aggregated Spatio-Temporal Alignment framework for HDR deghosting. Our approach achieves effectiveness and efficiency by harnessing hierarchical representation during feature disentanglement. Experimental results showcase PASTA's superiority over current SOTA methods in both visual quality and performance metrics.
arXiv Detail & Related papers (2024-03-15T15:05:29Z)
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers [41.78970081787674]
We propose a more efficient two-stage framework for high-resolution image generation. We employ a local attention-based quantization model instead of the global attention mechanism used in previous methods. This approach results in faster generation speed, higher generation fidelity, and improved resolution.
arXiv Detail & Related papers (2023-10-09T04:38:52Z)
Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion [60.221404321514086]
Multi-exposure image fusion (MEF) has emerged as a prominent solution to address the limitations of digital imaging in representing varied exposure levels. This paper presents a Hybrid-Supervised Dual-Search approach for MEF, dubbed HSDS-MEF, which introduces a bi-level optimization search scheme for automatic design of both network structures and loss functions.
arXiv Detail & Related papers (2023-09-03T08:07:26Z)
Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment for Markup-to-Image Generation [15.411325887412413]
This paper proposes a novel model named "Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment" (FSA-CDM). FSA-CDM introduces contrastive positive/negative samples into the diffusion model to boost performance for markup-to-image generation. Experiments are conducted on four benchmark datasets from different domains.
arXiv Detail & Related papers (2023-08-02T13:43:03Z)
Real-World Image Variation by Aligning Diffusion Inversion Chain [53.772004619296794]
A domain gap exists between generated images and real-world images, which poses a challenge in generating high-quality variations of real-world images. We propose a novel inference pipeline called Real-world Image Variation by ALignment (RIVAL). Our pipeline enhances the generation quality of image variations by aligning the image generation process to the source image's inversion chain.
arXiv Detail & Related papers (2023-05-30T04:09:47Z)
IRGen: Generative Modeling for Image Retrieval [82.62022344988993]
In this paper, we present a novel methodology, reframing image retrieval as a variant of generative modeling. We develop our model, dubbed IRGen, to address the technical challenge of converting an image into a concise sequence of semantic units. Our model achieves state-of-the-art performance on three widely-used image retrieval benchmarks and two million-scale datasets.
arXiv Detail & Related papers (2023-03-17T17:07:36Z)
Auto-regressive Image Synthesis with Integrated Quantization [55.51231796778219]
This paper presents a versatile framework for conditional image generation. It incorporates the inductive bias of CNNs and powerful sequence modeling of auto-regression. Our method achieves superior diverse image generation performance as compared with the state-of-the-art.
arXiv Detail & Related papers (2022-07-21T22:19:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.