AccDiffusion: An Accurate Method for Higher-Resolution Image Generation
- URL: http://arxiv.org/abs/2407.10738v2
- Date: Thu, 18 Jul 2024 04:19:53 GMT
- Title: AccDiffusion: An Accurate Method for Higher-Resolution Image Generation
- Authors: Zhihang Lin, Mingbao Lin, Meng Zhao, Rongrong Ji
- Abstract summary: We propose AccDiffusion, an accurate method for patch-wise higher-resolution image generation without training.
An in-depth analysis in this paper reveals that using an identical text prompt for different patches causes repeated object generation.
Our AccDiffusion, for the first time, proposes to decouple the vanilla image-content-aware prompt into a set of patch-content-aware prompts.
- Score: 63.53163540340026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper attempts to address the object repetition issue in patch-wise higher-resolution image generation. We propose AccDiffusion, an accurate method for patch-wise higher-resolution image generation without training. An in-depth analysis in this paper reveals that using an identical text prompt for different patches causes repeated object generation, while using no prompt at all compromises image details. Therefore, our AccDiffusion, for the first time, proposes to decouple the vanilla image-content-aware prompt into a set of patch-content-aware prompts, each of which serves as a more precise description of an image patch. In addition, AccDiffusion introduces dilated sampling with window interaction for better global consistency in higher-resolution image generation. Experimental comparison with existing methods demonstrates that AccDiffusion effectively addresses the issue of repeated object generation and leads to better performance in higher-resolution image generation.
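The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the per-word binary masks, the coverage threshold, and the function names `patch_prompts` and `dilated_samples` are assumptions made for illustration, and the window interaction step is not modeled. The first function keeps only the prompt words that are actually present in a patch; the second splits a grid into interleaved, strided sub-grids that each span the whole image.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]


def patch_prompts(word_masks: Dict[str, Grid],
                  patch: Tuple[int, int, int, int],
                  threshold: float = 0.1) -> List[str]:
    """Return the words whose (hypothetical) binary mask covers more than
    `threshold` of the patch given as (top, left, height, width)."""
    top, left, height, width = patch
    kept = []
    for word, mask in word_masks.items():
        covered = sum(mask[r][c]
                      for r in range(top, top + height)
                      for c in range(left, left + width))
        if covered / (height * width) > threshold:
            kept.append(word)
    return kept


def dilated_samples(grid: Grid, stride: int) -> List[Grid]:
    """Split a grid into stride * stride interleaved sub-grids.
    Each sub-grid takes every `stride`-th row and column, so it
    covers the full image extent at lower density."""
    return [[row[j::stride] for row in grid[i::stride]]
            for i in range(stride) for j in range(stride)]


# Toy 4x4 layout: "cat" occupies the top-left quarter, "sky" the top half.
masks = {
    "cat": [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    "sky": [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
}
left_prompt = patch_prompts(masks, (0, 0, 2, 2))   # top-left patch
right_prompt = patch_prompts(masks, (0, 2, 2, 2))  # top-right patch

grid = [[r * 4 + c for c in range(4)] for r in range(4)]
samples = dilated_samples(grid, 2)
```

In this toy layout the top-right patch no longer receives the word "cat", which is the intuition behind avoiding repeated objects, while each dilated sample draws positions from across the whole grid rather than from one local window.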
Related papers
- Prompt Recovery for Image Generation Models: A Comparative Study of Discrete Optimizers [58.50071292008407]
We present the first head-to-head comparison of recent discrete optimization techniques for the problem of prompt inversion.
We find that focusing on the CLIP similarity between the inverted prompts and the ground truth image acts as a poor proxy for the similarity between ground truth image and the image generated by the inverted prompts.
arXiv Detail & Related papers (2024-08-12T21:35:59Z)
- HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models [13.68666823175341]
HiDiffusion is a tuning-free higher-resolution framework for image synthesis.
RAU-Net dynamically adjusts the feature map size to resolve object duplication.
MSW-MSA engages optimized window attention to reduce computations.
arXiv Detail & Related papers (2023-11-29T11:01:38Z)
- Recognition-Guided Diffusion Model for Scene Text Image Super-Resolution [15.391125077873745]
Scene Text Image Super-Resolution (STISR) aims to enhance the resolution and legibility of text within low-resolution (LR) images.
Previous methods predominantly employ discriminative Convolutional Neural Networks (CNNs) augmented with diverse forms of text guidance.
We introduce RGDiffSR, a Recognition-Guided Diffusion model for scene text image Super-Resolution, which exhibits great generative diversity and fidelity even in challenging scenarios.
arXiv Detail & Related papers (2023-11-22T11:10:45Z)
- Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images [56.17404812357676]
Stable diffusion, a generative model used in text-to-image synthesis, frequently encounters composition problems when generating images of varying sizes.
We propose a two-stage pipeline named Any-Size-Diffusion (ASD), designed to efficiently generate well-composed images of any size.
We show that ASD can produce well-structured images of arbitrary sizes, cutting down the inference time by 2x compared to the traditional tiled algorithm.
arXiv Detail & Related papers (2023-08-31T09:27:56Z)
- Hierarchical Similarity Learning for Aliasing Suppression Image Super-Resolution [64.15915577164894]
A hierarchical image super-resolution network (HSRNet) is proposed to suppress the influence of aliasing.
HSRNet achieves better quantitative and visual performance than other works, and mitigates aliasing more effectively.
arXiv Detail & Related papers (2022-06-07T14:55:32Z)
- Attention-based Multi-Reference Learning for Image Super-Resolution [29.361342747786164]
This paper proposes a novel Attention-based Multi-Reference Super-resolution network.
It learns to adaptively transfer the most similar texture from multiple reference images to the super-resolution output.
It achieves significantly improved performance over state-of-the-art reference super-resolution approaches.
arXiv Detail & Related papers (2021-08-31T09:12:26Z)
- Low Resolution Information Also Matters: Learning Multi-Resolution Representations for Person Re-Identification [37.01666917620271]
Cross-resolution person re-ID aims to match person images captured from non-overlapped cameras.
The method is called Multi-Resolution Representations Joint Learning (MRJL).
Our method consists of a Resolution Reconstruction Network (RRN) and a Dual Feature Fusion Network (DFFN).
arXiv Detail & Related papers (2021-05-26T16:54:56Z)
- High-Resolution Image Inpainting with Iterative Confidence Feedback and Guided Upsampling [122.06593036862611]
Existing image inpainting methods often produce artifacts when dealing with large holes in real applications.
We propose an iterative inpainting method with a feedback mechanism.
Experiments show that our method significantly outperforms existing methods in both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2020-05-24T13:23:45Z)
- Unsupervised Real Image Super-Resolution via Generative Variational AutoEncoder [47.53609520395504]
We revisit the classic example-based image super-resolution approaches and come up with a novel generative model for perceptual image super-resolution.
We propose a joint image denoising and super-resolution model via Variational AutoEncoder.
With the aid of the discriminator, an additional super-resolution subnetwork is attached to super-resolve the denoised image with photo-realistic visual quality.
arXiv Detail & Related papers (2020-04-27T13:49:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.