APT: Improving Diffusion Models for High Resolution Image Generation with Adaptive Path Tracing
- URL: http://arxiv.org/abs/2507.21690v1
- Date: Tue, 29 Jul 2025 11:13:03 GMT
- Title: APT: Improving Diffusion Models for High Resolution Image Generation with Adaptive Path Tracing
- Authors: Sangmin Han, Jinho Jeong, Jinwoo Kim, Seon Joo Kim
- Abstract summary: Latent Diffusion Models (LDMs) are generally trained at fixed resolutions, limiting their capability when scaling to high-resolution images. We propose Adaptive Path Tracing (APT), a framework that combines Statistical Matching, which keeps patch distributions consistent in upsampled latents, with Scale-aware Scheduling. As a result, APT produces clearer and more refined details in high-resolution images.
- Score: 24.33819371470651
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Latent Diffusion Models (LDMs) are generally trained at fixed resolutions, limiting their capability when scaling up to high-resolution images. While training-based approaches address this limitation by training on high-resolution datasets, they require large amounts of data and considerable computational resources, making them less practical. Consequently, training-free methods, particularly patch-based approaches, have become a popular alternative. These methods divide an image into patches and fuse the denoising paths of each patch, showing strong performance on high-resolution generation. However, we observe two critical issues for patch-based approaches, which we call "patch-level distribution shift" and "increased patch monotonicity." To address these issues, we propose Adaptive Path Tracing (APT), a framework that combines Statistical Matching to ensure patch distributions remain consistent in upsampled latents and Scale-aware Scheduling to deal with the patch monotonicity. As a result, APT produces clearer and more refined details in high-resolution images. In addition, APT enables a shortcut denoising process, resulting in faster sampling with minimal quality degradation. Our experimental results confirm that APT produces more detailed outputs with improved inference speed, providing a practical approach to high-resolution image generation.
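The abstract does not spell out how Statistical Matching is computed. As a rough, hypothetical illustration only (the patch size, channel-wise moments, and function name below are assumptions, not the paper's formulation), per-patch moment matching against the pre-upsampling latent could look like this:

```python
# Minimal sketch of per-patch statistical matching (hypothetical): renormalize
# each patch of the upsampled latent so its channel-wise mean/std follow the
# reference (pre-upsampling) latent, countering a patch-level distribution shift.
import torch

def match_patch_statistics(up_latent: torch.Tensor,   # (C, H, W) upsampled latent
                           ref_latent: torch.Tensor,  # (C, h, w) reference latent
                           patch: int = 64,
                           eps: float = 1e-6) -> torch.Tensor:
    # Channel-wise statistics of the reference latent.
    ref_mean = ref_latent.mean(dim=(1, 2), keepdim=True)
    ref_std = ref_latent.std(dim=(1, 2), keepdim=True) + eps

    out = up_latent.clone()
    _, H, W = up_latent.shape
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            p = up_latent[:, y:y + patch, x:x + patch]
            p_mean = p.mean(dim=(1, 2), keepdim=True)
            p_std = p.std(dim=(1, 2), keepdim=True) + eps
            # Shift and rescale the patch toward the reference distribution.
            out[:, y:y + patch, x:x + patch] = (p - p_mean) / p_std * ref_std + ref_mean
    return out
```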
Related papers
- Diffusion Models for Solving Inverse Problems via Posterior Sampling with Piecewise Guidance [52.705112811734566]
A novel diffusion-based framework is introduced for solving inverse problems using a piecewise guidance scheme. The proposed method is problem-agnostic and readily adaptable to a variety of inverse problems. The framework achieves a reduction in inference time of 25% for inpainting with both random and center masks, and 23% and 24% for 4× and 8× super-resolution tasks.
arXiv Detail & Related papers (2025-07-22T19:35:14Z) - Progressive Alignment Degradation Learning for Pansharpening [3.7939736380306552]
Deep learning-based pansharpening has been shown to effectively generate high-resolution multispectral (HRMS) images. The Wald protocol assumes that networks trained on artificial low-resolution data will perform equally well on high-resolution data. We propose PADM, which uses mutual iteration between two sub-networks, PAlignNet and PDegradeNet, to adaptively learn accurate degradation processes.
arXiv Detail & Related papers (2025-06-25T07:07:32Z) - Multi-Step Guided Diffusion for Image Restoration on Edge Devices: Toward Lightweight Perception in Embodied AI [0.0]
We introduce a multistep optimization strategy within each denoising timestep, significantly enhancing image quality, perceptual accuracy, and generalization. Our experiments on super-resolution and Gaussian deblurring demonstrate that increasing the number of gradient updates per step improves LPIPS and PSNR with minimal latency overhead. Our findings highlight MPGD's potential as a lightweight, plug-and-play restoration module for real-time visual perception in embodied AI agents such as drones and mobile robots.
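As a loose illustration of the multistep idea (not the paper's actual code; the denoiser, degradation operator, step size, and schedule value below are placeholders), several data-consistency gradient updates can be applied to the noisy latent within a single denoising timestep:

```python
# Hypothetical sketch: repeat a guidance gradient update a few times per timestep.
import torch

def guided_step(x_t, t, y, denoiser, degrade, alpha_bar_t, n_updates=3, step_size=0.1):
    for _ in range(n_updates):
        x_t = x_t.detach().requires_grad_(True)
        eps = denoiser(x_t, t)                                    # predicted noise
        # Tweedie-style estimate of the clean image from the noisy latent.
        x0_hat = (x_t - (1 - alpha_bar_t) ** 0.5 * eps) / alpha_bar_t ** 0.5
        loss = ((degrade(x0_hat) - y) ** 2).mean()                # data-consistency loss
        grad, = torch.autograd.grad(loss, x_t)
        x_t = x_t - step_size * grad                              # one guidance update
    return x_t.detach()
```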
arXiv Detail & Related papers (2025-06-08T21:11:25Z) - Two-Stage Random Alternation Framework for One-Shot Pansharpening [12.385955231193675]
We introduce a two-stage random alternation framework (TRA-PAN) that performs instance-specific optimization for any given Multispectral (MS)/Panchromatic (PAN) pair. TRA-PAN effectively integrates strong supervision constraints from reduced-resolution images with the physical characteristics of the full-resolution images. Experimental results demonstrate that TRA-PAN outperforms state-of-the-art (SOTA) methods in quantitative metrics and visual quality in real-world scenarios.
arXiv Detail & Related papers (2025-05-10T09:26:22Z) - One Look is Enough: A Novel Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation Models on High-Resolution Images [25.48185527420231]
We propose Patch Refine Once (PRO), an efficient and generalizable tile-based framework. PRO consists of two key components: (i) Grouped Patch Consistency Training, which enhances test-time efficiency while mitigating the depth discontinuity problem, and (ii) Bias Free Masking, which prevents depth estimation (DE) models from overfitting to dataset-specific biases, enabling better generalization to real-world datasets even after training on synthetic data.
arXiv Detail & Related papers (2025-03-28T11:46:50Z) - One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step. To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration. Our method achieves strong performance on both full-reference and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z) - FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion [63.609399000712905]
Inference at a scaled resolution leads to repetitive patterns and structural distortions. We propose two simple modules that combine to solve these issues. Our method, coined FAM Diffusion, can seamlessly integrate into any latent diffusion model and requires no additional training.
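The summary only names the two modules; as a loose, hypothetical sketch of what a frequency-modulation step might resemble (a plain low-pass blend between the current latent and an upsampled reference, not the paper's actual module), one could swap the low-frequency band in the Fourier domain:

```python
# Hypothetical low-frequency blend (illustrative only): keep the low-frequency
# band of a reference latent and the high-frequency band of the current latent.
import torch

def blend_low_frequencies(x: torch.Tensor, ref: torch.Tensor, radius: float = 0.25) -> torch.Tensor:
    # x, ref: (C, H, W), same shape.
    X = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    R = torch.fft.fftshift(torch.fft.fft2(ref), dim=(-2, -1))
    _, H, W = x.shape
    yy = torch.arange(H).view(-1, 1) - H // 2
    xx = torch.arange(W).view(1, -1) - W // 2
    # Square low-pass mask around the (shifted) zero frequency.
    mask = ((yy.abs() < radius * H / 2) & (xx.abs() < radius * W / 2)).to(x.dtype)
    blended = R * mask + X * (1 - mask)
    return torch.fft.ifft2(torch.fft.ifftshift(blended, dim=(-2, -1))).real
```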
arXiv Detail & Related papers (2024-11-27T17:51:44Z) - Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors [75.24313405671433]
Diffusion-based image super-resolution (SR) methods have achieved remarkable success by leveraging large pre-trained text-to-image diffusion models as priors.
We introduce a novel one-step SR model that directly addresses the efficiency issue of diffusion-based SR methods.
Unlike existing fine-tuning strategies, we design a degradation-guided Low-Rank Adaptation (LoRA) module specifically for SR.
arXiv Detail & Related papers (2024-09-25T16:15:21Z) - One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z) - SpotDiffusion: A Fast Approach For Seamless Panorama Generation Over Time [7.532695984765271]
We present a novel approach to generating high-resolution images with generative models. Our method shifts non-overlapping denoising windows over time, ensuring that seams in one timestep are corrected in the next. This design offers several key benefits, including improved computational efficiency and faster inference times.
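As a minimal sketch of the shifted-window idea (the per-timestep shift rule and window size below are assumptions, not SpotDiffusion's exact scheme), the origin of a non-overlapping window grid can simply be cycled with the timestep so that seams fall in different places at each step:

```python
# Hypothetical sketch: non-overlapping windows whose grid origin shifts with t.
def window_slices(H: int, W: int, window: int, t: int):
    off = (t * (window // 4)) % window          # cycle the grid origin over timesteps
    slices = []
    for y in range(-off, H, window):
        for x in range(-off, W, window):
            y0, x0 = max(y, 0), max(x, 0)
            y1, x1 = min(y + window, H), min(x + window, W)
            if y0 < y1 and x0 < x1:
                slices.append((slice(y0, y1), slice(x0, x1)))
    return slices  # denoise each window independently at timestep t
```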
arXiv Detail & Related papers (2024-07-22T09:44:35Z) - PatchScaler: An Efficient Patch-Independent Diffusion Model for Image Super-Resolution [44.345740602726345]
PatchScaler is an efficient patch-independent diffusion pipeline for single image super-resolution.
A texture prompt adaptively retrieves texture priors for the target patch from a common reference texture memory.
PatchScaler achieves superior performance in both quantitative and qualitative evaluations, while significantly speeding up inference.
arXiv Detail & Related papers (2024-05-27T13:31:46Z) - Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet they suffer from time-consuming inference, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)