APT: Improving Diffusion Models for High Resolution Image Generation with Adaptive Path Tracing
- URL: http://arxiv.org/abs/2507.21690v1
- Date: Tue, 29 Jul 2025 11:13:03 GMT
- Title: APT: Improving Diffusion Models for High Resolution Image Generation with Adaptive Path Tracing
- Authors: Sangmin Han, Jinho Jeong, Jinwoo Kim, Seon Joo Kim
- Abstract summary: Latent Diffusion Models (LDMs) are generally trained at fixed resolutions, limiting their capability when scaling to high-resolution images. We propose Adaptive Path Tracing (APT), a framework that combines Statistical Matching, which keeps patch distributions consistent in upsampled latents, with Scale-aware Scheduling. As a result, APT produces clearer and more refined details in high-resolution images.
- Score: 24.33819371470651
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Latent Diffusion Models (LDMs) are generally trained at fixed resolutions, limiting their capability when scaling up to high-resolution images. While training-based approaches address this limitation by training on high-resolution datasets, they require large amounts of data and considerable computational resources, making them less practical. Consequently, training-free methods, particularly patch-based approaches, have become a popular alternative. These methods divide an image into patches and fuse the denoising paths of each patch, showing strong performance on high-resolution generation. However, we observe two critical issues for patch-based approaches, which we call "patch-level distribution shift" and "increased patch monotonicity." To address these issues, we propose Adaptive Path Tracing (APT), a framework that combines Statistical Matching to ensure patch distributions remain consistent in upsampled latents and Scale-aware Scheduling to deal with the patch monotonicity. As a result, APT produces clearer and more refined details in high-resolution images. In addition, APT enables a shortcut denoising process, resulting in faster sampling with minimal quality degradation. Our experimental results confirm that APT produces more detailed outputs with improved inference speed, providing a practical approach to high-resolution image generation.
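The abstract does not spell out how Statistical Matching is computed. As a rough, hypothetical illustration only (the patch size, channel-wise moments, and function name below are assumptions, not the paper's formulation), per-patch moment matching against the pre-upsampling latent could look like this:

```python
# Minimal sketch of per-patch statistical matching (hypothetical): renormalize
# each patch of the upsampled latent so its channel-wise mean/std follow the
# reference (pre-upsampling) latent, countering a patch-level distribution shift.
import torch

def match_patch_statistics(up_latent: torch.Tensor,   # (C, H, W) upsampled latent
                           ref_latent: torch.Tensor,  # (C, h, w) reference latent
                           patch: int = 64,
                           eps: float = 1e-6) -> torch.Tensor:
    # Channel-wise statistics of the reference latent.
    ref_mean = ref_latent.mean(dim=(1, 2), keepdim=True)
    ref_std = ref_latent.std(dim=(1, 2), keepdim=True) + eps

    out = up_latent.clone()
    _, H, W = up_latent.shape
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            p = up_latent[:, y:y + patch, x:x + patch]
            p_mean = p.mean(dim=(1, 2), keepdim=True)
            p_std = p.std(dim=(1, 2), keepdim=True) + eps
            # Shift and rescale the patch toward the reference distribution.
            out[:, y:y + patch, x:x + patch] = (p - p_mean) / p_std * ref_std + ref_mean
    return out
```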
Related papers
- Diffusion Models for Solving Inverse Problems via Posterior Sampling with Piecewise Guidance [52.705112811734566]
A novel diffusion-based framework is introduced for solving inverse problems using a piecewise guidance scheme. The proposed method is problem-agnostic and readily adaptable to a variety of inverse problems. The framework achieves a reduction in inference time of 25% for inpainting with both random and center masks, and 23% and 24% for 4× and 8× super-resolution tasks.
arXiv Detail & Related papers (2025-07-22T19:35:14Z) - Progressive Alignment Degradation Learning for Pansharpening [3.7939736380306552]
Deep learning-based pansharpening has been shown to effectively generate high-resolution multispectral (HRMS) images. The Wald protocol assumes that networks trained on artificial low-resolution data will perform equally well on high-resolution data. We propose PADM, which uses mutual iteration between two sub-networks, PAlignNet and PDegradeNet, to adaptively learn accurate degradation processes.
arXiv Detail & Related papers (2025-06-25T07:07:32Z) - Multi-Step Guided Diffusion for Image Restoration on Edge Devices: Toward Lightweight Perception in Embodied AI [0.0]
We introduce a multistep optimization strategy within each denoising timestep, significantly enhancing image quality, perceptual accuracy, and generalization. Our experiments on super-resolution and Gaussian deblurring demonstrate that increasing the number of gradient updates per step improves LPIPS and PSNR with minimal latency overhead. Our findings highlight MPGD's potential as a lightweight, plug-and-play restoration module for real-time visual perception in embodied AI agents such as drones and mobile robots.
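As a loose illustration of the multistep idea (not the paper's actual code; the denoiser, degradation operator, step size, and schedule value below are placeholders), several data-consistency gradient updates can be applied to the noisy latent within a single denoising timestep:

```python
# Hypothetical sketch: repeat a guidance gradient update a few times per timestep.
import torch

def guided_step(x_t, t, y, denoiser, degrade, alpha_bar_t, n_updates=3, step_size=0.1):
    for _ in range(n_updates):
        x_t = x_t.detach().requires_grad_(True)
        eps = denoiser(x_t, t)                                    # predicted noise
        # Tweedie-style estimate of the clean image from the noisy latent.
        x0_hat = (x_t - (1 - alpha_bar_t) ** 0.5 * eps) / alpha_bar_t ** 0.5
        loss = ((degrade(x0_hat) - y) ** 2).mean()                # data-consistency loss
        grad, = torch.autograd.grad(loss, x_t)
        x_t = x_t - step_size * grad                              # one guidance update
    return x_t.detach()
```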
arXiv Detail & Related papers (2025-06-08T21:11:25Z) - Two-Stage Random Alternation Framework for One-Shot Pansharpening [12.385955231193675]
We introduce a two-stage random alternation framework (TRA-PAN) that performs instance-specific optimization for any given Multispectral (MS)/Panchromatic (PAN) pair. TRA-PAN effectively integrates strong supervision constraints from reduced-resolution images with the physical characteristics of the full-resolution images. Experimental results demonstrate that TRA-PAN outperforms state-of-the-art (SOTA) methods in quantitative metrics and visual quality in real-world scenarios.
arXiv Detail & Related papers (2025-05-10T09:26:22Z) - One Look is Enough: A Novel Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation Models on High-Resolution Images [25.48185527420231]
We propose Patch Refine Once (PRO), an efficient and generalizable tile-based framework. PRO consists of two key components: (i) Grouped Patch Consistency Training, which enhances test-time efficiency while mitigating the depth discontinuity problem, and (ii) Bias Free Masking, which prevents depth estimation (DE) models from overfitting to dataset-specific biases, enabling better generalization to real-world datasets even after training on synthetic data.
arXiv Detail & Related papers (2025-03-28T11:46:50Z) - One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step. To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration. Our method achieves strong performance on both full-reference and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z) - FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion [63.609399000712905]
Inference at a scaled resolution leads to repetitive patterns and structural distortions. We propose two simple modules that combine to solve these issues. Our method, coined FAM Diffusion, can seamlessly integrate into any latent diffusion model and requires no additional training.
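The summary only names the two modules; as a loose, hypothetical sketch of what a frequency-modulation step might resemble (a plain low-pass blend between the current latent and an upsampled reference, not the paper's actual module), one could swap the low-frequency band in the Fourier domain:

```python
# Hypothetical low-frequency blend (illustrative only): keep the low-frequency
# band of a reference latent and the high-frequency band of the current latent.
import torch

def blend_low_frequencies(x: torch.Tensor, ref: torch.Tensor, radius: float = 0.25) -> torch.Tensor:
    # x, ref: (C, H, W), same shape.
    X = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    R = torch.fft.fftshift(torch.fft.fft2(ref), dim=(-2, -1))
    _, H, W = x.shape
    yy = torch.arange(H).view(-1, 1) - H // 2
    xx = torch.arange(W).view(1, -1) - W // 2
    # Square low-pass mask around the (shifted) zero frequency.
    mask = ((yy.abs() < radius * H / 2) & (xx.abs() < radius * W / 2)).to(x.dtype)
    blended = R * mask + X * (1 - mask)
    return torch.fft.ifft2(torch.fft.ifftshift(blended, dim=(-2, -1))).real
```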
arXiv Detail & Related papers (2024-11-27T17:51:44Z) - Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors [75.24313405671433]
Diffusion-based image super-resolution (SR) methods have achieved remarkable success by leveraging large pre-trained text-to-image diffusion models as priors.
We introduce a novel one-step SR model that directly addresses the efficiency issue of diffusion-based SR methods.
Unlike existing fine-tuning strategies, we design a degradation-guided Low-Rank Adaptation (LoRA) module specifically for SR.
arXiv Detail & Related papers (2024-09-25T16:15:21Z) - One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z) - SpotDiffusion: A Fast Approach For Seamless Panorama Generation Over Time [7.532695984765271]
We present a novel approach to generating high-resolution images with generative models. Our method shifts non-overlapping denoising windows over time, ensuring that seams in one timestep are corrected in the next. This design offers several key benefits, including improved computational efficiency and faster inference times.
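As a minimal sketch of the shifted-window idea (the per-timestep shift rule and window size below are assumptions, not SpotDiffusion's exact scheme), the origin of a non-overlapping window grid can simply be cycled with the timestep so that seams fall in different places at each step:

```python
# Hypothetical sketch: non-overlapping windows whose grid origin shifts with t.
def window_slices(H: int, W: int, window: int, t: int):
    off = (t * (window // 4)) % window          # cycle the grid origin over timesteps
    slices = []
    for y in range(-off, H, window):
        for x in range(-off, W, window):
            y0, x0 = max(y, 0), max(x, 0)
            y1, x1 = min(y + window, H), min(x + window, W)
            if y0 < y1 and x0 < x1:
                slices.append((slice(y0, y1), slice(x0, x1)))
    return slices  # denoise each window independently at timestep t
```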
arXiv Detail & Related papers (2024-07-22T09:44:35Z) - PatchScaler: An Efficient Patch-Independent Diffusion Model for Image Super-Resolution [44.345740602726345]
PatchScaler is an efficient patch-independent diffusion pipeline for single image super-resolution.
A texture prompt adaptively retrieves texture priors for the target patch from a common reference texture memory.
PatchScaler achieves superior performance in both quantitative and qualitative evaluations, while significantly speeding up inference.
arXiv Detail & Related papers (2024-05-27T13:31:46Z) - Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet they suffer from time-consuming inference, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)