Structure-guided Diffusion Transformer for Low-Light Image Enhancement
- URL: http://arxiv.org/abs/2504.15054v1
- Date: Mon, 21 Apr 2025 12:30:01 GMT
- Title: Structure-guided Diffusion Transformer for Low-Light Image Enhancement
- Authors: Xiangchen Yin, Zhenda Yu, Longtao Jiang, Xin Gao, Xiao Sun, Zhi Liu, Xun Yang,
- Abstract summary: We introduce DiT into the low-light enhancement task and design a novel Structure-guided Diffusion Transformer based Low-light image enhancement framework.<n>We compress the feature through wavelet transform to improve the inference efficiency of the model and capture the multi-directional frequency band.<n>In addition, we propose a Structure-guided Attention Block (SAB) to pay more attention to texture-riched tokens and avoid interference from noisy areas in noise prediction.
- Score: 19.90700077104533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While the diffusion transformer (DiT) has become a focal point of interest in recent years, its application in low-light image enhancement remains a blank area for exploration. Current methods recover the details from low-light images while inevitably amplifying the noise in images, resulting in poor visual quality. In this paper, we firstly introduce DiT into the low-light enhancement task and design a novel Structure-guided Diffusion Transformer based Low-light image enhancement (SDTL) framework. We compress the feature through wavelet transform to improve the inference efficiency of the model and capture the multi-directional frequency band. Then we propose a Structure Enhancement Module (SEM) that uses structural prior to enhance the texture and leverages an adaptive fusion strategy to achieve more accurate enhancement effect. In Addition, we propose a Structure-guided Attention Block (SAB) to pay more attention to texture-riched tokens and avoid interference from noisy areas in noise prediction. Extensive qualitative and quantitative experiments demonstrate that our method achieves SOTA performance on several popular datasets, validating the effectiveness of SDTL in improving image quality and the potential of DiT in low-light enhancement tasks.
Related papers
- DLEN: Dual Branch of Transformer for Low-Light Image Enhancement in Dual Domains [0.0]
Low-light image enhancement (LLE) aims to improve the visual quality of images captured in poorly lit conditions.<n>These issues hinder the performance of computer vision tasks such as object detection, facial recognition, and autonomous driving.<n>We propose the Dual Light Enhance Network (DLEN), a novel architecture that incorporates two distinct attention mechanisms.
arXiv Detail & Related papers (2025-01-21T15:58:16Z) - Unveiling Advanced Frequency Disentanglement Paradigm for Low-Light Image Enhancement [61.22119364400268]
We propose a novel low-frequency consistency method, facilitating improved frequency disentanglement optimization.
Noteworthy improvements are showcased across five popular benchmarks, with up to 7.68dB gains on PSNR achieved for six state-of-the-art models.
Our approach maintains efficiency with only 88K extra parameters, setting a new standard in the challenging realm of low-light image enhancement.
arXiv Detail & Related papers (2024-09-03T06:19:03Z) - CodeEnhance: A Codebook-Driven Approach for Low-Light Image Enhancement [97.95330185793358]
Low-light image enhancement (LLIE) aims to improve low-illumination images.
Existing methods face two challenges: uncertainty in restoration from diverse brightness degradations and loss of texture and color information.
We propose a novel enhancement approach, CodeEnhance, by leveraging quantized priors and image refinement.
arXiv Detail & Related papers (2024-04-08T07:34:39Z) - Low-light Image Enhancement via CLIP-Fourier Guided Wavelet Diffusion [28.049668999586583]
We propose a novel and robust low-light image enhancement method via CLIP-Fourier Guided Wavelet Diffusion, abbreviated as CFWD.
CFWD leverages multimodal visual-language information in the frequency domain space created by multiple wavelet transforms to guide the enhancement process.
Our approach outperforms existing state-of-the-art methods, achieving significant progress in image quality and noise suppression.
arXiv Detail & Related papers (2024-01-08T10:08:48Z) - A Non-Uniform Low-Light Image Enhancement Method with Multi-Scale
Attention Transformer and Luminance Consistency Loss [11.585269110131659]
Low-light image enhancement aims to improve the perception of images collected in dim environments.
Existing methods cannot adaptively extract the differentiated luminance information, which will easily cause over-exposure and under-exposure.
We propose a multi-scale attention Transformer named MSATr, which sufficiently extracts local and global features for light balance to improve the visual quality.
arXiv Detail & Related papers (2023-12-27T10:07:11Z) - LDM-ISP: Enhancing Neural ISP for Low Light with Latent Diffusion Models [54.93010869546011]
We propose to leverage the pre-trained latent diffusion model to perform the neural ISP for enhancing extremely low-light images.<n>Specifically, to tailor the pre-trained latent diffusion model to operate on the RAW domain, we train a set of lightweight taming modules.<n>We observe different roles of UNet denoising and decoder reconstruction in the latent diffusion model, which inspires us to decompose the low-light image enhancement task into latent-space low-frequency content generation and decoding-phase high-frequency detail maintenance.
arXiv Detail & Related papers (2023-12-02T04:31:51Z) - Enhancing Low-light Light Field Images with A Deep Compensation Unfolding Network [52.77569396659629]
This paper presents the deep compensation network unfolding (DCUNet) for restoring light field (LF) images captured under low-light conditions.
The framework uses the intermediate enhanced result to estimate the illumination map, which is then employed in the unfolding process to produce a new enhanced result.
To properly leverage the unique characteristics of LF images, this paper proposes a pseudo-explicit feature interaction module.
arXiv Detail & Related papers (2023-08-10T07:53:06Z) - Cycle-Interactive Generative Adversarial Network for Robust Unsupervised
Low-Light Enhancement [109.335317310485]
Cycle-Interactive Generative Adversarial Network (CIGAN) is capable of not only better transferring illumination distributions between low/normal-light images but also manipulating detailed signals.
In particular, the proposed low-light guided transformation feed-forwards the features of low-light images from the generator of enhancement GAN into the generator of degradation GAN.
arXiv Detail & Related papers (2022-07-03T06:37:46Z) - Unsupervised Low-light Image Enhancement with Decoupled Networks [103.74355338972123]
We learn a two-stage GAN-based framework to enhance the real-world low-light images in a fully unsupervised fashion.
Our proposed method outperforms the state-of-the-art unsupervised image enhancement methods in terms of both illumination enhancement and noise reduction.
arXiv Detail & Related papers (2020-05-06T13:37:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.