Learning Multi-scale Spatial-frequency Features for Image Denoising
- URL: http://arxiv.org/abs/2506.16307v1
- Date: Thu, 19 Jun 2025 13:28:09 GMT
- Title: Learning Multi-scale Spatial-frequency Features for Image Denoising
- Authors: Xu Zhao, Chen Zhao, Xiantao Hu, Hongliang Zhang, Ying Tai, Jian Yang
- Abstract summary: We propose a novel multi-scale adaptive dual-domain network (MADNet) for image denoising. We use image pyramid inputs to restore noise-free results from low-resolution images. In order to realize the interaction of high-frequency and low-frequency information, we design an adaptive spatial-frequency learning unit.
- Score: 58.883244886588336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in multi-scale architectures have demonstrated exceptional performance in image denoising tasks. However, existing architectures mainly depend on a fixed single-input single-output U-Net architecture, ignoring multi-scale representations at the pixel level. In addition, previous methods treat the frequency domain uniformly, ignoring the different characteristics of high-frequency and low-frequency noise. In this paper, we propose a novel multi-scale adaptive dual-domain network (MADNet) for image denoising. We use image pyramid inputs to restore noise-free results from low-resolution images. In order to realize the interaction of high-frequency and low-frequency information, we design an adaptive spatial-frequency learning unit (ASFU), where a learnable mask is used to separate the information into high-frequency and low-frequency components. In the skip connections, we design a global feature fusion block to enhance the features at different scales. Extensive experiments on both synthetic and real noisy image datasets verify the effectiveness of MADNet compared with current state-of-the-art denoising approaches.
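The abstract describes the ASFU only at a high level, so the sketch below is an illustration under stated assumptions rather than the authors' implementation: a learnable per-frequency mask, squashed through a sigmoid, splits the 2D FFT of a feature map into low- and high-frequency parts that are then recombined. The module name `FrequencySplit`, the mask parameterization, and the 1x1 fusion convolution are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn


class FrequencySplit(nn.Module):
    """Sketch of a learnable spatial-frequency split (ASFU-style idea).

    A sigmoid-activated mask, learned per channel and frequency bin, gates the
    2D FFT of the input features into a low-frequency and a high-frequency branch.
    """

    def __init__(self, channels: int, height: int, width: int):
        super().__init__()
        # One learnable logit per channel and frequency bin (rfft2 keeps W//2+1 bins).
        self.mask_logits = nn.Parameter(torch.zeros(channels, height, width // 2 + 1))
        # Small fusion conv applied after the two branches are recombined.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")           # (B, C, H, W//2+1), complex
        mask = torch.sigmoid(self.mask_logits)            # values in (0, 1)
        low = torch.fft.irfft2(spec * mask, s=(h, w), norm="ortho")
        high = torch.fft.irfft2(spec * (1.0 - mask), s=(h, w), norm="ortho")
        # The two branches could be processed separately before fusion;
        # this sketch simply concatenates and fuses them.
        return self.fuse(torch.cat([low, high], dim=1))


if __name__ == "__main__":
    unit = FrequencySplit(channels=16, height=64, width=64)
    feats = torch.randn(2, 16, 64, 64)
    print(unit(feats).shape)  # torch.Size([2, 16, 64, 64])
```

In MADNet the high- and low-frequency branches presumably interact further before fusion; the point here is only how a learnable mask can realize the split.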
Related papers
- Freqformer: Image-Demoiréing Transformer via Efficient Frequency Decomposition [83.40450475728792]
We present Freqformer, a Transformer-based framework specifically designed for image demoiréing through targeted frequency separation. Our method performs an effective frequency decomposition that explicitly splits moiré patterns into high-frequency spatially-localized textures and low-frequency scale-robust color distortions. Experiments on various demoiréing benchmarks demonstrate that Freqformer achieves state-of-the-art performance with a compact model size.
arXiv Detail & Related papers (2025-05-25T12:23:10Z) - Multi-View Learning with Context-Guided Receptance for Image Denoising [18.175992709188026]
Image denoising is essential in low-level vision applications such as photography and automated driving. Existing methods struggle to distinguish complex noise patterns in real-world scenes and consume significant computational resources. In this work, a Context-guided Receptance Weighted Key-Value model is proposed, combining enhanced multi-view feature integration with efficient sequence modeling. The model is validated on multiple real-world image denoising datasets, outperforming the existing state-of-the-art methods quantitatively and reducing inference time by up to 40%.
arXiv Detail & Related papers (2025-05-05T14:57:43Z) - Heterogeneous window transformer for image denoising [59.953076646860985]
We propose a heterogeneous window transformer (HWformer) for image denoising.
The proposed HWformer requires only 30% of the denoising time of the popular Restormer.
arXiv Detail & Related papers (2024-07-08T08:10:16Z) - Low-light Stereo Image Enhancement and De-noising in the Low-frequency Information Enhanced Image Space [5.1569866461097185]
Methods are proposed to perform enhancement and de-noising simultaneously.
A low-frequency information enhanced module (IEM) is proposed to suppress noise and produce a new image space.
A cross-channel and spatial context information mining module (CSM) is proposed to encode long-range spatial dependencies.
An encoder-decoder structure is constructed, incorporating cross-view and cross-scale feature interactions.
arXiv Detail & Related papers (2024-01-15T15:03:32Z) - DiffiT: Diffusion Vision Transformers for Image Generation [88.08529836125399]
Vision Transformer (ViT) has demonstrated strong modeling capabilities and scalability, especially for recognition tasks.
We study the effectiveness of ViTs in diffusion-based generative learning and propose a new model denoted as Diffusion Vision Transformers (DiffiT).
DiffiT is surprisingly effective in generating high-fidelity images with significantly better parameter efficiency.
arXiv Detail & Related papers (2023-12-04T18:57:01Z) - Complex Image Generation SwinTransformer Network for Audio Denoising [20.11487887319951]
This paper converts the audio denoising problem into an image generation task.
We first develop a complex image generation SwinTransformer network to capture more information from the complex Fourier domain.
We then impose structure similarity and detailed loss functions to generate high-quality images and develop an SDR loss to minimize the difference between denoised and clean audio.
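The summary names an SDR loss but not its exact form; a common choice in audio denoising is the negative scale-invariant SDR, sketched below as a stand-in. The function name and the use of the scale-invariant variant are assumptions, not the paper's definition.

```python
import torch


def negative_si_sdr(estimate: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Negative scale-invariant SDR, usable as a training loss (lower is better).

    estimate, target: (batch, samples) waveforms.
    """
    # Remove DC offset so the scale projection is well defined.
    estimate = estimate - estimate.mean(dim=-1, keepdim=True)
    target = target - target.mean(dim=-1, keepdim=True)
    # Project the estimate onto the target to obtain the "signal" component.
    dot = torch.sum(estimate * target, dim=-1, keepdim=True)
    energy = torch.sum(target ** 2, dim=-1, keepdim=True) + eps
    signal = dot / energy * target
    noise = estimate - signal
    ratio = torch.sum(signal ** 2, dim=-1) / (torch.sum(noise ** 2, dim=-1) + eps)
    si_sdr = 10.0 * torch.log10(ratio + eps)
    return -si_sdr.mean()


# Example: penalize the gap between a denoised waveform and the clean reference.
clean = torch.randn(4, 16000)
denoised = clean + 0.1 * torch.randn(4, 16000)
loss = negative_si_sdr(denoised, clean)
```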
arXiv Detail & Related papers (2023-10-24T18:21:03Z) - Advancing Unsupervised Low-light Image Enhancement: Noise Estimation, Illumination Interpolation, and Self-Regulation [55.07472635587852]
Low-Light Image Enhancement (LLIE) techniques have made notable advancements in preserving image details and enhancing contrast.
These approaches encounter persistent challenges in efficiently mitigating dynamic noise and accommodating diverse low-light scenarios.
We first propose a quick and accurate method for estimating the noise level in low-light images.
We then devise a Learnable Illumination Interpolator (LII) to satisfy general constraints between illumination and input.
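The paper's fast noise-level estimator is not described in this summary. For orientation only, a classical quick estimate (Donoho's median-absolute-deviation rule on the finest wavelet detail band) is sketched below; it is a substitute baseline, not the proposed method.

```python
import numpy as np
import pywt


def estimate_noise_sigma(image: np.ndarray) -> float:
    """Robust noise standard-deviation estimate for a grayscale image in [0, 1].

    Uses the median absolute deviation (MAD) of the diagonal detail
    coefficients from a single-level Haar wavelet decomposition. This is a
    classical substitute, not the estimator proposed in the paper.
    """
    # cD holds the finest-scale diagonal details, dominated by noise.
    _, (_, _, cD) = pywt.dwt2(image, "haar")
    return float(np.median(np.abs(cD)) / 0.6745)


# Example: a synthetic dark, flat scene with additive Gaussian noise.
rng = np.random.default_rng(0)
clean = np.full((256, 256), 0.1)
noisy = clean + rng.normal(0.0, 0.05, clean.shape)
print(estimate_noise_sigma(noisy))  # roughly 0.05
```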
arXiv Detail & Related papers (2023-05-17T13:56:48Z) - Multi-scale Adaptive Fusion Network for Hyperspectral Image Denoising [35.491878332394265]
We propose a novel solution to investigate HSI denoising using a Multi-scale Adaptive Fusion Network (MAFNet).
The proposed MAFNet has achieved better denoising performance than other state-of-the-art techniques.
arXiv Detail & Related papers (2023-04-19T02:00:21Z) - Multi-stage image denoising with the wavelet transform [125.2251438120701]
Deep convolutional neural networks (CNNs) are used for image denoising via automatically mining accurate structure information.
We propose a multi-stage image denoising CNN with the wavelet transform (MWDCNN) via three stages, i.e., a dynamic convolutional block (DCB), two cascaded wavelet transform and enhancement blocks (WEBs), and a residual block (RB).
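The summary names MWDCNN's stages without detailing them; as a minimal illustration of the wavelet-domain step only, the sketch below applies a single-level Haar decomposition with soft-thresholding of the detail bands. It is classical wavelet shrinkage, not the learned DCB/WEB/RB pipeline.

```python
import numpy as np
import pywt


def wavelet_soft_denoise(image: np.ndarray, threshold: float) -> np.ndarray:
    """Single-level Haar wavelet denoising by soft-thresholding detail bands.

    Classical wavelet-shrinkage baseline, shown only to illustrate the
    wavelet-domain step; it does not reproduce MWDCNN's learned blocks.
    """
    cA, (cH, cV, cD) = pywt.dwt2(image, "haar")

    def shrink(coeffs: np.ndarray) -> np.ndarray:
        return pywt.threshold(coeffs, threshold, mode="soft")

    return pywt.idwt2((cA, (shrink(cH), shrink(cV), shrink(cD))), "haar")


# Example: denoise a smooth synthetic gradient corrupted by Gaussian noise.
rng = np.random.default_rng(0)
clean = np.outer(np.linspace(0, 1, 128), np.linspace(0, 1, 128))
noisy = clean + rng.normal(0.0, 0.1, clean.shape)
denoised = wavelet_soft_denoise(noisy, threshold=0.2)
print(np.mean((denoised - clean) ** 2) < np.mean((noisy - clean) ** 2))  # True
```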
arXiv Detail & Related papers (2022-09-26T03:28:23Z) - Multi-scale frequency separation network for image deblurring [10.511076996096117]
We present a new method called multi-scale frequency separation network (MSFS-Net) for image deblurring.
MSFS-Net captures the low- and high-frequency information of an image at multiple scales.
Experiments on benchmark datasets show that the proposed network achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-06-01T23:48:35Z) - Progressive Training of Multi-level Wavelet Residual Networks for Image Denoising [80.10533234415237]
This paper presents a multi-level wavelet residual network (MWRN) architecture as well as a progressive training scheme to improve image denoising performance.
Experiments on both synthetic and real-world noisy images show that our PT-MWRN performs favorably against the state-of-the-art denoising methods.
arXiv Detail & Related papers (2020-10-23T14:14:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.