Learning Multi-scale Spatial-frequency Features for Image Denoising
- URL: http://arxiv.org/abs/2506.16307v1
- Date: Thu, 19 Jun 2025 13:28:09 GMT
- Title: Learning Multi-scale Spatial-frequency Features for Image Denoising
- Authors: Xu Zhao, Chen Zhao, Xiantao Hu, Hongliang Zhang, Ying Tai, Jian Yang
- Abstract summary: We propose a novel multi-scale adaptive dual-domain network (MADNet) for image denoising. We use image pyramid inputs to restore noise-free results from low-resolution images. In order to realize the interaction of high-frequency and low-frequency information, we design an adaptive spatial-frequency learning unit.
- Score: 58.883244886588336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in multi-scale architectures have demonstrated exceptional performance in image denoising tasks. However, existing architectures mainly depend on a fixed single-input single-output U-Net architecture, ignoring multi-scale representations at the pixel level. In addition, previous methods treat the frequency domain uniformly, ignoring the different characteristics of high-frequency and low-frequency noise. In this paper, we propose a novel multi-scale adaptive dual-domain network (MADNet) for image denoising. We use image pyramid inputs to restore noise-free results from low-resolution images. In order to realize the interaction of high-frequency and low-frequency information, we design an adaptive spatial-frequency learning unit (ASFU), where a learnable mask is used to separate the information into high-frequency and low-frequency components. In the skip connections, we design a global feature fusion block to enhance the features at different scales. Extensive experiments on both synthetic and real noisy image datasets verify the effectiveness of MADNet compared with current state-of-the-art denoising approaches.
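The abstract describes the ASFU only at a high level, so the sketch below is an illustration under stated assumptions rather than the authors' implementation: a learnable per-frequency mask, squashed through a sigmoid, splits the 2D FFT of a feature map into low- and high-frequency parts that are then recombined. The module name `FrequencySplit`, the mask parameterization, and the 1x1 fusion convolution are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn


class FrequencySplit(nn.Module):
    """Sketch of a learnable spatial-frequency split (ASFU-style idea).

    A sigmoid-activated mask, learned per channel and frequency bin, gates the
    2D FFT of the input features into a low-frequency and a high-frequency branch.
    """

    def __init__(self, channels: int, height: int, width: int):
        super().__init__()
        # One learnable logit per channel and frequency bin (rfft2 keeps W//2+1 bins).
        self.mask_logits = nn.Parameter(torch.zeros(channels, height, width // 2 + 1))
        # Small fusion conv applied after the two branches are recombined.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")           # (B, C, H, W//2+1), complex
        mask = torch.sigmoid(self.mask_logits)            # values in (0, 1)
        low = torch.fft.irfft2(spec * mask, s=(h, w), norm="ortho")
        high = torch.fft.irfft2(spec * (1.0 - mask), s=(h, w), norm="ortho")
        # The two branches could be processed separately before fusion;
        # this sketch simply concatenates and fuses them.
        return self.fuse(torch.cat([low, high], dim=1))


if __name__ == "__main__":
    unit = FrequencySplit(channels=16, height=64, width=64)
    feats = torch.randn(2, 16, 64, 64)
    print(unit(feats).shape)  # torch.Size([2, 16, 64, 64])
```

In MADNet the high- and low-frequency branches presumably interact further before fusion; the point here is only how a learnable mask can realize the split.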
Related papers
- Freqformer: Image-Demoiréing Transformer via Efficient Frequency Decomposition [83.40450475728792]
We present Freqformer, a Transformer-based framework specifically designed for image demoiréing through targeted frequency separation. Our method performs an effective frequency decomposition that explicitly splits moiré patterns into high-frequency spatially-localized textures and low-frequency scale-robust color distortions. Experiments on various demoiréing benchmarks demonstrate that Freqformer achieves state-of-the-art performance with a compact model size.
arXiv Detail & Related papers (2025-05-25T12:23:10Z) - Multi-View Learning with Context-Guided Receptance for Image Denoising [18.175992709188026]
Image denoising is essential in low-level vision applications such as photography and automated driving. Existing methods struggle to distinguish complex noise patterns in real-world scenes and consume significant computational resources. In this work, a Context-guided Receptance Weighted Key-Value model is proposed, combining enhanced multi-view feature integration with efficient sequence modeling. The model is validated on multiple real-world image denoising datasets, outperforming the existing state-of-the-art methods quantitatively and reducing inference time by up to 40%.
arXiv Detail & Related papers (2025-05-05T14:57:43Z) - Heterogeneous window transformer for image denoising [59.953076646860985]
We propose a heterogeneous window transformer (HWformer) for image denoising.
The proposed HWformer requires only 30% of the denoising time of the popular Restormer.
arXiv Detail & Related papers (2024-07-08T08:10:16Z) - Low-light Stereo Image Enhancement and De-noising in the Low-frequency Information Enhanced Image Space [5.1569866461097185]
Methods are proposed to perform enhancement and de-noising simultaneously.
A low-frequency information enhanced module (IEM) is proposed to suppress noise and produce a new image space.
A cross-channel and spatial context information mining module (CSM) is proposed to encode long-range spatial dependencies.
An encoder-decoder structure is constructed, incorporating cross-view and cross-scale feature interactions.
arXiv Detail & Related papers (2024-01-15T15:03:32Z) - DiffiT: Diffusion Vision Transformers for Image Generation [88.08529836125399]
Vision Transformer (ViT) has demonstrated strong modeling capabilities and scalability, especially for recognition tasks.
We study the effectiveness of ViTs in diffusion-based generative learning and propose a new model denoted as Diffusion Vision Transformers (DiffiT).
DiffiT is surprisingly effective in generating high-fidelity images with significantly better parameter efficiency.
arXiv Detail & Related papers (2023-12-04T18:57:01Z) - Complex Image Generation SwinTransformer Network for Audio Denoising [20.11487887319951]
This paper converts the audio denoising problem into an image generation task.
We first develop a complex image generation SwinTransformer network to capture more information from the complex Fourier domain.
We then impose structure similarity and detailed loss functions to generate high-quality images and develop an SDR loss to minimize the difference between denoised and clean audio.
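The summary names an SDR loss but not its exact form; a common choice in audio denoising is the negative scale-invariant SDR, sketched below as a stand-in. The function name and the use of the scale-invariant variant are assumptions, not the paper's definition.

```python
import torch


def negative_si_sdr(estimate: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Negative scale-invariant SDR, usable as a training loss (lower is better).

    estimate, target: (batch, samples) waveforms.
    """
    # Remove DC offset so the scale projection is well defined.
    estimate = estimate - estimate.mean(dim=-1, keepdim=True)
    target = target - target.mean(dim=-1, keepdim=True)
    # Project the estimate onto the target to obtain the "signal" component.
    dot = torch.sum(estimate * target, dim=-1, keepdim=True)
    energy = torch.sum(target ** 2, dim=-1, keepdim=True) + eps
    signal = dot / energy * target
    noise = estimate - signal
    ratio = torch.sum(signal ** 2, dim=-1) / (torch.sum(noise ** 2, dim=-1) + eps)
    si_sdr = 10.0 * torch.log10(ratio + eps)
    return -si_sdr.mean()


# Example: penalize the gap between a denoised waveform and the clean reference.
clean = torch.randn(4, 16000)
denoised = clean + 0.1 * torch.randn(4, 16000)
loss = negative_si_sdr(denoised, clean)
```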
arXiv Detail & Related papers (2023-10-24T18:21:03Z) - Advancing Unsupervised Low-light Image Enhancement: Noise Estimation, Illumination Interpolation, and Self-Regulation [55.07472635587852]
Low-Light Image Enhancement (LLIE) techniques have made notable advancements in preserving image details and enhancing contrast.
These approaches encounter persistent challenges in efficiently mitigating dynamic noise and accommodating diverse low-light scenarios.
We first propose a quick and accurate method for estimating the noise level in low-light images.
We then devise a Learnable Illumination Interpolator (LII) to satisfy general constraints between illumination and input.
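The paper's fast noise-level estimator is not described in this summary. For orientation only, a classical quick estimate (Donoho's median-absolute-deviation rule on the finest wavelet detail band) is sketched below; it is a substitute baseline, not the proposed method.

```python
import numpy as np
import pywt


def estimate_noise_sigma(image: np.ndarray) -> float:
    """Robust noise standard-deviation estimate for a grayscale image in [0, 1].

    Uses the median absolute deviation (MAD) of the diagonal detail
    coefficients from a single-level Haar wavelet decomposition. This is a
    classical substitute, not the estimator proposed in the paper.
    """
    # cD holds the finest-scale diagonal details, dominated by noise.
    _, (_, _, cD) = pywt.dwt2(image, "haar")
    return float(np.median(np.abs(cD)) / 0.6745)


# Example: a synthetic dark, flat scene with additive Gaussian noise.
rng = np.random.default_rng(0)
clean = np.full((256, 256), 0.1)
noisy = clean + rng.normal(0.0, 0.05, clean.shape)
print(estimate_noise_sigma(noisy))  # roughly 0.05
```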
arXiv Detail & Related papers (2023-05-17T13:56:48Z) - Multi-scale Adaptive Fusion Network for Hyperspectral Image Denoising [35.491878332394265]
We propose a novel solution to investigate HSI denoising using a Multi-scale Adaptive Fusion Network (MAFNet).
The proposed MAFNet has achieved better denoising performance than other state-of-the-art techniques.
arXiv Detail & Related papers (2023-04-19T02:00:21Z) - Multi-stage image denoising with the wavelet transform [125.2251438120701]
Deep convolutional neural networks (CNNs) are used for image denoising via automatically mining accurate structure information.
We propose a multi-stage image denoising CNN with the wavelet transform (MWDCNN) via three stages, i.e., a dynamic convolutional block (DCB), two cascaded wavelet transform and enhancement blocks (WEBs), and a residual block (RB).
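The summary names MWDCNN's stages without detailing them; as a minimal illustration of the wavelet-domain step only, the sketch below applies a single-level Haar decomposition with soft-thresholding of the detail bands. It is classical wavelet shrinkage, not the learned DCB/WEB/RB pipeline.

```python
import numpy as np
import pywt


def wavelet_soft_denoise(image: np.ndarray, threshold: float) -> np.ndarray:
    """Single-level Haar wavelet denoising by soft-thresholding detail bands.

    Classical wavelet-shrinkage baseline, shown only to illustrate the
    wavelet-domain step; it does not reproduce MWDCNN's learned blocks.
    """
    cA, (cH, cV, cD) = pywt.dwt2(image, "haar")

    def shrink(coeffs: np.ndarray) -> np.ndarray:
        return pywt.threshold(coeffs, threshold, mode="soft")

    return pywt.idwt2((cA, (shrink(cH), shrink(cV), shrink(cD))), "haar")


# Example: denoise a smooth synthetic gradient corrupted by Gaussian noise.
rng = np.random.default_rng(0)
clean = np.outer(np.linspace(0, 1, 128), np.linspace(0, 1, 128))
noisy = clean + rng.normal(0.0, 0.1, clean.shape)
denoised = wavelet_soft_denoise(noisy, threshold=0.2)
print(np.mean((denoised - clean) ** 2) < np.mean((noisy - clean) ** 2))  # True
```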
arXiv Detail & Related papers (2022-09-26T03:28:23Z) - Multi-scale frequency separation network for image deblurring [10.511076996096117]
We present a new method called multi-scale frequency separation network (MSFS-Net) for image deblurring.
MSFS-Net captures the low- and high-frequency information of an image at multiple scales.
Experiments on benchmark datasets show that the proposed network achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-06-01T23:48:35Z) - Progressive Training of Multi-level Wavelet Residual Networks for Image Denoising [80.10533234415237]
This paper presents a multi-level wavelet residual network (MWRN) architecture as well as a progressive training scheme to improve image denoising performance.
Experiments on both synthetic and real-world noisy images show that our PT-MWRN performs favorably against the state-of-the-art denoising methods.
arXiv Detail & Related papers (2020-10-23T14:14:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.