Spatial-Frequency Attention for Image Denoising
- URL: http://arxiv.org/abs/2302.13598v1
- Date: Mon, 27 Feb 2023 09:07:15 GMT
- Title: Spatial-Frequency Attention for Image Denoising
- Authors: Shi Guo, Hongwei Yong, Xindong Zhang, Jianqi Ma and Lei Zhang
- Abstract summary: We propose the spatial-frequency attention network (SFANet) to enhance the network's ability to exploit long-range dependencies.
Experiments on multiple denoising benchmarks demonstrate the leading performance of SFANet.
- Score: 22.993509525990998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recently developed transformer networks have achieved impressive
performance in image denoising by exploiting the self-attention (SA) in images.
However, existing methods mostly compute SA within a relatively small window
because of its quadratic complexity, which limits the model's ability to
capture long-range image information. In this paper, we propose the
spatial-frequency attention network (SFANet) to enhance the network's ability
to exploit long-range dependencies. In the spatial attention module (SAM), we
adopt dilated SA to model long-range dependencies. In the frequency attention
module (FAM), we exploit more global information through the Fast Fourier
Transform (FFT), designing a window-based frequency channel attention (WFCA)
block to effectively model deep frequency features and their dependencies. To
make our module applicable to images of different sizes and keep the model
consistency between training and inference, we apply window-based FFT with a
set of fixed window sizes. In addition, channel attention is computed on both
real and imaginary parts of the Fourier spectrum, which further improves
restoration performance. The proposed WFCA block can effectively model
long-range image dependencies with acceptable complexity. Experiments on multiple
denoising benchmarks demonstrate the leading performance of SFANet.
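As a rough illustration of the window-based FFT and the real/imaginary channel attention described in the abstract, the following NumPy sketch may help. It is not the authors' implementation: the function names (`window_partition`, `window_merge`, `channel_gate`, `wfca_sketch`), the squeeze-and-excitation style gate, and the random weights are all assumptions made for illustration only.

```python
import numpy as np


def window_partition(x, win):
    """Split a (C, H, W) feature map into non-overlapping (win x win) windows.

    Returns an array of shape (num_windows, C, win, win).
    """
    c, h, w = x.shape
    assert h % win == 0 and w % win == 0, "H and W must be multiples of win"
    x = x.reshape(c, h // win, win, w // win, win)
    return x.transpose(1, 3, 0, 2, 4).reshape(-1, c, win, win)


def window_merge(wins, h, w):
    """Inverse of window_partition: (num_windows, C, win, win) -> (C, H, W)."""
    _, c, win, _ = wins.shape
    x = wins.reshape(h // win, w // win, c, win, win)
    return x.transpose(2, 0, 3, 1, 4).reshape(c, h, w)


def channel_gate(feat, w1, w2):
    """Hypothetical channel attention: pool, two linear layers, sigmoid.

    feat: (N, C, h, w); w1: (C, C_hidden); w2: (C_hidden, C).
    Returns per-channel weights of shape (N, C, 1, 1).
    """
    pooled = feat.mean(axis=(2, 3))                # global average pool
    hidden = np.maximum(pooled @ w1, 0.0)          # ReLU
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))    # sigmoid
    return gate[:, :, None, None]


def wfca_sketch(x, win, params):
    """Window-wise FFT, separate channel gates on real and imaginary parts."""
    _, h, w = x.shape
    wins = window_partition(x, win)
    spec = np.fft.rfft2(wins, axes=(-2, -1))       # per-window 2-D FFT
    real = spec.real * channel_gate(spec.real, *params["real"])
    imag = spec.imag * channel_gate(spec.imag, *params["imag"])
    out = np.fft.irfft2(real + 1j * imag, s=(win, win), axes=(-2, -1))
    return window_merge(out, h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((4, 16, 16))        # (C, H, W), toy sizes
    params = {
        "real": (rng.standard_normal((4, 2)), rng.standard_normal((2, 4))),
        "imag": (rng.standard_normal((4, 2)), rng.standard_normal((2, 4))),
    }
    print(wfca_sketch(feat, win=8, params=params).shape)
```

Because the FFT is applied per fixed-size window, the same code runs unchanged on any input whose height and width are multiples of `win`, which appears to be the property the abstract has in mind when it mentions consistency between training and inference.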
Related papers
- Channel-Partitioned Windowed Attention And Frequency Learning for Single Image Super-Resolution [1.8506868409351092]
Window-based attention methods have shown great potential for computer vision tasks, particularly in Single Image Super-Resolution (SISR).
We propose a new Channel-Partitioned Attention Transformer (CPAT) to better capture long-range dependencies by sequentially expanding windows along the height and width of feature maps.
In addition, we propose a novel Spatial-Frequency Interaction Module (SFIM), which incorporates information from the spatial and frequency domains to extract more comprehensive information from feature maps.
arXiv Detail & Related papers (2024-07-23T07:17:10Z)
- Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models [26.926712014346432]
This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization.
Our method's efficacy is demonstrated on the class-conditional ImageNet generation benchmark, setting new state-of-the-art FID scores of 1.70 on ImageNet 256 x 256 and 2.89 on ImageNet 512 x 512.
arXiv Detail & Related papers (2024-06-13T17:59:58Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- HAT: Hybrid Attention Transformer for Image Restoration [61.74223315807691]
Transformer-based methods have shown impressive performance in image restoration tasks, such as image super-resolution and denoising.
We propose a new Hybrid Attention Transformer (HAT) to activate more input pixels for better restoration.
Our HAT achieves state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-09-11T05:17:55Z)
- Spatial-Frequency U-Net for Denoising Diffusion Probabilistic Models [89.76587063609806]
We study the denoising diffusion probabilistic model (DDPM) in wavelet space, instead of pixel space, for visual synthesis.
By explicitly modeling the wavelet signals, we find our model is able to generate images with higher quality on several datasets.
arXiv Detail & Related papers (2023-07-27T06:53:16Z)
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
- Multi-scale frequency separation network for image deblurring [10.511076996096117]
We present a new method called multi-scale frequency separation network (MSFS-Net) for image deblurring.
MSFS-Net captures the low- and high-frequency information of an image at multiple scales.
Experiments on benchmark datasets show that the proposed network achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-06-01T23:48:35Z)
- FreqNet: A Frequency-domain Image Super-Resolution Network with Dicrete Cosine Transform [16.439669339293747]
Single image super-resolution (SISR) is an ill-posed problem that aims to obtain high-resolution (HR) output from low-resolution (LR) input.
Despite high peak signal-to-noise ratio (PSNR) results, it is difficult to determine whether the model correctly adds the desired high-frequency details.
We propose FreqNet, an intuitive pipeline from the frequency domain perspective, to solve this problem.
arXiv Detail & Related papers (2021-11-21T11:49:12Z)
- Global Filter Networks for Image Classification [90.81352483076323]
We present a conceptually simple yet computationally efficient architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity.
Our results demonstrate that GFNet can be a very competitive alternative to transformer-style models and CNNs in efficiency, generalization ability and robustness.
arXiv Detail & Related papers (2021-07-01T17:58:16Z)
- Asymmetric CNN for image super-resolution [102.96131810686231]
Deep convolutional neural networks (CNNs) have been widely applied for low-level vision over the past five years.
We propose an asymmetric CNN (ACNet) comprising an asymmetric block (AB), a memory enhancement block (MEB) and a high-frequency feature enhancement block (HFFEB) for image super-resolution.
Our ACNet can effectively address single image super-resolution (SISR), blind SISR and blind SISR of blind noise problems.
arXiv Detail & Related papers (2021-03-25T07:10:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.