Efficient Frequency Domain-based Transformers for High-Quality Image
Deblurring
- URL: http://arxiv.org/abs/2211.12250v1
- Date: Tue, 22 Nov 2022 13:08:03 GMT
- Title: Efficient Frequency Domain-based Transformers for High-Quality Image
Deblurring
- Authors: Lingshun Kong, Jiangxin Dong, Mingqiang Li, Jianjun Ge, Jinshan Pan
- Abstract summary: We present an effective and efficient method that explores the properties of Transformers in the frequency domain for high-quality image deblurring.
We formulate the proposed FSAS and DFFN into an asymmetrical network based on an encoder and decoder architecture.
- Score: 39.720032882926176
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an effective and efficient method that explores the properties of
Transformers in the frequency domain for high-quality image deblurring. Our
method is motivated by the convolution theorem, which states that the correlation or
convolution of two signals in the spatial domain is equivalent to their
element-wise product in the frequency domain. This inspires us to
develop an efficient frequency domain-based self-attention solver (FSAS) to
estimate the scaled dot-product attention by an element-wise product operation
instead of the matrix multiplication in the spatial domain. In addition, we
note that simply using the naive feed-forward network (FFN) in Transformers
does not generate good deblurred results. To overcome this problem, we propose
a simple yet effective discriminative frequency domain-based FFN (DFFN), where
we introduce a gated mechanism in the FFN based on the Joint Photographic
Experts Group (JPEG) compression algorithm to discriminatively determine which
low- and high-frequency information of the features should be preserved for
latent clear image restoration. We formulate the proposed FSAS and DFFN into an
asymmetrical network based on an encoder and decoder architecture, where the
FSAS is only used in the decoder module for better image deblurring.
Experimental results show that the proposed method performs favorably against
the state-of-the-art approaches. Code will be available at
https://github.com/kkkls/FFTformer.
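To make the frequency domain-based self-attention solver (FSAS) concrete, here is a minimal PyTorch sketch of an FSAS-style block. The 1x1 convolutions that form the query, key, and value, and the simple scaling of the correlation map are illustrative assumptions rather than the authors' exact design; the official implementation is in the repository linked above.

```python
# Minimal FSAS-style sketch (assumed details; not the authors' exact code).
import torch
import torch.nn as nn


class FSAS(nn.Module):
    """Frequency domain-based self-attention: the Q-K correlation is computed
    as an element-wise product of spectra (convolution theorem) instead of a
    spatial-domain matrix multiplication."""

    def __init__(self, dim: int):
        super().__init__()
        self.to_q = nn.Conv2d(dim, dim, kernel_size=1)
        self.to_k = nn.Conv2d(dim, dim, kernel_size=1)
        self.to_v = nn.Conv2d(dim, dim, kernel_size=1)
        self.project_out = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map.
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)

        # Convolution theorem: spatial correlation of q and k equals the
        # element-wise product of q's spectrum with the conjugate of k's.
        q_fft = torch.fft.rfft2(q.float(), norm="ortho")
        k_fft = torch.fft.rfft2(k.float(), norm="ortho")
        corr = torch.fft.irfft2(q_fft * torch.conj(k_fft),
                                s=q.shape[-2:], norm="ortho")

        # Simple scaling of the correlation map (an assumption; the official
        # code applies a learned normalization here).
        attn = corr / (q.shape[-2] * q.shape[-1])
        return self.project_out(attn * v)
```

For a feature map of shape (1, 48, 64, 64) this returns a tensor of the same shape while never materializing an (HW) x (HW) attention matrix, which is the efficiency argument made in the abstract.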
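The discriminative frequency domain-based FFN (DFFN) can be sketched in the same spirit: a learnable per-frequency weight, playing the role of a JPEG-style quantization table, decides which low- and high-frequency components of the expanded features are kept. The 8x8 patch size, expansion ratio, and placement of the depthwise convolution below are assumptions for illustration.

```python
# Minimal DFFN-style sketch (assumed details; not the authors' exact code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class DFFN(nn.Module):
    """Feed-forward network with a learnable, JPEG-inspired frequency gate:
    features are split into p x p patches, transformed with a 2D FFT, and each
    frequency bin is re-weighted before returning to the spatial domain."""

    def __init__(self, dim: int, expansion: int = 2, patch: int = 8):
        super().__init__()
        hidden = dim * expansion
        self.patch = patch
        self.expand = nn.Conv2d(dim, hidden, kernel_size=1)
        self.dwconv = nn.Conv2d(hidden, hidden, kernel_size=3,
                                padding=1, groups=hidden)
        self.shrink = nn.Conv2d(hidden, dim, kernel_size=1)
        # Learnable "quantization table": one weight per frequency bin of a
        # p x p patch (rfft2 keeps p // 2 + 1 bins along the last axis).
        self.freq_gate = nn.Parameter(
            torch.ones(1, hidden, 1, 1, patch, patch // 2 + 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        p = self.patch
        assert h % p == 0 and w % p == 0, "H and W must be multiples of patch"
        feat = self.expand(x)

        # Rearrange into non-overlapping p x p patches: (B, C, H/p, W/p, p, p).
        feat = feat.reshape(b, -1, h // p, p, w // p, p).permute(0, 1, 2, 4, 3, 5)

        # Gate each frequency bin, then return to the spatial domain.
        spec = torch.fft.rfft2(feat.float(), norm="ortho") * self.freq_gate
        feat = torch.fft.irfft2(spec, s=(p, p), norm="ortho")

        # Undo the patch rearrangement and finish the FFN.
        feat = feat.permute(0, 1, 2, 4, 3, 5).reshape(b, -1, h, w)
        return self.shrink(F.gelu(self.dwconv(feat)))
```

Gating in the frequency domain lets the network suppress or keep individual frequency bands per channel, which mirrors the abstract's goal of discriminatively preserving low- and high-frequency information.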
Related papers
- F2former: When Fractional Fourier Meets Deep Wiener Deconvolution and Selective Frequency Transformer for Image Deblurring [8.296475046681696]
We propose a novel approach based on the Fractional Fourier Transform (FRFT), a unified spatial-frequency representation.
We show that the performance of our proposed method is superior to other state-of-the-art (SOTA) approaches.
arXiv Detail & Related papers (2024-09-03T17:05:12Z)
- Misalignment-Robust Frequency Distribution Loss for Image Transformation [51.0462138717502]
This paper addresses a common challenge in deep learning-based image transformation methods, such as image enhancement and super-resolution: misalignment between network outputs and ground-truth images.
We introduce a novel and simple Frequency Distribution Loss (FDL) for computing distribution distance within the frequency domain.
Our method is empirically proven effective as a training constraint due to the thoughtful utilization of global information in the frequency domain.
arXiv Detail & Related papers (2024-02-28T09:27:41Z)
- Spatial-Frequency U-Net for Denoising Diffusion Probabilistic Models [89.76587063609806]
We study the denoising diffusion probabilistic model (DDPM) in wavelet space, instead of pixel space, for visual synthesis.
By explicitly modeling the wavelet signals, we find our model is able to generate images with higher quality on several datasets.
arXiv Detail & Related papers (2023-07-27T06:53:16Z)
- Adaptive Frequency Filters As Efficient Global Token Mixers [100.27957692579892]
We show that adaptive frequency filters can serve as efficient global token mixers.
We take the resulting adaptive frequency filtering (AFF) token mixers as primary neural operators to build a lightweight neural network, dubbed AFFNet.
arXiv Detail & Related papers (2023-07-26T07:42:28Z)
- Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition [14.450381668547259]
Open-set image recognition is a challenging topic in computer vision.
We propose a Complementary Frequency-varying Awareness Network (CFAN) that better captures both high-frequency and low-frequency information.
Based on CFAN, we propose an open-set fine-grained image recognition method, called CFAN-OSFGR.
arXiv Detail & Related papers (2023-07-14T08:15:36Z)
- Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers [55.90468016961356]
We propose an efficient token mixer that learns to mix in the Fourier domain.
AFNO is based on a principled foundation of operator learning.
It can handle a sequence size of 65k and outperforms other efficient self-attention mechanisms.
arXiv Detail & Related papers (2021-11-24T05:44:31Z)
- TBNet: Two-Stream Boundary-aware Network for Generic Image Manipulation Localization [49.521622399483846]
We propose a novel end-to-end two-stream boundary-aware network (abbreviated as TBNet) for generic image manipulation localization.
The proposed TBNet can significantly outperform state-of-the-art generic image manipulation localization methods in terms of both MCC and F1.
arXiv Detail & Related papers (2021-08-10T08:22:05Z)
- F-Drop&Match: GANs with a Dead Zone in the High-Frequency Domain [12.290010554180613]
We introduce two novel training techniques called frequency dropping (F-Drop) and frequency matching (F-Match).
F-Drop filters out unnecessary high-frequency components from the input images of the discriminators.
F-Match minimizes the difference between real and fake images in the frequency domain for generating more realistic images; a brief sketch of both ideas follows this list.
arXiv Detail & Related papers (2021-06-04T08:51:58Z)
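For the F-Drop&Match entry above, a hedged sketch of the two operations is given below. The square low-pass mask and the L1 amplitude-spectrum distance are assumptions, since the summary does not specify the exact filter shape or distance used in the paper.

```python
# Hedged sketch of F-Drop / F-Match style operations (assumed filter shape
# and distance; not the paper's exact formulation).
import torch


def f_drop(images: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    """Zero out high-frequency components of (B, C, H, W) images before
    feeding them to the discriminator (F-Drop-style low-pass filtering)."""
    spec = torch.fft.fftshift(torch.fft.fft2(images, norm="ortho"),
                              dim=(-2, -1))
    _, _, h, w = images.shape
    mask = torch.zeros(h, w, device=images.device)
    ch, cw = h // 2, w // 2
    kh, kw = int(h * keep_ratio) // 2, int(w * keep_ratio) // 2
    mask[ch - kh:ch + kh, cw - kw:cw + kw] = 1.0  # keep a centered low band
    out = torch.fft.ifft2(torch.fft.ifftshift(spec * mask, dim=(-2, -1)),
                          norm="ortho")
    return out.real


def f_match(real: torch.Tensor, fake: torch.Tensor) -> torch.Tensor:
    """L1 distance between amplitude spectra of real and generated images
    (an F-Match-style frequency-domain matching term)."""
    real_amp = torch.fft.fft2(real, norm="ortho").abs()
    fake_amp = torch.fft.fft2(fake, norm="ortho").abs()
    return (real_amp - fake_amp).abs().mean()
```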