Image Deblurring by Exploring In-depth Properties of Transformer
- URL: http://arxiv.org/abs/2303.15198v2
- Date: Sat, 27 Jan 2024 05:47:40 GMT
- Title: Image Deblurring by Exploring In-depth Properties of Transformer
- Authors: Pengwei Liang, Junjun Jiang, Xianming Liu, Jiayi Ma
- Abstract summary: We leverage deep features extracted from a pretrained vision transformer (ViT) to encourage recovered images to be sharp without sacrificing the performance measured by the quantitative metrics.
By comparing the transformer features of the recovered image and the target one, the pretrained transformer provides high-resolution, blur-sensitive semantic information.
One type regards the features as vectors and computes the discrepancy, in Euclidean space, between the representations extracted from the recovered image and the target one.
- Score: 86.7039249037193
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image deblurring continues to achieve impressive performance with the
development of generative models. Nonetheless, improving the perceptual quality
and the quantitative scores of a recovered image at the same time remains a
challenging problem. In this study, drawing inspiration from research on
transformer properties, we introduce pretrained transformers to address this
problem. In particular, we leverage deep features extracted from a pretrained
vision transformer (ViT) to encourage recovered images to be sharp without
sacrificing performance as measured by quantitative metrics. A pretrained
transformer can capture the global topological relations (i.e., self-similarity)
of an image, and we observe that the topological relations captured from a sharp
image change when blur occurs. By comparing the transformer features of the
recovered image and the target one, the pretrained transformer provides
high-resolution, blur-sensitive semantic information, which is critical for
measuring the sharpness of the deblurred image. Building on these advantages, we
present two types of novel perceptual losses to guide image deblurring. One type
regards the features as vectors and computes the discrepancy, in Euclidean
space, between the representations extracted from the recovered image and the
target one. The other type considers the features extracted from an image as a
distribution and compares the distribution discrepancy between the recovered
image and the target one. We demonstrate the effectiveness of transformer
properties in improving perceptual quality without sacrificing the quantitative
scores (PSNR) of the most competitive models, such as Uformer, Restormer, and
NAFNet, on defocus deblurring and motion deblurring tasks.
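The two loss types described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: `extract_features` is a hypothetical stand-in for the pretrained ViT feature extractor (the paper uses actual transformer features), and the distribution discrepancy is approximated here by comparing the first two moments of the token features, which may differ from the paper's exact distance.

```python
import numpy as np

def extract_features(image, patch=8):
    # Hypothetical stand-in for a pretrained ViT: flatten non-overlapping
    # patches into per-token feature vectors of shape (num_tokens, dim).
    h, w = image.shape
    tokens = [image[i:i + patch, j:j + patch].ravel()
              for i in range(0, h, patch)
              for j in range(0, w, patch)]
    return np.stack(tokens)

def euclidean_feature_loss(restored, target):
    # Type 1: treat per-token features as vectors and measure their
    # discrepancy in Euclidean space (mean squared difference).
    f_r, f_t = extract_features(restored), extract_features(target)
    return float(np.mean((f_r - f_t) ** 2))

def distribution_feature_loss(restored, target):
    # Type 2: treat the token set of each image as a distribution and
    # compare the two distributions via their first two moments
    # (a simple proxy for a distributional distance).
    f_r, f_t = extract_features(restored), extract_features(target)
    mean_gap = np.sum((f_r.mean(axis=0) - f_t.mean(axis=0)) ** 2)
    cov_gap = np.sum((np.cov(f_r.T) - np.cov(f_t.T)) ** 2)
    return float(mean_gap + cov_gap)
```

Both losses are zero when the recovered image matches the target and grow as blur perturbs the features, which is the property the paper exploits to penalize residual blur.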
Related papers
- SwinStyleformer is a favorable choice for image inversion [2.8115030277940947]
This paper proposes the first pure Transformer structure inversion network called SwinStyleformer.
Experiments found that an inversion network with a plain Transformer backbone could not successfully invert images.
arXiv Detail & Related papers (2024-06-19T02:08:45Z)
- Blind Image Quality Assessment via Transformer Predicted Error Map and Perceptual Quality Token [19.67014524146261]
No-reference image quality assessment (NR-IQA) has gained increasing attention recently.
We propose a Transformer based NR-IQA model using a predicted objective error map and perceptual quality token.
Our proposed method outperforms the current state-of-the-art in both authentic and synthetic image databases.
arXiv Detail & Related papers (2023-05-16T11:17:54Z)
- Invertible Rescaling Network and Its Extensions [118.72015270085535]
In this work, we propose a novel invertible framework to model the bidirectional degradation and restoration from a new perspective.
We develop invertible models to generate valid degraded images and transform the distribution of lost contents.
Then restoration is made tractable by applying the inverse transformation on the generated degraded image together with a randomly-drawn latent variable.
arXiv Detail & Related papers (2022-10-09T06:58:58Z)
- Towards End-to-End Image Compression and Analysis with Transformers [99.50111380056043]
We propose an end-to-end image compression and analysis model with Transformers, targeting cloud-based image classification applications.
We aim to redesign the Vision Transformer (ViT) model to perform image classification from the compressed features and facilitate image compression with the long-term information from the Transformer.
Experimental results demonstrate the effectiveness of the proposed model in both the image compression and the classification tasks.
arXiv Detail & Related papers (2021-12-17T03:28:14Z)
- AdaViT: Adaptive Vision Transformers for Efficient Image Recognition [78.07924262215181]
We introduce AdaViT, an adaptive framework that learns to derive usage policies on which patches, self-attention heads and transformer blocks to use.
Our method obtains more than 2x improvement on efficiency compared to state-of-the-art vision transformers with only 0.8% drop of accuracy.
arXiv Detail & Related papers (2021-11-30T18:57:02Z)
- Efficient Vision Transformers via Fine-Grained Manifold Distillation [96.50513363752836]
Vision transformer architectures have shown extraordinary performance on many computer vision tasks.
Although transformers boost network performance, they often require more computational resources.
We propose to excavate useful information from the teacher transformer through the relationship between images and the divided patches.
arXiv Detail & Related papers (2021-07-03T08:28:34Z)
- Training Vision Transformers for Image Retrieval [32.09708181236154]
We adopt vision transformers for generating image descriptors and train the resulting model with a metric learning objective.
Our results show consistent and significant improvements of transformers over convolution-based approaches.
arXiv Detail & Related papers (2021-02-10T18:56:41Z)
- Invertible Image Rescaling [118.2653765756915]
We develop an Invertible Rescaling Net (IRN) to produce visually-pleasing low-resolution images.
We capture the distribution of the lost information using a latent variable following a specified distribution in the downscaling process.
arXiv Detail & Related papers (2020-05-12T09:55:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.