Uformer: A General U-Shaped Transformer for Image Restoration
- URL: http://arxiv.org/abs/2106.03106v1
- Date: Sun, 6 Jun 2021 12:33:22 GMT
- Title: Uformer: A General U-Shaped Transformer for Image Restoration
- Authors: Zhendong Wang, Xiaodong Cun, Jianmin Bao, Jianzhuang Liu
- Abstract summary: We build a hierarchical encoder-decoder network using the Transformer block for image restoration.
Experiments on several image restoration tasks demonstrate the superiority of Uformer.
- Score: 47.60420806106756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present Uformer, an effective and efficient
Transformer-based architecture, in which we build a hierarchical
encoder-decoder network using the Transformer block for image restoration.
Uformer has two core designs to make it suitable for this task. The first key
element is a local-enhanced window Transformer block, where we use
non-overlapping window-based self-attention to reduce the computational
requirement and employ the depth-wise convolution in the feed-forward network
to further improve its potential for capturing local context. The second key
element is that we explore three skip-connection schemes to effectively deliver
information from the encoder to the decoder. Powered by these two designs,
Uformer enjoys a high capability for capturing useful dependencies for image
restoration. Extensive experiments on several image restoration tasks
demonstrate the superiority of Uformer, including image denoising, deraining,
deblurring and demoireing. We expect that our work will encourage further
research to explore Transformer-based architectures for low-level vision tasks.
The code and models will be available at
https://github.com/ZhendongWang6/Uformer.
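The abstract's first key element, non-overlapping window-based self-attention, can be illustrated with a minimal NumPy sketch. This is a single-head toy version with identity Q/K/V projections, written only to show how windowing cuts the attention cost from quadratic in H*W to quadratic in the window size; it is an assumption-laden simplification, not the paper's implementation.

```python
import numpy as np

def window_partition(x, w):
    """Split a (H, W, C) feature map into non-overlapping w x w windows."""
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    # -> (num_windows, w*w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(x, w):
    """Self-attention computed independently inside each w x w window.

    Each token attends only to the w*w tokens in its own window, so the
    score matrix per window is (w*w, w*w) instead of (H*W, H*W) globally.
    Identity Q/K/V projections keep the sketch short.
    """
    H, W, C = x.shape
    windows = window_partition(x, w)               # (nW, w*w, C)
    scores = windows @ windows.transpose(0, 2, 1)  # (nW, w*w, w*w)
    attn = softmax(scores / np.sqrt(C), axis=-1)
    out = attn @ windows                           # (nW, w*w, C)
    # Reverse the partition back to (H, W, C).
    nh, nw = H // w, W // w
    out = out.reshape(nh, nw, w, w, C).transpose(0, 2, 1, 3, 4)
    return out.reshape(H, W, C)

x = np.random.default_rng(0).normal(size=(8, 8, 4))
y = window_self_attention(x, w=4)
print(y.shape)  # (8, 8, 4)
```

In the paper's block, learned Q/K/V projections, multiple heads, and a feed-forward network with a depth-wise convolution (for local context) would follow; only the windowing pattern is shown here.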
Related papers
- How Powerful Potential of Attention on Image Restoration? [97.9777639562205]
We conduct an empirical study to explore the potential of attention mechanisms without using FFN.
We propose Continuous Scaling Attention (CSAttn), a method that computes attention continuously in three stages without using FFN.
Our designs provide a closer look at the attention mechanism and reveal that some simple operations can significantly affect the model performance.
arXiv Detail & Related papers (2024-03-15T14:23:12Z)
- HAT: Hybrid Attention Transformer for Image Restoration [61.74223315807691]
Transformer-based methods have shown impressive performance in image restoration tasks, such as image super-resolution and denoising.
We propose a new Hybrid Attention Transformer (HAT) to activate more input pixels for better restoration.
Our HAT achieves state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-09-11T05:17:55Z)
- GridFormer: Residual Dense Transformer with Grid Structure for Image Restoration in Adverse Weather Conditions [97.45751035126548]
We propose a novel transformer-based framework called GridFormer.
GridFormer serves as a backbone for image restoration under adverse weather conditions.
The framework achieves state-of-the-art results on five diverse image restoration tasks.
arXiv Detail & Related papers (2023-05-29T03:03:53Z)
- Towards End-to-End Image Compression and Analysis with Transformers [99.50111380056043]
We propose an end-to-end image compression and analysis model with Transformers, targeting cloud-based image classification applications.
We aim to redesign the Vision Transformer (ViT) model to perform image classification from the compressed features and facilitate image compression with the long-term information from the Transformer.
Experimental results demonstrate the effectiveness of the proposed model in both the image compression and the classification tasks.
arXiv Detail & Related papers (2021-12-17T03:28:14Z)
- U2-Former: A Nested U-shaped Transformer for Image Restoration [30.187257111046556]
We present a deep and effective Transformer-based network for image restoration, termed U2-Former.
It is able to employ Transformer as the core operation to perform image restoration in a deep encoding and decoding space.
arXiv Detail & Related papers (2021-12-04T08:37:04Z)
- Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data.
Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z)
- PPT Fusion: Pyramid Patch Transformer for a Case Study in Image Fusion [37.993611194758195]
We propose a Pyramid Patch Transformer (PPT) to address the issues of extracting semantic information from an image.
The experimental results demonstrate its superior performance against the state-of-the-art fusion approaches.
arXiv Detail & Related papers (2021-07-29T13:57:45Z)
- ResT: An Efficient Transformer for Visual Recognition [5.807423409327807]
This paper presents an efficient multi-scale vision Transformer, called ResT, that capably serves as a general-purpose backbone for image recognition.
We show that the proposed ResT outperforms recent state-of-the-art backbones by a large margin, demonstrating its potential as a strong backbone.
arXiv Detail & Related papers (2021-05-28T08:53:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.