U2-Former: A Nested U-shaped Transformer for Image Restoration
- URL: http://arxiv.org/abs/2112.02279v2
- Date: Wed, 8 Dec 2021 12:09:20 GMT
- Title: U2-Former: A Nested U-shaped Transformer for Image Restoration
- Authors: Haobo Ji, Xin Feng, Wenjie Pei, Jinxing Li, Guangming Lu
- Abstract summary: We present a deep and effective Transformer-based network for image restoration, termed U2-Former.
It employs Transformer as the core operation to perform image restoration in a deep encoding and decoding space.
- Score: 30.187257111046556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While Transformer has achieved remarkable performance in various high-level
vision tasks, it is still challenging to exploit the full potential of
Transformer in image restoration. The crux lies in the limited depth to which
Transformer can be applied in the typical encoder-decoder framework for image
restoration, owing to the heavy self-attention computation load and
inefficient communication across layers at different depths (scales). In this
paper, we present a deep and effective Transformer-based network for image
restoration, termed U2-Former, which employs Transformer as the core
operation to perform image restoration in a deep encoding and decoding
space. Specifically, it leverages a nested U-shaped structure to facilitate
the interactions across different layers with different scales of feature maps.
Furthermore, we optimize the computational efficiency for the basic Transformer
block by introducing a feature-filtering mechanism to compress the token
representation. Apart from the typical supervision ways for image restoration,
our U2-Former also performs contrastive learning in multiple aspects to further
decouple the noise component from the background image. Extensive experiments
on various image restoration tasks, including reflection removal, rain streak
removal and dehazing respectively, demonstrate the effectiveness of the
proposed U2-Former.
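The token-compression idea behind the feature-filtering mechanism can be sketched in miniature. The code below is an illustrative assumption, not the authors' implementation: it keeps only the top-k tokens ranked by an importance score before self-attention, so the quadratic attention cost shrinks from O(N^2) to O(k^2). The function names and the scoring are hypothetical.

```python
# Hypothetical sketch of a feature-filtering step (illustrative names,
# not the paper's actual code): keep only the k highest-scoring tokens
# before running self-attention, to compress the token representation.

def filter_tokens(tokens, scores, k):
    """Return the k highest-scoring tokens, preserving spatial order."""
    assert len(tokens) == len(scores)
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])  # restore original token order
    return [tokens[i] for i in keep]

def attention_cost(n):
    """Quadratic cost of naive self-attention over n tokens."""
    return n * n

tokens = ["t0", "t1", "t2", "t3", "t4", "t5"]
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7]  # e.g. from a learned scorer

compressed = filter_tokens(tokens, scores, k=3)
# compressed == ["t1", "t3", "t5"]; attention cost drops from 36 to 9
```

In a real network the scores would come from a small learned module and the filtering would operate on feature tensors, but the cost argument is the same: halving the token count quarters the attention cost.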
Related papers
- Joint multi-dimensional dynamic attention and transformer for general image restoration [14.987034136856463]
Outdoor images often suffer from severe degradation due to rain, haze, and noise.
Current image restoration methods struggle to handle complex degradation while maintaining efficiency.
This paper introduces a novel image restoration architecture that combines multi-dimensional dynamic attention and self-attention.
arXiv Detail & Related papers (2024-11-12T15:58:09Z) - Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration [91.65248635837145]
Under-Display Camera (UDC) is an emerging technology that achieves full-screen display via hiding the camera under the display panel.
In this paper, we observe that when using the Vision Transformer for UDC degraded image restoration, the global attention samples a large amount of redundant information and noise.
We propose a Guided Sparse Transformer method (SGSFormer) for the task of restoring high-quality images from UDC degraded images.
arXiv Detail & Related papers (2024-03-09T13:11:59Z) - GridFormer: Residual Dense Transformer with Grid Structure for Image Restoration in Adverse Weather Conditions [97.45751035126548]
We propose a novel transformer-based framework called GridFormer.
GridFormer serves as a backbone for image restoration under adverse weather conditions.
The framework achieves state-of-the-art results on five diverse image restoration tasks.
arXiv Detail & Related papers (2023-05-29T03:03:53Z) - Image Deblurring by Exploring In-depth Properties of Transformer [86.7039249037193]
We leverage deep features extracted from a pretrained vision transformer (ViT) to encourage recovered images to be sharp without sacrificing performance on quantitative metrics.
By comparing the transformer features of the recovered image and the target one, the pretrained transformer provides high-resolution, blur-sensitive semantic information.
One approach regards the features as vectors and computes the discrepancy between representations extracted from the recovered image and the target one in Euclidean space.
arXiv Detail & Related papers (2023-03-24T14:14:25Z) - Towards End-to-End Image Compression and Analysis with Transformers [99.50111380056043]
We propose an end-to-end image compression and analysis model with Transformers, targeting the cloud-based image classification application.
We aim to redesign the Vision Transformer (ViT) model to perform image classification from the compressed features and facilitate image compression with the long-term information from the Transformer.
Experimental results demonstrate the effectiveness of the proposed model in both the image compression and the classification tasks.
arXiv Detail & Related papers (2021-12-17T03:28:14Z) - Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data.
Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z) - PPT Fusion: Pyramid Patch Transformer for a Case Study in Image Fusion [37.993611194758195]
We propose a Patch Pyramid Transformer (PPT) to address the issue of extracting semantic information from an image.
The experimental results demonstrate its superior performance against the state-of-the-art fusion approaches.
arXiv Detail & Related papers (2021-07-29T13:57:45Z) - Uformer: A General U-Shaped Transformer for Image Restoration [47.60420806106756]
We build a hierarchical encoder-decoder network using the Transformer block for image restoration.
Experiments on several image restoration tasks demonstrate the superiority of Uformer.
arXiv Detail & Related papers (2021-06-06T12:33:22Z) - ResT: An Efficient Transformer for Visual Recognition [5.807423409327807]
This paper presents an efficient multi-scale vision Transformer, called ResT, which capably serves as a general-purpose backbone for image recognition.
We show that the proposed ResT outperforms recent state-of-the-art backbones by a large margin, demonstrating its potential as a strong backbone.
arXiv Detail & Related papers (2021-05-28T08:53:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.