DARTS: Double Attention Reference-based Transformer for Super-resolution
- URL: http://arxiv.org/abs/2307.08837v1
- Date: Mon, 17 Jul 2023 20:57:16 GMT
- Title: DARTS: Double Attention Reference-based Transformer for Super-resolution
- Authors: Masoomeh Aslahishahri, Jordan Ubbens, Ian Stavness
- Abstract summary: We present DARTS, a transformer model for reference-based image super-resolution.
DARTS learns joint representations of two image distributions to enhance the content of low-resolution input images.
We show that our transformer-based model performs competitively with state-of-the-art models.
- Score: 12.424350934766704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present DARTS, a transformer model for reference-based image
super-resolution. DARTS learns joint representations of two image distributions
to enhance the content of low-resolution input images through matching
correspondences learned from high-resolution reference images. Current
state-of-the-art techniques in reference-based image super-resolution are based
on a multi-network, multi-stage architecture. In this work, we adapt the double
attention block from the GAN literature, processing the two visual streams
separately and combining self-attention and cross-attention blocks through a
gating attention strategy. Our work demonstrates how the attention mechanism
can be adapted for the particular requirements of reference-based image
super-resolution, significantly simplifying the architecture and training
pipeline. We show that our transformer-based model performs competitively with
state-of-the-art models, while maintaining a simpler overall architecture and
training process. In particular, we obtain state-of-the-art results on the SUN80
dataset, with a PSNR/SSIM of 29.83 / 0.809. These results show that attention
alone is sufficient for the reference-based super-resolution (RSR) task, without
multiple purpose-built subnetworks, knowledge distillation, or multi-stage training.
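The abstract's double attention block can be sketched as follows: self-attention runs within the low-resolution (LR) stream, cross-attention lets LR queries attend to the high-resolution reference stream, and a gate blends the two. This is a minimal illustrative sketch, not the authors' exact implementation; the scalar sigmoid gate, the residual connection, and the LayerNorm placement are assumptions for clarity.

```python
import torch
import torch.nn as nn


class GatedDoubleAttention(nn.Module):
    """Sketch of a double attention block: self-attention on the LR stream
    plus cross-attention to a reference stream, fused by a learned gate.
    Illustrative only; details differ from the paper's architecture."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Hypothetical scalar gate; a per-token or per-channel gate is equally plausible.
        self.gate = nn.Parameter(torch.zeros(1))
        self.norm = nn.LayerNorm(dim)

    def forward(self, lr_tokens: torch.Tensor, ref_tokens: torch.Tensor) -> torch.Tensor:
        # Self-attention within the low-resolution stream.
        sa, _ = self.self_attn(lr_tokens, lr_tokens, lr_tokens)
        # Cross-attention: LR queries attend to reference keys/values.
        ca, _ = self.cross_attn(lr_tokens, ref_tokens, ref_tokens)
        # Gating attention strategy: a sigmoid gate blends the two streams.
        g = torch.sigmoid(self.gate)
        return self.norm(lr_tokens + g * sa + (1 - g) * ca)


block = GatedDoubleAttention(dim=32, heads=4)
lr = torch.randn(2, 64, 32)    # (batch, LR tokens, channels)
ref = torch.randn(2, 256, 32)  # reference stream may have a different token count
out = block(lr, ref)
```

Note that the output keeps the LR token count, since only LR positions act as queries; the reference stream contributes solely through cross-attention keys and values.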
Related papers
- CWT-Net: Super-resolution of Histopathology Images Using a Cross-scale Wavelet-based Transformer [15.930878163092983]
Super-resolution (SR) aims to enhance the quality of low-resolution images and has been widely applied in medical imaging.
We propose a novel network called CWT-Net, which leverages cross-scale image wavelet transform and Transformer architecture.
Our model significantly outperforms state-of-the-art methods in both performance and visualization evaluations.
arXiv Detail & Related papers (2024-09-11T08:26:28Z) - HiTSR: A Hierarchical Transformer for Reference-based Super-Resolution [6.546896650921257]
We propose HiTSR, a hierarchical transformer model for reference-based image super-resolution.
We streamline the architecture and training pipeline by incorporating the double attention block from GAN literature.
Our model demonstrates superior performance across three datasets including SUN80, Urban100, and Manga109.
arXiv Detail & Related papers (2024-08-30T01:16:29Z) - IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions [26.09373405194564]
We present an efficient image processing transformer architecture with hierarchical attentions, called IPT-V2.
We adopt a focal context self-attention (FCSA) and a global grid self-attention (GGSA) to obtain adequate token interactions in local and global receptive fields.
Our proposed IPT-V2 achieves state-of-the-art results on various image processing tasks, covering denoising, deblurring, deraining and obtains much better trade-off for performance and computational complexity than previous methods.
arXiv Detail & Related papers (2024-03-31T10:01:20Z) - PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators [120.06891448820447]
Obtaining clear, visually pleasing images has become a common concern, and the task of underwater image enhancement (UIE) has emerged to meet this need.
In this paper, we propose a physical model-guided GAN model for UIE, referred to as PUGAN.
Our PUGAN outperforms state-of-the-art methods in both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-06-15T07:41:12Z) - Cross-View Hierarchy Network for Stereo Image Super-Resolution [14.574538513341277]
Stereo image super-resolution aims to improve the quality of high-resolution stereo image pairs by exploiting complementary information across views.
We propose a novel method, named Cross-View-Hierarchy Network for Stereo Image Super-Resolution (CVHSSR).
CVHSSR achieves better stereo image super-resolution performance than other state-of-the-art methods while using fewer parameters.
arXiv Detail & Related papers (2023-04-13T03:11:30Z) - Reference-based Image and Video Super-Resolution via C2-Matching [100.0808130445653]
We propose C2-Matching, which performs explicit robust matching crossing transformation and resolution.
C2-Matching significantly outperforms the state of the art on the standard CUFED5 benchmark.
We also extend C2-Matching to the Reference-based Video Super-Resolution task, where an image taken in a similar scene serves as the HR reference image.
arXiv Detail & Related papers (2022-12-19T16:15:02Z) - Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper pursues the holistic goal of maintaining spatially precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z) - Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z) - DTGAN: Dual Attention Generative Adversarial Networks for Text-to-Image Generation [8.26410341981427]
The Dual Attention Generative Adversarial Network (DTGAN) can synthesize high-quality and semantically consistent images.
The proposed model introduces channel-aware and pixel-aware attention modules that can guide the generator to focus on text-relevant channels and pixels.
A new type of visual loss is utilized to enhance the image resolution by ensuring vivid shape and perceptually uniform color distributions of generated images.
arXiv Detail & Related papers (2020-11-05T08:57:15Z) - Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z) - DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution [69.2432352477966]
Real image super-resolution (Real-SR) focuses on the relationship between real-world high-resolution (HR) and low-resolution (LR) images.
In this article, we propose a Dual-path Dynamic Enhancement Network (DDet) for Real-SR.
Unlike conventional methods, which stack up massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pairs.
arXiv Detail & Related papers (2020-02-25T18:24:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.