PTSR: Patch Translator for Image Super-Resolution
- URL: http://arxiv.org/abs/2310.13216v1
- Date: Fri, 20 Oct 2023 01:45:00 GMT
- Title: PTSR: Patch Translator for Image Super-Resolution
- Authors: Neeraj Baghel, Shiv Ram Dubey, Satish Kumar Singh
- Abstract summary: Recent transformer-based super-resolution models still rely on convolution operations; we propose a patch translator for image super-resolution (PTSR) to address this problem.
The proposed PTSR is a transformer-based GAN network with no convolution operation.
We introduce a novel patch translator module that regenerates the improved patches utilising multi-head attention.
- Score: 16.243363392717434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image super-resolution generation aims to generate a high-resolution image
from its low-resolution counterpart. However, more complex neural networks bring
higher computational costs and memory demands. It remains an active research
area, offering the promise of overcoming resolution limitations in many
applications. In recent years, transformers have made significant progress in
computer vision tasks owing to their robust self-attention mechanism. However, recent works on the
transformer for image super-resolution also contain convolution operations. We
propose a patch translator for image super-resolution (PTSR) to address this
problem. The proposed PTSR is a transformer-based GAN network with no
convolution operation. We introduce a novel patch translator module for
regenerating the improved patches utilising multi-head attention, which is
further utilised by the generator to generate the 2x and 4x super-resolution
images. The experiments are performed using benchmark datasets, including
DIV2K, Set5, Set14, and BSD100. The results of the proposed model are improved
on average for $4\times$ super-resolution by 21.66% in PSNR score and 11.59%
in SSIM score, as compared to the best competitive models. We also analyse the
proposed loss and saliency map to show the effectiveness of the proposed
method.
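The paper's code is not reproduced here; as a rough illustration of the core idea, the following is a minimal PyTorch sketch of a convolution-free patch translator block: patches are embedded with a linear projection, regenerated through multi-head self-attention, and expanded for upscaling. All names and hyperparameters (PatchTranslator, embed_dim, scale) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PatchTranslator(nn.Module):
    """Sketch of a patch translator block: linear patch embedding (no
    convolutions), multi-head self-attention, and patch expansion."""
    def __init__(self, patch_dim, embed_dim=256, num_heads=8, scale=2):
        super().__init__()
        self.embed = nn.Linear(patch_dim, embed_dim)   # linear, conv-free embedding
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(embed_dim)
        self.norm2 = nn.LayerNorm(embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, 4 * embed_dim), nn.GELU(),
            nn.Linear(4 * embed_dim, embed_dim))
        # each translated patch is expanded to scale**2 times as many pixels
        self.expand = nn.Linear(embed_dim, patch_dim * scale ** 2)

    def forward(self, patches):            # patches: (B, N, patch_dim)
        x = self.embed(patches)
        h = self.norm1(x)
        x = x + self.attn(h, h, h)[0]      # regenerate patches via attention
        x = x + self.mlp(self.norm2(x))
        return self.expand(x)              # (B, N, patch_dim * scale**2)

pt = PatchTranslator(patch_dim=4 * 4 * 3, scale=2)   # 4x4 RGB patches
hr_patches = pt(torch.randn(1, 64, 48))              # -> (1, 64, 192)
```

A full generator would tile the low-resolution image into patches, stack several such blocks, and reassemble the expanded patches into the 2x or 4x output, with a jointly trained discriminator providing the adversarial loss.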
Related papers
- A Low-Resolution Image is Worth 1x1 Words: Enabling Fine Image Super-Resolution with Transformers and TaylorShift [6.835244697120131]
We propose TaylorIR to address these limitations by utilizing a patch size of 1x1, enabling pixel-level processing in any transformer-based SR model.
Experimental results demonstrate that our approach achieves new state-of-the-art SR performance while reducing memory consumption by up to 60% compared to traditional self-attention-based transformers.
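For intuition, a 1x1 patch size reduces patch embedding to a per-pixel linear projection, so the token sequence length equals H*W. The sketch below (a hypothetical helper, not TaylorIR's code; TaylorShift's efficient attention is omitted) illustrates only this point:

```python
import torch
import torch.nn as nn

# With a 1x1 patch size, "patch embedding" is a per-pixel linear
# projection: every pixel becomes one token, so sequence length is H * W.
def pixel_tokens(img: torch.Tensor, proj: nn.Linear) -> torch.Tensor:
    b, c, h, w = img.shape                   # (B, C, H, W)
    tokens = img.flatten(2).transpose(1, 2)  # (B, H*W, C)
    return proj(tokens)                      # (B, H*W, embed_dim)

proj = nn.Linear(3, 64)                      # 3 input channels -> 64-dim tokens
print(pixel_tokens(torch.randn(1, 3, 32, 32), proj).shape)  # (1, 1024, 64)
```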
arXiv Detail & Related papers (2024-11-15T14:43:58Z)
- SRTransGAN: Image Super-Resolution using Transformer based Generative Adversarial Network [16.243363392717434]
We propose a transformer-based encoder-decoder network as a generator to generate 2x images and 4x images.
The proposed SRTransGAN outperforms the existing methods by 4.38% on average in terms of PSNR and SSIM scores.
arXiv Detail & Related papers (2023-12-04T16:22:39Z)
- HAT: Hybrid Attention Transformer for Image Restoration [61.74223315807691]
Transformer-based methods have shown impressive performance in image restoration tasks, such as image super-resolution and denoising.
We propose a new Hybrid Attention Transformer (HAT) to activate more input pixels for better restoration.
Our HAT achieves state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-09-11T05:17:55Z)
- Image Reconstruction using Enhanced Vision Transformer [0.08594140167290097]
We propose a novel image reconstruction framework which can be used for tasks such as image denoising, deblurring or inpainting.
The model proposed in this project is based on Vision Transformer (ViT) that takes 2D images as input and outputs embeddings.
We incorporate four additional optimization techniques in the framework to improve the model reconstruction capability.
arXiv Detail & Related papers (2023-07-11T02:14:18Z)
- LSwinSR: UAV Imagery Super-Resolution based on Linear Swin Transformer [7.3817359680010615]
Super-resolution technology is especially beneficial for Unmanned Aerial Vehicles (UAVs).
In this paper, for the super-resolution of UAV images, a novel network based on the state-of-the-art Swin Transformer is proposed with better efficiency and competitive accuracy.
arXiv Detail & Related papers (2023-03-17T20:14:10Z)
- Hybrid Pixel-Unshuffled Network for Lightweight Image Super-Resolution [64.54162195322246]
Convolutional neural networks (CNNs) have achieved great success on image super-resolution (SR).
Most deep CNN-based SR models require massive computation to obtain high performance.
We propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task.
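Pixel-unshuffle (space-to-depth) converts spatial resolution into channels, making it a cheap, information-preserving downsampling primitive. The minimal example below uses PyTorch's built-in operator; it does not reproduce the HPUN architecture itself:

```python
import torch
import torch.nn as nn

# Pixel-unshuffle rearranges each 2x2 spatial block into 4 channels,
# halving H and W without discarding information (space-to-depth).
down = nn.PixelUnshuffle(downscale_factor=2)
y = down(torch.randn(1, 3, 64, 64))
print(y.shape)   # torch.Size([1, 12, 32, 32])
```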
arXiv Detail & Related papers (2022-03-16T20:10:41Z)
- AdaViT: Adaptive Vision Transformers for Efficient Image Recognition [78.07924262215181]
We introduce AdaViT, an adaptive framework that learns to derive usage policies on which patches, self-attention heads and transformer blocks to use.
Our method obtains more than a 2x improvement in efficiency compared to state-of-the-art vision transformers with only a 0.8% drop in accuracy.
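As a toy illustration of a patch-level usage policy, the sketch below scores tokens with a learned head and drops the low-scoring ones at inference. AdaViT additionally gates attention heads and whole transformer blocks and trains its policy end-to-end; none of that is reproduced here, and the class name PatchGate and the 0.5 threshold are assumptions.

```python
import torch
import torch.nn as nn

class PatchGate(nn.Module):
    """Toy token-selection gate: keep only tokens the policy head scores
    above a threshold (batch size 1 for simplicity)."""
    def __init__(self, dim, threshold=0.5):
        super().__init__()
        self.policy = nn.Linear(dim, 1)
        self.threshold = threshold

    def forward(self, tokens):                           # (1, N, dim)
        scores = self.policy(tokens).squeeze(-1).sigmoid()
        keep = scores[0] > self.threshold                # boolean mask over N
        return tokens[:, keep]                           # (1, N_kept, dim)

gate = PatchGate(64)
print(gate(torch.randn(1, 196, 64)).shape)               # (1, N_kept, 64)
```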
arXiv Detail & Related papers (2021-11-30T18:57:02Z)
- HRFormer: High-Resolution Transformer for Dense Prediction [99.6060997466614]
We present a High-Resolution Transformer (HRFormer) that learns high-resolution representations for dense prediction tasks.
We take advantage of the multi-resolution parallel design introduced in high-resolution convolutional networks (HRNet).
We demonstrate the effectiveness of the High-Resolution Transformer on both human pose estimation and semantic segmentation tasks.
arXiv Detail & Related papers (2021-10-18T15:37:58Z)
- Generating Superpixels for High-resolution Images with Decoupled Patch Calibration [82.21559299694555]
Patch Calibration Networks (PCNet) are designed to efficiently and accurately implement high-resolution superpixel segmentation.
In particular, the Decoupled Patch Calibration (DPC) branch takes a local patch from the high-resolution image and dynamically generates a binary mask to make the network focus on region boundaries.
arXiv Detail & Related papers (2021-08-19T10:33:05Z)
- Improved Transformer for High-Resolution GANs [69.42469272015481]
We introduce two key ingredients to the Transformer to address this challenge.
We show in the experiments that the proposed HiT achieves state-of-the-art FID scores of 31.87 and 2.95 on unconditional ImageNet $128 \times 128$ and FFHQ $256 \times 256$, respectively.
arXiv Detail & Related papers (2021-06-14T17:39:49Z)