ITSRN++: Stronger and Better Implicit Transformer Network for Continuous
Screen Content Image Super-Resolution
- URL: http://arxiv.org/abs/2210.08812v1
- Date: Mon, 17 Oct 2022 07:47:34 GMT
- Authors: Sheng Shen, Huanjing Yue, Jingyu Yang, Kun Li
- Abstract summary: The proposed method achieves state-of-the-art performance for SCI SR (outperforming SwinIR by 0.74 dB for x3 SR) and also works well for natural image SR.
We construct a large-scale SCI2K dataset to facilitate research on SCI SR.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, online screen sharing and remote cooperation are becoming
ubiquitous. However, screen content may be downsampled and compressed during
transmission, while it may be displayed on large screens or zoomed in by users
for detailed observation at the receiver side. Therefore, a strong and
efficient screen content image (SCI) super-resolution (SR) method is in demand.
We observe that a weight-sharing upsampler (such as deconvolution or pixel
shuffle) can be harmful to the sharp and thin edges in SCIs, and that a
fixed-scale upsampler cannot flexibly fit screens of various sizes. To address
these issues, we propose an implicit transformer network for continuous SCI SR
(termed ITSRN++). Specifically, we propose a modulation-based transformer as
the upsampler, which modulates the pixel features in discrete space via a
periodic nonlinear function to generate features for continuous pixels. To
enhance the extracted features, we further propose an enhanced transformer as
the feature extraction backbone, where convolution and attention branches are
utilized in parallel. In addition, we construct a large-scale SCI2K dataset to
facilitate research on SCI SR. Experimental results on nine datasets
demonstrate that the proposed method achieves state-of-the-art performance for
SCI SR (outperforming SwinIR by 0.74 dB for x3 SR) and also works well for
natural image SR. Our code and dataset will be released upon acceptance of
this work.
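The abstract describes the key idea of the upsampler: pixel features defined on the discrete grid are modulated by a periodic nonlinear function of the continuous query coordinate, so the network can render a pixel value at any (fractional) position and hence at any scale. The paper does not specify the function in this summary, so the following is only a minimal NumPy sketch under assumed choices: a sinusoidal modulation over a few frequencies, nearest-neighbor feature lookup, and the hypothetical name `periodic_modulation_upsample`.

```python
import numpy as np

def periodic_modulation_upsample(feat, scale, num_freqs=4):
    """Illustrative sketch (not the paper's implementation): modulate
    discrete pixel features with a periodic function of the continuous
    query coordinate, enabling upsampling at an arbitrary scale.

    feat:  (H, W, C) discrete feature map
    scale: continuous upsampling factor, e.g. 1.5 or 3.0
    """
    H, W, C = feat.shape
    out_h, out_w = int(H * scale), int(W * scale)
    out = np.empty((out_h, out_w, C))
    for i in range(out_h):
        for j in range(out_w):
            # continuous coordinate of the query pixel in input space
            y = (i + 0.5) / scale - 0.5
            x = (j + 0.5) / scale - 0.5
            # nearest discrete pixel and the sub-pixel offset to it
            yi = int(round(float(np.clip(y, 0, H - 1))))
            xi = int(round(float(np.clip(x, 0, W - 1))))
            dy, dx = y - yi, x - xi
            # periodic (sinusoidal) modulation of the offset; at integer
            # positions (dy = dx = 0) the feature passes through unchanged
            mod = 1.0
            for k in range(num_freqs):
                w = (2.0 ** k) * np.pi
                mod += 0.1 * (np.sin(w * dy) + np.sin(w * dx))
            out[i, j] = feat[yi, xi] * mod
    return out
```

Because the modulation is a function of the continuous offset rather than a learned per-scale kernel, the same mechanism serves every output size, which is what makes the upsampler scale-agnostic; in ITSRN++ the modulation and the feature lookup are learned transformer components rather than this fixed closed form.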
Related papers
- Task-Aware Dynamic Transformer for Efficient Arbitrary-Scale Image Super-Resolution [8.78015409192613]
Arbitrary-scale super-resolution (ASSR) aims to learn a single model for image super-resolution at arbitrary magnifying scales.
Existing ASSR networks typically comprise an off-the-shelf scale-agnostic feature extractor and an arbitrary scale upsampler.
We propose a Task-Aware Dynamic Transformer (TADT) as an input-adaptive feature extractor for efficient image ASSR.
arXiv Detail & Related papers (2024-08-16T13:35:52Z)
- HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution [70.52256118833583]
We present a strategy to convert transformer-based SR networks into hierarchical transformers (HiT-SR).
Specifically, we first replace the commonly used fixed small windows with expanding hierarchical windows to aggregate features at different scales.
Considering the intensive computation required for large windows, we further design a spatial-channel correlation method with linear complexity with respect to window size.
arXiv Detail & Related papers (2024-07-08T12:42:10Z)
- LIPT: Latency-aware Image Processing Transformer [17.802838753201385]
We present a latency-aware image processing transformer, termed LIPT.
We devise the low-latency proportion LIPT block that substitutes memory-intensive operators with the combination of self-attention and convolutions to achieve practical speedup.
arXiv Detail & Related papers (2024-04-09T07:25:30Z)
- Dual Aggregation Transformer for Image Super-Resolution [92.41781921611646]
We propose a novel Transformer model, the Dual Aggregation Transformer (DAT), for image SR.
Our DAT aggregates features across spatial and channel dimensions, in the inter-block and intra-block dual manner.
Our experiments show that our DAT surpasses current methods.
arXiv Detail & Related papers (2023-08-07T07:39:39Z)
- Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block.
The proposed method is $3\times$ smaller than state-of-the-art efficient SR methods.
arXiv Detail & Related papers (2023-02-27T14:19:31Z)
- Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images [64.84260544255477]
Transformer-based models have demonstrated outstanding potential for learning high-order contextual relationships from natural images at general resolution (224x224 pixels).
We propose a complex self-attention (CSA) mechanism to model high-order contextual information with less than half the computation of naive SA.
By stacking various layers of CSA blocks, we propose the Fourier Complex Transformer (FCT) model to learn global contextual information from VHR aerial images.
arXiv Detail & Related papers (2022-10-28T08:13:33Z)
- Hybrid Pixel-Unshuffled Network for Lightweight Image Super-Resolution [64.54162195322246]
Convolutional neural networks (CNNs) have achieved great success in image super-resolution (SR).
However, most deep CNN-based SR models require massive computation to obtain high performance.
We propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task.
arXiv Detail & Related papers (2022-03-16T20:10:41Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the detailed spatial information captured by CNNs with the global context provided by transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Implicit Transformer Network for Screen Content Image Continuous Super-Resolution [27.28782217250359]
High-resolution (HR) screen contents may be downsampled and compressed.
Super-resolution (SR) of low-resolution (LR) screen content images (SCIs) is highly demanded by the HR display or by the users to zoom in for detail observation.
We propose a novel Implicit Transformer Super-Resolution Network (ITSRN) for SCI SR.
arXiv Detail & Related papers (2021-12-12T07:39:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.