S2R: Exploring a Double-Win Transformer-Based Framework for Ideal and
Blind Super-Resolution
- URL: http://arxiv.org/abs/2308.08142v1
- Date: Wed, 16 Aug 2023 04:27:44 GMT
- Authors: Minghao She, Wendong Mao, Huihong Shi and Zhongfeng Wang
- Abstract summary: A light-weight transformer-based SR model (S2R transformer) and a novel coarse-to-fine training strategy are proposed.
The proposed S2R outperforms other single-image SR models under ideal SR conditions with only 578K parameters.
It achieves better visual results than regular blind SR models under blind fuzzy conditions with only 10 gradient updates.
- Score: 5.617008573997855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, deep learning based methods have demonstrated impressive performance on ideal super-resolution (SR) datasets, but most of these methods incur dramatic performance drops when applied directly to real-world SR reconstruction tasks with unpredictable blur kernels. To tackle this issue, blind SR methods have been proposed to improve visual results on random blur kernels, but they in turn produce unsatisfactory reconstructions on ideal low-resolution images. In this paper, we propose a double-win framework for the ideal and blind SR tasks, named S2R, comprising a light-weight transformer-based SR model (the S2R transformer) and a novel coarse-to-fine training strategy, which achieves excellent visual results under both ideal and random fuzzy conditions. At the algorithm level, the S2R transformer combines several efficient, light-weight blocks to enhance the representation ability of the extracted features with a relatively low number of parameters. As for the training strategy, a coarse-level learning process is first performed to improve the generalization of the network with the help of a large-scale external dataset; then a fast fine-tune process transfers the pre-trained model to real-world SR tasks by mining the internal features of the image. Experimental results show that the proposed S2R outperforms other single-image SR models under the ideal SR condition with only 578K parameters. Meanwhile, it achieves better visual results than regular blind SR models under blind fuzzy conditions with only 10 gradient updates, improving convergence speed by 300 times and significantly accelerating the transfer-learning process in real-world situations.
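The fast fine-tune stage described in the abstract adapts the pre-trained network using only internal features of the single test image, in roughly 10 gradient updates. Below is a minimal sketch of that internal-learning loop, assuming a simple box-filter degradation and substituting a two-parameter affine corrector for the actual S2R transformer; all function and parameter names here are illustrative, not from the paper.

```python
import numpy as np

def box_downscale(img, s=2):
    """Average-pool downscale; stands in for an unknown blur-plus-decimation."""
    h, w = img.shape
    img = img[:h - h % s, :w - w % s]
    return img.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def nearest_upscale(img, s=2):
    """Naive nearest-neighbor upscaling by factor s."""
    return np.repeat(np.repeat(img, s, axis=0), s, axis=1)

def fast_finetune(lr_img, steps=10, eta=0.5):
    # Internal learning: degrade the test image once more to get an "LR-son",
    # so (upscaled son, original LR) forms a training pair mined from the
    # image itself. The paper adapts the S2R transformer this way; here a
    # two-parameter affine corrector (gain a, bias b) stands in so the loop
    # stays self-contained.
    son = box_downscale(lr_img)
    x = nearest_upscale(son)               # naive restoration of the son
    y = lr_img[:x.shape[0], :x.shape[1]]   # its ground truth: the LR input
    a, b = 1.0, 0.0
    for _ in range(steps):                 # only ~10 updates, per the abstract
        err = a * x + b - y
        a -= eta * 2.0 * np.mean(err * x)  # dL/da for L = mean(err**2)
        b -= eta * 2.0 * np.mean(err)      # dL/db
    return a, b                            # then applied to an upscale of lr_img
```

The key design point, shared with zero-shot internal-learning SR methods, is that no external data is touched at test time: the supervision signal comes entirely from the image's own cross-scale recurrence.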
Related papers
- Efficient Test-Time Adaptation for Super-Resolution with Second-Order Degradation and Reconstruction [62.955327005837475]
Image super-resolution (SR) aims to learn a mapping from low-resolution (LR) to high-resolution (HR) using paired HR-LR training images.
We present an efficient test-time adaptation framework for SR, named SRTTA, which is able to quickly adapt SR models to test domains with different/unknown degradation types.
arXiv Detail & Related papers (2023-10-29T13:58:57Z)
- DCS-RISR: Dynamic Channel Splitting for Efficient Real-world Image Super-Resolution [15.694407977871341]
Real-world image super-resolution (RISR) has received increased focus for improving the quality of SR images under unknown complex degradation.
Existing methods rely on heavy SR models to enhance low-resolution (LR) images across different degradation levels.
We propose a novel Dynamic Channel Splitting scheme for efficient Real-world Image Super-Resolution, termed DCS-RISR.
arXiv Detail & Related papers (2022-12-15T04:34:57Z)
- The Best of Both Worlds: a Framework for Combining Degradation Prediction with High Performance Super-Resolution Networks [14.804000317612305]
We present a framework for combining a blind SR prediction mechanism with any deep SR network.
We show that our hybrid models consistently achieve stronger SR performance than both their non-blind and blind counterparts.
arXiv Detail & Related papers (2022-11-09T16:49:35Z)
- Blind Super-Resolution for Remote Sensing Images via Conditional Stochastic Normalizing Flows [14.882417028542855]
We propose a novel blind SR framework based on the normalizing flow (BlindSRSNF) to address the above problems.
BlindSRSNF learns the conditional probability distribution over the high-resolution image space given a low-resolution (LR) image by explicitly optimizing the variational bound on the likelihood.
We show that the proposed algorithm can obtain SR results with excellent visual perception quality on both simulated LR and real-world RSIs.
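For orientation on the BlindSRSNF objective summarized above: a plain conditional normalizing flow is trained by exact likelihood via the change-of-variables formula below, and the stochastic variant replaces this exact likelihood with a variational bound. The precise bound is not given in this summary, and $f_\theta$, $p_Z$ are generic symbols rather than the paper's notation.

```latex
\log p_\theta(x \mid y) \;=\; \log p_Z\big(f_\theta(x; y)\big)
\;+\; \log \left| \det \frac{\partial f_\theta(x; y)}{\partial x} \right|
```

Here $x$ is the high-resolution image, $y$ the low-resolution conditioning image, and $f_\theta$ the invertible flow.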
arXiv Detail & Related papers (2022-10-14T12:37:32Z)
- Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
In this work, a simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model.
It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z)
- Real-World Image Super-Resolution by Exclusionary Dual-Learning [98.36096041099906]
Real-world image super-resolution is a practical image restoration problem that aims to obtain high-quality images from in-the-wild input.
Deep learning-based methods have achieved promising restoration quality on real-world image super-resolution datasets.
We propose Real-World image Super-Resolution by Exclusionary Dual-Learning (RWSR-EDL) to address the feature diversity in perceptual- and L1-based cooperative learning.
arXiv Detail & Related papers (2022-06-06T13:28:15Z)
- DynaVSR: Dynamic Adaptive Blind Video Super-Resolution [60.154204107453914]
DynaVSR is a novel meta-learning-based framework for real-world video SR.
We train a multi-frame downscaling module with various types of synthetic blur kernels, which is seamlessly combined with a video SR network for input-aware adaptation.
Experimental results show that DynaVSR consistently improves the performance of the state-of-the-art video SR models by a large margin.
arXiv Detail & Related papers (2020-11-09T15:07:32Z)
- Joint Generative Learning and Super-Resolution For Real-World Camera-Screen Degradation [6.14297871633911]
In real-world single image super-resolution (SISR) task, the low-resolution image suffers more complicated degradations.
In this paper, we focus on camera-screen degradation and build a real-world dataset (Cam-ScreenSR).
We propose a joint two-stage model. First, a downsampling degradation GAN (DD-GAN) is trained to model the degradation and produce more varied LR images.
Then a dual residual channel attention network (DuRCAN) learns to recover the SR image.
arXiv Detail & Related papers (2020-08-01T07:10:13Z)
- DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution [69.2432352477966]
Real image super-resolution (Real-SR) focuses on the relationship between real-world high-resolution (HR) and low-resolution (LR) images.
In this article, we propose a Dual-path Dynamic Enhancement Network (DDet) for Real-SR.
Unlike conventional methods which stack up massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pairs.
arXiv Detail & Related papers (2020-02-25T18:24:51Z)
- Characteristic Regularisation for Super-Resolving Face Images [81.84939112201377]
Existing facial image super-resolution (SR) methods focus mostly on improving artificially down-sampled low-resolution (LR) imagery.
Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data.
This overstretches the model with two tasks: making the visual characteristics consistent and enhancing the image resolution.
We formulate a method that joins the advantages of conventional SR and UDA models.
arXiv Detail & Related papers (2019-12-30T16:27:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.