Task-Aware Dynamic Transformer for Efficient Arbitrary-Scale Image Super-Resolution
- URL: http://arxiv.org/abs/2408.08736v2
- Date: Sun, 25 Aug 2024 12:00:05 GMT
- Title: Task-Aware Dynamic Transformer for Efficient Arbitrary-Scale Image Super-Resolution
- Authors: Tianyi Xu, Yiji Zhou, Xiaotao Hu, Kai Zhang, Anran Zhang, Xingye Qiu, Jun Xu
- Abstract summary: Arbitrary-scale super-resolution (ASSR) aims to learn a single model for image super-resolution at arbitrary magnifying scales.
Existing ASSR networks typically comprise an off-the-shelf scale-agnostic feature extractor and an arbitrary scale upsampler.
We propose a Task-Aware Dynamic Transformer (TADT) as an input-adaptive feature extractor for efficient image ASSR.
- Score: 8.78015409192613
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Arbitrary-scale super-resolution (ASSR) aims to learn a single model for image super-resolution at arbitrary magnifying scales. Existing ASSR networks typically comprise an off-the-shelf scale-agnostic feature extractor and an arbitrary-scale upsampler. These feature extractors often use fixed network architectures to address different ASSR inference tasks, each of which is characterized by an input image and an upsampling scale. However, this overlooks the varying difficulty of super-resolution across inference scenarios, where simple images or small SR scales can be resolved with less computational effort than difficult images or large SR scales. To tackle this difficulty variability, in this paper we propose a Task-Aware Dynamic Transformer (TADT) as an input-adaptive feature extractor for efficient image ASSR. Our TADT consists of a multi-scale feature extraction backbone built upon groups of Multi-Scale Transformer Blocks (MSTBs) and a Task-Aware Routing Controller (TARC). The TARC predicts the inference paths within the feature extraction backbone, specifically selecting MSTBs based on the input images and SR scales. The prediction of inference paths is guided by a new loss function that trades off SR accuracy and efficiency. Experiments demonstrate that, when working with three popular arbitrary-scale upsamplers, our TADT achieves state-of-the-art ASSR performance compared with mainstream feature extractors, at relatively lower computational cost. The code will be publicly released.
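A minimal sketch of the routing idea described above, assuming a PyTorch-style formulation: a router conditioned on the low-resolution input and the target SR scale emits per-block gates over the MSTBs, and the training loss balances reconstruction error against the expected compute of the selected path. The names (TaskAwareRouter, routing_loss) and the specific gating and penalty choices are illustrative assumptions, not the paper's released implementation.

```python
# Minimal sketch of task-aware routing (assumed names; not the official TADT code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class TaskAwareRouter(nn.Module):
    """Predicts per-MSTB execution gates from the LR image and the SR scale."""

    def __init__(self, num_blocks: int, feat_dim: int = 64):
        super().__init__()
        # Compact image descriptor: strided conv + global average pooling.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.GELU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # +1 input feature for the scalar upsampling scale.
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, feat_dim),
            nn.GELU(),
            nn.Linear(feat_dim, num_blocks),
        )

    def forward(self, lr_image: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
        desc = self.image_encoder(lr_image)                    # (B, feat_dim)
        logits = self.mlp(torch.cat([desc, scale[:, None]], dim=1))
        return torch.sigmoid(logits)                           # (B, num_blocks), gates in [0, 1]


def routing_loss(sr, hr, gates, flops_per_block, lambda_eff=1e-2):
    """L1 reconstruction plus a penalty on the expected compute of the chosen path."""
    rec = F.l1_loss(sr, hr)
    expected_cost = (gates * flops_per_block).sum(dim=1).mean()
    return rec + lambda_eff * expected_cost


if __name__ == "__main__":
    router = TaskAwareRouter(num_blocks=8)
    lr = torch.randn(2, 3, 48, 48)                             # a batch of LR inputs
    scale = torch.tensor([2.0, 3.7])                           # arbitrary SR scales
    gates = router(lr, scale)                                  # which blocks to emphasize or skip
    print(gates.shape)                                         # torch.Size([2, 8])
```

At inference the gates could be binarized (e.g., by thresholding, or via a Gumbel-style relaxation during training) so that unselected MSTBs are skipped entirely; the paper's exact selection mechanism and loss may differ.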
Related papers
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z) - Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block.
The proposed method is $3\times$ smaller than state-of-the-art efficient SR methods.
arXiv Detail & Related papers (2023-02-27T14:19:31Z) - Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images [64.84260544255477]
Transformer-based models have demonstrated outstanding potential for learning high-order contextual relationships from natural images of general resolution (224x224 pixels).
We propose a complex self-attention (CSA) mechanism to model high-order contextual information with less than half the computation of naive self-attention (SA).
By stacking various layers of CSA blocks, we propose the Fourier Complex Transformer (FCT) model to learn global contextual information from VHR aerial images.
arXiv Detail & Related papers (2022-10-28T08:13:33Z) - ITSRN++: Stronger and Better Implicit Transformer Network for Continuous Screen Content Image Super-Resolution [32.441761727608856]
The proposed method achieves state-of-the-art performance for SCI SR (outperforming SwinIR by 0.74 dB for x3 SR) and also works well for natural image SR.
We construct a large-scale SCI2K dataset to facilitate research on SCI SR.
arXiv Detail & Related papers (2022-10-17T07:47:34Z) - Lightweight Stepless Super-Resolution of Remote Sensing Images via Saliency-Aware Dynamic Routing Strategy [15.587621728422414]
Deep learning algorithms have greatly improved the performance of remote sensing image (RSI) super-resolution (SR).
However, increasing network depth and parameters cause a huge burden of computing and storage.
We propose a saliency-aware dynamic routing network (SalDRN) for lightweight and stepless SR of RSIs.
arXiv Detail & Related papers (2022-10-14T07:49:03Z) - Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
In this work, a simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model.
It is shown to achieve state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z) - Efficient Long-Range Attention Network for Image Super-resolution [25.51377161557467]
We propose an efficient long-range attention network (ELAN) for image super-resolution (SR).
We first employ shift convolution (shift-conv) to effectively extract local structural information from the image while maintaining the same level of complexity as 1x1 convolution.
A highly efficient long-range attention block (ELAB) is then built by simply cascading two shift-conv layers with a group-wise multi-scale self-attention (GMSA) module.
arXiv Detail & Related papers (2022-03-13T16:17:48Z) - Scale-Aware Dynamic Network for Continuous-Scale Super-Resolution [16.67263192454279]
We propose a scale-aware dynamic network (SADN) for continuous-scale SR.
First, we propose a scale-aware dynamic convolutional (SAD-Conv) layer for the feature learning of multiple SR tasks with various scales.
Second, we devise a continuous-scale upsampling module (CSUM) with the multi-bilinear local implicit function (MBLIF) for any-scale upsampling.
arXiv Detail & Related papers (2021-10-29T09:57:48Z) - SRWarp: Generalized Image Super-Resolution under Arbitrary Transformation [65.88321755969677]
Deep CNNs have achieved significant successes in image processing and its applications, including single image super-resolution.
Recent approaches extend the scope to real-valued upsampling factors.
We propose the SRWarp framework to further generalize the SR tasks toward an arbitrary image transformation.
arXiv Detail & Related papers (2021-04-21T02:50:41Z) - ASDN: A Deep Convolutional Network for Arbitrary Scale Image Super-Resolution [6.672537252929814]
Image viewer applications commonly allow users to zoom the images to arbitrary magnification scales.
This paper employs a Laplacian pyramid method to reconstruct any-scale high-resolution (HR) images using the high-frequency image details.
arXiv Detail & Related papers (2020-10-06T01:18:46Z) - DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution [69.2432352477966]
Real image super-resolution (Real-SR) focuses on the relationship between real-world high-resolution (HR) and low-resolution (LR) images.
In this article, we propose a Dual-path Dynamic Enhancement Network(DDet) for Real-SR.
Unlike conventional methods, which stack up massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pairs.
arXiv Detail & Related papers (2020-02-25T18:24:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.