Related papers: Image Super-Resolution using Efficient Striped Window Transformer

URL: http://arxiv.org/abs/2301.09869v1
Date: Tue, 24 Jan 2023 09:09:35 GMT
Title: Image Super-Resolution using Efficient Striped Window Transformer
Authors: Jinpeng Shi, Hui Li, Tianle Liu, Yulong Liu, Mingjian Zhang, Jinchen Zhu, Ling Zheng, Shizhuang Weng
Abstract summary: In this paper, we propose an efficient striped window transformer (ESWT). ESWT consists of efficient transformation layers (ETLs), allowing a clean structure and avoiding redundant operations. To further exploit the potential of the transformer, we propose a novel flexible window training strategy.
Score: 6.815956004383743
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, transformer-based methods have made impressive progress in single-image super-resolution (SR). However, these methods are difficult to apply to lightweight SR (LSR) due to the challenge of balancing model performance and complexity. In this paper, we propose an efficient striped window transformer (ESWT). ESWT consists of efficient transformation layers (ETLs), allowing a clean structure and avoiding redundant operations. Moreover, we designed a striped window mechanism to obtain a more efficient ESWT in modeling long-term dependencies. To further exploit the potential of the transformer, we propose a novel flexible window training strategy. Without any additional cost, this strategy can further improve the performance of ESWT. Extensive experiments show that the proposed method outperforms state-of-the-art transformer-based LSR methods with fewer parameters, faster inference, smaller FLOPs, and less memory consumption, achieving a better trade-off between model performance and complexity.

Related papers

Structural Similarity-Inspired Unfolding for Lightweight Image Super-Resolution [88.20464308588889]
We propose a Structural Similarity-Inspired Unfolding (SSIU) method for efficient image SR. This method is designed through unfolding an SR optimization function constrained by structural similarity. Our model outperforms current state-of-the-art models, boasting lower parameter counts and reduced memory consumption.
arXiv Detail & Related papers (2025-06-13T14:29:40Z)
Efficient Attention-Sharing Information Distillation Transformer for Lightweight Single Image Super-Resolution [23.265907475054156]
Transformer-based Super-Resolution (SR) methods have demonstrated superior performance compared to convolutional neural network (CNN)-based SR approaches. We propose a lightweight SR network that integrates attention-sharing and an information distillation structure specifically designed for Transformer-based SR methods.
arXiv Detail & Related papers (2025-01-27T04:46:58Z)
Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition [10.302458835329539]
We introduce a new method, namely Transformer Re-parameterization, to boost the performance of lightweight Transformer models. Experimental results show that our proposed method consistently improves the performance of lightweight Transformers, even making them comparable to large models.
arXiv Detail & Related papers (2024-11-14T10:36:19Z)
Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345]
We design an effective diffusion transformer for image super-resolution (DiT-SR). In practice, DiT-SR leverages an overall U-shaped architecture, and adopts a uniform isotropic design for all the transformer blocks. We analyze the limitation of the widely used AdaLN, and present a frequency-adaptive time-step conditioning module.
arXiv Detail & Related papers (2024-09-29T07:14:16Z)
HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution [70.52256118833583]
We present a strategy to convert transformer-based SR networks to hierarchical transformers (HiT-SR). Specifically, we first replace the commonly used fixed small windows with expanding hierarchical windows to aggregate features at different scales. Considering the intensive computation required for large windows, we further design a spatial-channel correlation method with linear complexity to window sizes.
arXiv Detail & Related papers (2024-07-08T12:42:10Z)
Linearly-evolved Transformer for Pan-sharpening [34.06189165260206]
Vision transformer family has dominated the satellite pan-sharpening field driven by the global-wise spatial information modeling mechanism. Standard modeling rules within these promising pan-sharpening methods are to roughly stack the transformer variants in a cascaded manner. We propose an efficient linearly-evolved transformer variant and employ it to construct a lightweight pan-sharpening framework.
arXiv Detail & Related papers (2024-04-19T11:38:34Z)
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation [67.13876021157887]
Dynamic Tuning (DyT) is a novel approach to improve both parameter and inference efficiency for ViT adaptation. DyT achieves superior performance compared to existing PEFT methods while evoking only 71% of their FLOPs on the VTAB-1K benchmark.
arXiv Detail & Related papers (2024-03-18T14:05:52Z)
Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR). CFSR inherits the advantages of both convolution-based and transformer-based approaches. Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
Unfolding Once is Enough: A Deployment-Friendly Transformer Unit for Super-Resolution [16.54421804141835]
High resolution of intermediate features in SISR models increases memory and computational requirements. We propose a Deployment-friendly Inner-patch Transformer Network (DITN) for the SISR task. Our models can achieve competitive results in terms of qualitative and quantitative performance with high deployment efficiency.
arXiv Detail & Related papers (2023-08-05T05:42:51Z)
RWKV: Reinventing RNNs for the Transformer Era [54.716108899349614]
We propose a novel model architecture that combines the efficient parallelizable training of transformers with the efficient inference of RNNs. We scale our models as large as 14 billion parameters, by far the largest dense RNN ever trained, and find RWKV performs on par with similarly sized Transformers.
arXiv Detail & Related papers (2023-05-22T13:57:41Z)
Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks. We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers. Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.