Cross-receptive Focused Inference Network for Lightweight Image
Super-Resolution
- URL: http://arxiv.org/abs/2207.02796v2
- Date: Sat, 29 Apr 2023 07:37:39 GMT
- Title: Cross-receptive Focused Inference Network for Lightweight Image
Super-Resolution
- Authors: Wenjie Li, Juncheng Li, Guangwei Gao, Jiantao Zhou, Jian Yang, and
Guo-Jun Qi
- Abstract summary: Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
However, the capability of Transformers to incorporate contextual information and extract features dynamically has been neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) consisting of a cascade of CT Blocks that mix CNN and Transformer components.
- Score: 64.25751738088015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, Transformer-based methods have shown impressive performance in
single image super-resolution (SISR) tasks due to their ability to extract
global features. However, the capability of Transformers to incorporate
contextual information and extract features dynamically has been neglected. To
address this issue, we propose a lightweight Cross-receptive Focused Inference
Network (CFIN) that consists of a cascade of CT Blocks mixing CNN and
Transformer components. Specifically, in the CT Block, we first propose a
CNN-based Cross-Scale Information Aggregation Module (CIAM) that lets the model
focus on potentially helpful information, improving the efficiency of the
subsequent Transformer stage. We then design a novel Cross-receptive Field Guided
Transformer (CFGT) that selects the contextual information required for
reconstruction using a modulated convolutional kernel, which understands the
current semantic information and exploits the information interaction within
different self-attention layers. Extensive experiments show that the proposed
CFIN effectively reconstructs images using contextual information and strikes a
good balance between computational cost and model performance. Source code will
be available at https://github.com/IVIPLab/CFIN.
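As a rough illustration of the cascade described in the abstract, the sketch below shows a minimal CNN-plus-Transformer block in PyTorch. This is an assumption-laden toy, not the authors' CFIN implementation (see the repository above for the official code): the module names `CrossScaleAggregation` and `ModulatedAttention`, the channel sizes, and the sigmoid-gated modulation scheme are all illustrative stand-ins for the CIAM and CFGT components.

```python
# Minimal sketch of a CNN + Transformer "CT"-style block.
# NOTE: illustrative assumption only, not the official CFIN code
# (see https://github.com/IVIPLab/CFIN for the authors' implementation).
import torch
import torch.nn as nn

class CrossScaleAggregation(nn.Module):
    """Toy stand-in for the CNN-based CIAM: fuse two receptive fields."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv5 = nn.Conv2d(channels, channels, 5, padding=2)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x):
        # Aggregate information from two scales, then fuse back with a residual.
        return x + self.fuse(torch.cat([self.conv3(x), self.conv5(x)], dim=1))

class ModulatedAttention(nn.Module):
    """Toy stand-in for the CFGT idea: self-attention whose input is
    modulated by a kernel predicted from the current features."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.kernel_pred = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        b, c, h, w = x.shape
        # Predict a per-position modulation from the current (semantic) features.
        mod = torch.sigmoid(self.kernel_pred(x))
        tokens = (x * mod).flatten(2).transpose(1, 2)  # (B, H*W, C)
        out, _ = self.attn(tokens, tokens, tokens)
        return x + out.transpose(1, 2).reshape(b, c, h, w)

class CTBlock(nn.Module):
    """One CNN stage followed by one Transformer stage, as in the cascade."""
    def __init__(self, channels: int):
        super().__init__()
        self.cnn_stage = CrossScaleAggregation(channels)       # CIAM-like
        self.transformer_stage = ModulatedAttention(channels)  # CFGT-like

    def forward(self, x):
        return self.transformer_stage(self.cnn_stage(x))

if __name__ == "__main__":
    block = CTBlock(32)
    print(block(torch.randn(1, 32, 24, 24)).shape)  # torch.Size([1, 32, 24, 24])
```

The ordering (CNN first, attention second) follows the abstract's claim that CIAM filters for potentially helpful information before the Transformer stage; everything else is a design guess.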
Related papers
- UTSRMorph: A Unified Transformer and Superresolution Network for Unsupervised Medical Image Registration [4.068692674719378]
Complex image registration is a key issue in medical image analysis.
We propose a novel unsupervised image registration method named the unified Transformer and superresolution (UTSRMorph) network.
arXiv Detail & Related papers (2024-10-27T06:28:43Z)
- DRCT: Saving Image Super-resolution away from Information Bottleneck [7.765333471208582]
Vision Transformer-based approaches for low-level vision tasks have achieved widespread success.
The Dense-residual-connected Transformer (DRCT) is proposed to mitigate the loss of spatial information.
Our approach surpasses state-of-the-art methods on benchmark datasets.
arXiv Detail & Related papers (2024-03-31T15:34:45Z)
- How Powerful Potential of Attention on Image Restoration? [97.9777639562205]
We conduct an empirical study to explore the potential of attention mechanisms without using FFN.
We propose Continuous Scaling Attention (CSAttn), a method that computes attention continuously in three stages without an FFN.
Our designs provide a closer look at the attention mechanism and reveal that simple operations can significantly affect model performance (a generic FFN-free sketch appears after this list).
arXiv Detail & Related papers (2024-03-15T14:23:12Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on a Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- HAT: Hybrid Attention Transformer for Image Restoration [61.74223315807691]
Transformer-based methods have shown impressive performance in image restoration tasks, such as image super-resolution and denoising.
We propose a new Hybrid Attention Transformer (HAT) to activate more input pixels for better restoration.
Our HAT achieves state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-09-11T05:17:55Z)
- Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network for Remote Sensing Image Super-Resolution [13.894645293832044]
Transformer-based models have shown competitive performance in remote sensing image super-resolution (RSISR).
We propose a novel transformer architecture called Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network (SPIFFNet) for RSISR.
Our proposed model effectively enhances global cognition and understanding of the entire image, facilitating efficient cross-stage feature integration.
arXiv Detail & Related papers (2023-07-06T13:19:06Z)
- Activating More Pixels in Image Super-Resolution Transformer [53.87533738125943]
Transformer-based methods have shown impressive performance in low-level vision tasks, such as image super-resolution.
We propose a novel Hybrid Attention Transformer (HAT) to activate more input pixels for better reconstruction.
Our overall method significantly outperforms the state-of-the-art methods by more than 1dB.
arXiv Detail & Related papers (2022-05-09T17:36:58Z)
- Lightweight Bimodal Network for Single-Image Super-Resolution via Symmetric CNN and Recursive Transformer [27.51790638626891]
Single-image super-resolution (SISR) has achieved significant breakthroughs with the development of deep learning.
However, advanced methods are often too computationally expensive for practical deployment; to address this, we propose a Lightweight Bimodal Network (LBNet) for SISR.
Specifically, an effective Symmetric CNN is designed for local feature extraction and coarse image reconstruction.
arXiv Detail & Related papers (2022-04-28T04:43:22Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that combines the detailed spatial information captured by CNNs with the global context provided by Transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
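For intuition on the FFN-free design mentioned in the CSAttn entry above, here is a generic sketch of a Transformer block that stacks several attention stages with no feed-forward network in between. This is a hypothetical illustration of the general idea, not that paper's actual architecture; the class name `AttentionOnlyBlock`, the pre-norm layout, and the three-stage default are assumptions.

```python
# Generic sketch of an FFN-free attention block (an assumption inspired by
# the CSAttn summary above, not the paper's actual design).
import torch
import torch.nn as nn

class AttentionOnlyBlock(nn.Module):
    """Stacks several self-attention stages with no feed-forward network."""
    def __init__(self, dim: int, heads: int = 4, stages: int = 3):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.MultiheadAttention(dim, heads, batch_first=True)
            for _ in range(stages)
        )
        self.norms = nn.ModuleList(nn.LayerNorm(dim) for _ in range(stages))

    def forward(self, tokens):  # tokens: (B, N, dim)
        for norm, attn in zip(self.norms, self.stages):
            h = norm(tokens)
            out, _ = attn(h, h, h)
            tokens = tokens + out  # residual connection; no FFN between stages
        return tokens

if __name__ == "__main__":
    x = torch.randn(2, 64, 32)  # batch of 2, 64 tokens, dim 32
    print(AttentionOnlyBlock(32)(x).shape)  # torch.Size([2, 64, 32])
```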