Large coordinate kernel attention network for lightweight image super-resolution
- URL: http://arxiv.org/abs/2405.09353v2
- Date: Fri, 30 Aug 2024 12:41:06 GMT
- Title: Large coordinate kernel attention network for lightweight image super-resolution
- Authors: Fangwei Hao, Jiesheng Wu, Haotian Lu, Ji Du, Jing Xu, Xiaoxuan Xu,
- Abstract summary: We propose multi-scale blueprint separable convolutions (MBSConv) as a highly efficient building block with a multi-scale receptive field.
We also propose a large coordinate kernel attention (LCKA) module which decomposes the 2D convolutional kernels of the depth-wise convolutional layers in LKA into horizontal and vertical 1-D kernels.
- Score: 5.66935513638074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The multi-scale receptive field and the large kernel attention (LKA) module have been shown to significantly improve performance in the lightweight image super-resolution task. However, existing lightweight super-resolution (SR) methods seldom design an efficient building block with a multi-scale receptive field for local modeling, and their LKA modules face a quadratic increase in computational and memory footprints as the convolutional kernel size increases. To address the first issue, we propose multi-scale blueprint separable convolutions (MBSConv) as a highly efficient building block with a multi-scale receptive field; it focuses on learning multi-scale information, which is a vital component of discriminative representation. As for the second issue, we revisit the key properties of LKA and find that the adjacent direct interaction of local information and long-distance dependencies is crucial to its remarkable performance. Taking this into account, and to mitigate the complexity of LKA, we propose a large coordinate kernel attention (LCKA) module that decomposes the 2D convolutional kernels of the depth-wise convolutional layers in LKA into horizontal and vertical 1-D kernels. LCKA enables the adjacent direct interaction of local information and long-distance dependencies not only in the horizontal direction but also in the vertical one. Besides, LCKA allows the direct use of extremely large kernels in the depth-wise convolutional layers to capture more contextual information, which helps to significantly improve reconstruction performance while incurring lower computational complexity and memory footprints. Integrating MBSConv and LCKA, we propose the large coordinate kernel attention network (LCAN).
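The decomposition at the heart of LCKA (and of the related LSKA design listed below) can be illustrated with plain NumPy: cascading a k×1 and a 1×k depth-wise kernel reproduces a 2-D convolution whose k×k kernel is their outer product, while storing only 2k weights per channel instead of k². This is a minimal single-channel sketch of the idea, not the authors' implementation; the kernel size 7 and the random inputs are arbitrary choices for illustration.

```python
import numpy as np

def conv2d(x, k):
    """Naive 'valid' 2-D cross-correlation of a single-channel map x with kernel k."""
    kh, kw = k.shape
    h, w = x.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))
u = rng.standard_normal((7, 1))   # vertical 7x1 kernel
v = rng.standard_normal((1, 7))   # horizontal 1x7 kernel

full2d = conv2d(x, u @ v)          # one 7x7 kernel: 49 weights per channel
cascaded = conv2d(conv2d(x, u), v) # 7x1 then 1x7: 14 weights per channel

assert np.allclose(full2d, cascaded)
print(u.size + v.size, (u @ v).size)  # 14 vs 49 weights
```

The linear cost in k is what lets LCKA-style designs push the depth-wise kernel size much larger than a full 2-D kernel would allow; the restriction is that the cascaded pair only realizes rank-1 (separable) 2-D kernels.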
Related papers
- Efficient Learnable Collaborative Attention for Single Image Super-Resolution [18.955369476815136]
Non-Local Attention (NLA) is a powerful technique for capturing long-range feature correlations in deep single image super-resolution (SR).
We propose a novel Learnable Collaborative Attention (LCoA) that introduces inductive bias into non-local modeling.
Our LCoA can reduce the non-local modeling time by about 83% in the inference stage.
arXiv Detail & Related papers (2024-04-07T11:25:04Z) - Learning Enriched Features via Selective State Spaces Model for Efficient Image Deblurring [0.0]
Image deblurring aims to restore a high-quality image from its blurred counterpart.
We propose an efficient image deblurring network that leverages selective state spaces model to aggregate enriched and accurate features.
Experimental results demonstrate that the proposed method outperforms state-of-the-art approaches on widely used benchmarks.
arXiv Detail & Related papers (2024-03-29T10:40:41Z) - Low-Resolution Self-Attention for Semantic Segmentation [96.81482872022237]
We introduce the Low-Resolution Self-Attention (LRSA) mechanism to capture global context at a significantly reduced computational cost.
Our approach involves computing self-attention in a fixed low-resolution space regardless of the input image's resolution.
We demonstrate the effectiveness of our LRSA approach by building the LRFormer, a vision transformer with an encoder-decoder structure.
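The saving behind LRSA is easy to quantify: the attention score matrix has (H·W)² entries, so computing self-attention in a fixed low-resolution space caps that cost regardless of the input size. A back-of-the-envelope sketch (the 128×128 input and 16×16 low-resolution grid are assumed figures for illustration, not values from the paper):

```python
# Size of the attention score matrix for an H x W feature map:
# every token attends to every other token, giving (H*W)^2 entries.
def attn_scores(h: int, w: int) -> int:
    n = h * w
    return n * n

full = attn_scores(128, 128)  # attention at full resolution
low = attn_scores(16, 16)     # attention in a fixed 16x16 space
print(full // low)            # 4096x fewer score entries
```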
arXiv Detail & Related papers (2023-10-08T06:10:09Z) - Large Separable Kernel Attention: Rethinking the Large Kernel Attention Design in CNN [16.751500508997264]
We propose a family of Large Separable Kernel Attention modules, termed LSKA.
LSKA decomposes the 2D convolutional kernel of the depth-wise convolutional layer into cascaded horizontal and vertical 1-D kernels.
We demonstrate that the proposed LSKA design biases the VAN more toward the shape of the object than the texture with increasing kernel size.
arXiv Detail & Related papers (2023-09-04T08:38:11Z) - Omni Aggregation Networks for Lightweight Image Super-Resolution [42.252518645833696]
This work proposes two enhanced components under a new Omni-SR architecture.
First, an Omni Self-Attention (OSA) block is proposed based on dense interaction principle.
Second, a multi-scale interaction scheme is proposed to mitigate sub-optimal ERF.
arXiv Detail & Related papers (2023-04-20T12:05:14Z) - Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with various architectures to improve them, and it achieves a state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z) - Efficient Non-Local Contrastive Attention for Image Super-Resolution [48.093500219958834]
Non-Local Attention (NLA) brings significant improvement for Single Image Super-Resolution (SISR) by leveraging intrinsic feature correlation in natural images.
We propose a novel Efficient Non-Local Contrastive Attention (ENLCA) to perform long-range visual modeling and leverage more relevant non-local features.
arXiv Detail & Related papers (2022-01-11T05:59:09Z) - High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided DSR.
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z) - Sequential Hierarchical Learning with Distribution Transformation for Image Super-Resolution [83.70890515772456]
We build a sequential hierarchical learning super-resolution network (SHSR) for effective image SR.
We consider the inter-scale correlations of features, and devise a sequential multi-scale block (SMB) to progressively explore the hierarchical information.
Experimental results show that SHSR achieves quantitative performance and visual quality superior to state-of-the-art methods.
arXiv Detail & Related papers (2020-07-19T01:35:53Z) - Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI.
Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z) - Hybrid Multiple Attention Network for Semantic Segmentation in Aerial Images [24.35779077001839]
We propose a novel attention-based framework named Hybrid Multiple Attention Network (HMANet) to adaptively capture global correlations.
We introduce a simple yet effective region shuffle attention (RSA) module to reduce feature redundancy and improve the efficiency of the self-attention mechanism.
arXiv Detail & Related papers (2020-01-09T07:47:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.