CubeFormer: A Simple yet Effective Baseline for Lightweight Image Super-Resolution
- URL: http://arxiv.org/abs/2412.02234v1
- Date: Tue, 03 Dec 2024 08:02:26 GMT
- Title: CubeFormer: A Simple yet Effective Baseline for Lightweight Image Super-Resolution
- Authors: Jikai Wang, Huan Zheng, Jianbing Shen,
- Abstract summary: Lightweight image super-resolution (SR) methods aim at increasing the resolution and restoring the details of an image using a lightweight neural network.
Our analysis reveals that these methods are hindered by constrained feature diversity, which adversely impacts feature representation and detail recovery.
We propose a simple yet effective baseline called CubeFormer, designed to enhance feature richness by completing holistic information aggregation.
- Score: 55.94314421887744
- License:
- Abstract: Lightweight image super-resolution (SR) methods aim at increasing the resolution and restoring the details of an image using a lightweight neural network. However, current lightweight SR methods still suffer from inferior performance and unpleasant details. Our analysis reveals that these methods are hindered by constrained feature diversity, which adversely impacts feature representation and detail recovery. To respond this issue, we propose a simple yet effective baseline called CubeFormer, designed to enhance feature richness by completing holistic information aggregation. To be specific, we introduce cube attention, which expands 2D attention to 3D space, facilitating exhaustive information interactions, further encouraging comprehensive information extraction and promoting feature variety. In addition, we inject block and grid sampling strategies to construct intra-cube transformer blocks (Intra-CTB) and inter-cube transformer blocks (Inter-CTB), which perform local and global modeling, respectively. Extensive experiments show that our CubeFormer achieves state-of-the-art performance on commonly used SR benchmarks. Our source code and models will be publicly available.
Related papers
- Robust Hyperspectral Image Panshapring via Sparse Spatial-Spectral Representation [9.3350274016294]
S$3$RNet is a novel framework for hyperspectral image pansharpening.
It combines low-resolution hyperspectral images (LRHSI) with high-resolution multispectral images (HRMSI) through sparse spatial-spectral representation.
S$3$RNet achieves state-of-the-art performance across multiple evaluation metrics.
arXiv Detail & Related papers (2025-01-14T09:09:14Z) - Low-Resolution Self-Attention for Semantic Segmentation [93.30597515880079]
We introduce the Low-Resolution Self-Attention (LRSA) mechanism to capture global context at a significantly reduced computational cost.
Our approach involves computing self-attention in a fixed low-resolution space regardless of the input image's resolution.
We demonstrate the effectiveness of our LRSA approach by building the LRFormer, a vision transformer with an encoder-decoder structure.
arXiv Detail & Related papers (2023-10-08T06:10:09Z) - Rank-Enhanced Low-Dimensional Convolution Set for Hyperspectral Image
Denoising [50.039949798156826]
This paper tackles the challenging problem of hyperspectral (HS) image denoising.
We propose rank-enhanced low-dimensional convolution set (Re-ConvSet)
We then incorporate Re-ConvSet into the widely-used U-Net architecture to construct an HS image denoising method.
arXiv Detail & Related papers (2022-07-09T13:35:12Z) - Cross-receptive Focused Inference Network for Lightweight Image
Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
Transformers that need to incorporate contextual information to extract features dynamically are neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z) - Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration task.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z) - Spatial-Spectral Residual Network for Hyperspectral Image
Super-Resolution [82.1739023587565]
We propose a novel spectral-spatial residual network for hyperspectral image super-resolution (SSRNet)
Our method can effectively explore spatial-spectral information by using 3D convolution instead of 2D convolution, which enables the network to better extract potential information.
In each unit, we employ spatial and temporal separable 3D convolution to extract spatial and spectral information, which not only reduces unaffordable memory usage and high computational cost, but also makes the network easier to train.
arXiv Detail & Related papers (2020-01-14T03:34:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.