GLMHA A Guided Low-rank Multi-Head Self-Attention for Efficient Image Restoration and Spectral Reconstruction
- URL: http://arxiv.org/abs/2410.00380v1
- Date: Tue, 1 Oct 2024 04:07:48 GMT
- Title: GLMHA A Guided Low-rank Multi-Head Self-Attention for Efficient Image Restoration and Spectral Reconstruction
- Authors: Zaid Ilyas, Naveed Akhtar, David Suter, Syed Zulqarnain Gilani
- Abstract summary: We propose an instance-Guided Low-rank Multi-Head Self-Attention (GLMHA) to replace Channel-wise Self-Attention.
Unique to the proposed GLMHA is its ability to provide computational gain for both short and long input sequences.
Our results show up to a 7.7 Giga FLOPs reduction with 370K fewer parameters required to closely retain the original performance of the best-performing models.
- Score: 36.23508672036131
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Image restoration and spectral reconstruction are longstanding computer vision tasks. Currently, CNN-transformer hybrid models provide state-of-the-art performance for these tasks. The key common ingredient in the architectural designs of these models is Channel-wise Self-Attention (CSA). We first show that CSA is an overall low-rank operation. Then, we propose an instance-Guided Low-rank Multi-Head Self-Attention (GLMHA) to replace the CSA for a considerable computational gain while closely retaining the original model performance. Unique to the proposed GLMHA is its ability to provide computational gain for both short and long input sequences. In particular, the gain is in terms of both Floating Point Operations (FLOPs) and parameter count reduction. This is in contrast to the existing popular computational complexity reduction techniques, e.g., Linformer, Performer, and Reformer, for which the FLOPs overhead of the efficient design itself outweighs the savings on shorter input sequences. Moreover, parameter reduction remains unaccounted for in the existing methods. We perform an extensive evaluation for the tasks of spectral reconstruction from RGB images, spectral reconstruction from snapshot compressive imaging, motion deblurring, and image deraining by enhancing the best-performing models with our GLMHA. Our results show up to a 7.7 Giga FLOPs reduction with 370K fewer parameters required to closely retain the original performance of the best-performing models that employ CSA.
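The core idea of the abstract can be sketched in NumPy. Note this is not the authors' GLMHA (which predicts instance-guided low-rank factors rather than computing an SVD); it is a generic rank-r truncated-SVD stand-in that illustrates the claim: if the C x C channel attention map is approximately low-rank, it can be replaced by rank-r factors, cutting the cost of applying it from O(C^2 N) to O(C r N). The 1x1-convolution projections of the real models are omitted for brevity.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def csa(x):
    """Channel-wise self-attention: the attention map is C x C
    (over channels), not N x N (over spatial positions).
    x: (C, N) -- C channels, N flattened spatial positions.
    Query/key/value projections are omitted; q = k = normalized x."""
    q = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    attn = softmax(q @ q.T)        # (C, C) channel attention map
    return attn @ x                # (C, N), costs O(C^2 * N)

def low_rank_csa(x, r):
    """Generic rank-r replacement: approximate the (C, C) attention
    map by its best rank-r factorization (truncated SVD). GLMHA
    instead learns/predicts such factors from the input instance;
    this sketch only shows why a low-rank stand-in can work."""
    q = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    attn = softmax(q @ q.T)        # (C, C)
    u, s, vt = np.linalg.svd(attn)
    a = u[:, :r] * s[:r]           # (C, r) left factor, scaled
    b = vt[:r]                     # (r, C) right factor
    # Factorized application: O(C * r * N) instead of O(C^2 * N).
    return a @ (b @ x)
```

Because CSA's map lives over channels, its size is independent of sequence length, which is why a low-rank replacement here can save FLOPs even for short inputs, unlike sequence-length-oriented schemes such as Linformer.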
Related papers
- Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration.
We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z) - Boosting Image Restoration via Priors from Pre-trained Models [54.83907596825985]
We learn an additional lightweight module called Pre-Train-Guided Refinement Module (PTG-RM) to refine restoration results of a target restoration network with OSF.
PTG-RM effectively enhances restoration performance of various models across different tasks, including low-light enhancement, deraining, deblurring, and denoising.
arXiv Detail & Related papers (2024-03-11T15:11:57Z) - HIR-Diff: Unsupervised Hyperspectral Image Restoration Via Improved Diffusion Models [38.74983301496911]
Hyperspectral image (HSI) restoration aims at recovering clean images from degraded observations.
Existing model-based methods have limitations in accurately modeling the complex image characteristics.
This paper proposes an unsupervised HSI restoration framework with a pre-trained diffusion model (HIR-Diff).
arXiv Detail & Related papers (2024-02-24T17:15:05Z) - LIR: A Lightweight Baseline for Image Restoration [4.187190284830909]
The inherent characteristics of the Image Restoration task are often overlooked in many works.
We propose a Lightweight Baseline network for Image Restoration called LIR to efficiently restore the image and remove degradations.
Our LIR achieves the state-of-the-art Structural Similarity Index Measure (SSIM) and comparable performance to state-of-the-art models on Peak Signal-to-Noise Ratio (PSNR).
arXiv Detail & Related papers (2024-02-02T12:39:47Z) - Parameter Efficient Adaptation for Image Restoration with Heterogeneous Mixture-of-Experts [52.39959535724677]
We introduce an alternative solution to improve the generalization of image restoration models.
We propose AdaptIR, a Mixture-of-Experts (MoE) with a multi-branch design to capture local, global, and channel representation bases.
Our AdaptIR achieves stable performance on single-degradation tasks, and excels in hybrid-degradation tasks, fine-tuning only 0.6% of parameters for 8 hours.
arXiv Detail & Related papers (2023-12-12T14:27:59Z) - HAT: Hybrid Attention Transformer for Image Restoration [61.74223315807691]
Transformer-based methods have shown impressive performance in image restoration tasks, such as image super-resolution and denoising.
We propose a new Hybrid Attention Transformer (HAT) to activate more input pixels for better restoration.
Our HAT achieves state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-09-11T05:17:55Z) - GRAN: Ghost Residual Attention Network for Single Image Super Resolution [44.4178326950426]
This paper introduces Ghost Residual Attention Block (GRAB) groups to overcome the drawbacks of the standard convolutional operation.
The Ghost Module reveals information underlying intrinsic features by using linear operations in place of standard convolutions.
Experiments conducted on the benchmark datasets demonstrate the superior performance of our method in both qualitative and quantitative evaluations.
arXiv Detail & Related papers (2023-02-28T13:26:24Z) - Fourier Space Losses for Efficient Perceptual Image Super-Resolution [131.50099891772598]
We show that it is possible to improve the performance of a recently introduced efficient generator architecture solely with the application of our proposed loss functions.
We show that our losses' direct emphasis on the frequencies in Fourier-space significantly boosts the perceptual image quality.
The trained generator achieves results comparable to, and is 2.4x and 48x faster than, the state-of-the-art perceptual SR methods RankSRGAN and SRFlow, respectively.
arXiv Detail & Related papers (2021-06-01T20:34:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.