Near-Infrared and Low-Rank Adaptation of Vision Transformers in Remote Sensing
- URL: http://arxiv.org/abs/2405.17901v1
- Date: Tue, 28 May 2024 07:24:07 GMT
- Title: Near-Infrared and Low-Rank Adaptation of Vision Transformers in Remote Sensing
- Authors: Irem Ulku, O. Ozgur Tanriover, Erdem Akagündüz
- Abstract summary: Plant health can be monitored dynamically using multispectral sensors that measure Near-Infrared reflectance (NIR).
Despite this potential, obtaining and annotating high-resolution NIR images poses a significant challenge for training deep neural networks.
This study investigates the potential benefits of using vision transformer (ViT) backbones pre-trained in the RGB domain, with low-rank adaptation for downstream tasks in the NIR domain.
- Score: 3.2088888904556123
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Plant health can be monitored dynamically using multispectral sensors that measure Near-Infrared reflectance (NIR). Despite this potential, obtaining and annotating high-resolution NIR images poses a significant challenge for training deep neural networks. Typically, large networks pre-trained on the RGB domain are fine-tuned on infrared images. This practice introduces a domain shift issue because of the differing visual traits between RGB and NIR images. As an alternative to fine-tuning, a method called low-rank adaptation (LoRA) enables more efficient training by optimizing rank-decomposition matrices while keeping the original network weights frozen. However, existing parameter-efficient adaptation strategies for remote sensing images focus on RGB images and overlook domain shift issues in the NIR domain. Therefore, this study investigates the potential benefits of using vision transformer (ViT) backbones pre-trained in the RGB domain, with low-rank adaptation for downstream tasks in the NIR domain. Extensive experiments demonstrate that employing LoRA with pre-trained ViT backbones yields the best performance for downstream tasks applied to NIR images.
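The abstract describes LoRA as optimizing rank-decomposition matrices while the original weights stay frozen. A minimal sketch of that idea for a single linear layer, in plain Python, might look as follows. It assumes the common LoRA convention of a randomly initialized down-projection `A`, a zero-initialized up-projection `B`, and an `alpha / rank` scaling; the class name `LoRALinear`, the helper `lora_param_count`, and the rank and dimension values are illustrative assumptions, not the configuration used in the paper.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_param_count(d_in, d_out, rank):
    """Number of trainable values in A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


class LoRALinear:
    """Hypothetical linear layer: frozen W0 plus a trainable low-rank update B @ A."""

    def __init__(self, W0, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W0), len(W0[0])
        self.W0 = W0                    # pre-trained weights, never updated
        self.scale = alpha / rank       # assumed scaling convention
        # A starts small and random, B starts at zero, so the update is
        # exactly zero before any training step.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.W0, x)                    # frozen path
        update = matvec(self.B, matvec(self.A, x))   # low-rank path
        return [b + self.scale * u for b, u in zip(base, update)]


# A 3x3 identity matrix stands in for a pre-trained weight matrix.
W0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
layer = LoRALinear(W0, rank=2)
print(layer.forward([1.0, 2.0, 3.0]))   # matches the frozen layer at initialization

# Illustrative sizes only: a 768 x 768 projection with rank 4.
print(lora_param_count(768, 768, 4), "trainable vs", 768 * 768, "frozen")
```

Only `A` and `B` would receive gradient updates during adaptation, which is why the trainable parameter count grows with the rank rather than with the full size of the weight matrix.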
Related papers
- Retinex-RAWMamba: Bridging Demosaicing and Denoising for Low-Light RAW Image Enhancement [71.13353154514418]
Low-light image enhancement, particularly in cross-domain tasks such as mapping from the raw domain to the sRGB domain, remains a significant challenge.
We present a novel Mamba scanning mechanism, called RAWMamba, to effectively handle raw images with different CFAs.
We also present a Retinex Decomposition Module (RDM) grounded in Retinex prior, which decouples illumination from reflectance to facilitate more effective denoising and automatic non-linear exposure correction.
arXiv Detail & Related papers (2024-09-11T06:12:03Z) - Towards RGB-NIR Cross-modality Image Registration and Beyond [21.475871648254564]
This paper focuses on the area of RGB(visible)-NIR(near-infrared) cross-modality image registration.
We first present the RGB-NIR Image Registration (RGB-NIR-IRegis) benchmark, which, for the first time, enables fair and comprehensive evaluations.
We then design several metrics to reveal the toxic impact of inconsistent local features between visible and infrared images on the model performance.
arXiv Detail & Related papers (2024-05-30T10:25:50Z) - NIR-Assisted Image Denoising: A Selective Fusion Approach and A Real-World Benchmark Dataset [53.79524776100983]
Leveraging near-infrared (NIR) images to assist visible RGB image denoising shows the potential to address this issue.
Existing works still struggle with taking advantage of NIR information effectively for real-world image denoising.
We propose an efficient Selective Fusion Module (SFM), which can be plug-and-played into the advanced denoising networks.
arXiv Detail & Related papers (2024-04-12T14:54:26Z) - Multi-scale Progressive Feature Embedding for Accurate NIR-to-RGB Spectral Domain Translation [6.580484964018551]
We introduce a domain translation module that translates NIR source images into the grayscale target domain.
By incorporating a progressive training strategy, the statistical and semantic knowledge from both task domains are efficiently aligned.
Experiments show that our MPFNet outperforms state-of-the-art counterparts by 2.55 dB in the NIR-to-RGB spectral domain translation task.
arXiv Detail & Related papers (2023-12-26T13:07:45Z) - Tensor Factorization for Leveraging Cross-Modal Knowledge in Data-Constrained Infrared Object Detection [22.60228799622782]
A key bottleneck in object detection in IR images is the lack of sufficient labeled training data.
We seek to leverage cues from the RGB modality to scale object detectors to the IR modality, while preserving model performance in the RGB modality.
We first pretrain these factor matrices on the RGB modality, for which plenty of training data are assumed to exist and then augment only a few trainable parameters for training on the IR modality to avoid over-fitting.
arXiv Detail & Related papers (2023-09-28T16:55:52Z) - Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection [95.84616822805664]
We introduce CNNs-assisted Transformer architecture and propose a novel RGB-D SOD network with Point-aware Interaction and CNN-induced Refinement.
To alleviate the block effect and detail destruction that the Transformer naturally introduces, we design a CNN-induced refinement (CNNR) unit for content refinement and supplementation.
arXiv Detail & Related papers (2023-08-17T11:57:49Z) - Infrared Small-Dim Target Detection with Transformer under Complex Backgrounds [155.388487263872]
We propose a new infrared small-dim target detection method with the transformer.
We adopt the self-attention mechanism of the transformer to learn the interaction information of image features in a larger range.
We also design a feature enhancement module to learn more features of small-dim targets.
arXiv Detail & Related papers (2021-09-29T12:23:41Z) - Infrared Image Super-Resolution via Heterogeneous Convolutional WGAN [4.6667021835430145]
We present a framework that employs heterogeneous kernel-based super-resolution Wasserstein GAN (HetSRWGAN) for IR image super-resolution.
HetSRWGAN achieves consistently better performance in both qualitative and quantitative evaluations.
arXiv Detail & Related papers (2021-09-02T14:01:05Z) - TBNet: Two-Stream Boundary-aware Network for Generic Image Manipulation Localization [49.521622399483846]
We propose a novel end-to-end two-stream boundary-aware network (abbreviated as TBNet) for generic image manipulation localization.
The proposed TBNet can significantly outperform state-of-the-art generic image manipulation localization methods in terms of both MCC and F1.
arXiv Detail & Related papers (2021-08-10T08:22:05Z) - Generation of the NIR spectral Band for Satellite Images with Convolutional Neural Networks [0.0]
Deep neural networks allow generating artificial spectral information, such as for the image colorization problem.
We study the generative adversarial network (GAN) approach in the task of the NIR band generation using just RGB channels of high-resolution satellite imagery.
arXiv Detail & Related papers (2021-06-13T15:14:57Z) - MobileSal: Extremely Efficient RGB-D Salient Object Detection [62.04876251927581]
This paper introduces a novel network, MobileSal, which focuses on efficient RGB-D salient object detection (SOD).
We propose an implicit depth restoration (IDR) technique to strengthen the feature representation capability of mobile networks for RGB-D SOD.
With IDR and CPR incorporated, MobileSal performs favorably against state-of-the-art methods on seven challenging RGB-D SOD datasets.
arXiv Detail & Related papers (2020-12-24T04:36:42Z)