Bridging Sensor Gaps via Attention Gated Tuning for Hyperspectral Image Classification
- URL: http://arxiv.org/abs/2309.12865v3
- Date: Thu, 25 Jul 2024 08:32:15 GMT
- Title: Bridging Sensor Gaps via Attention Gated Tuning for Hyperspectral Image Classification
- Authors: Xizhe Xue, Haokui Zhang, Zongwen Bai, Ying Li
- Abstract summary: HSI classification methods require high-quality labeled HSIs, which are often costly to obtain.
We propose a novel Attention-Gated Tuning (AGT) strategy and a triplet-structured transformer model, Tri-Former, to address this issue.
- Score: 9.82907639745345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-hungry HSI classification methods require high-quality labeled HSIs, which are often costly to obtain. This limits the performance of data-driven methods when annotated samples are scarce. Bridging the domain gap between data acquired from different sensors allows us to utilize abundant labeled data across sensors to break this bottleneck. In this paper, we propose a novel Attention-Gated Tuning (AGT) strategy and a triplet-structured transformer model, Tri-Former, to address this issue. The AGT strategy serves as a bridge, allowing us to leverage existing labeled HSI datasets, and even RGB datasets, to enhance performance on new HSI datasets with limited samples. Instead of inserting additional parameters inside the basic model, we train a lightweight auxiliary branch that takes intermediate features from the basic model as input and makes predictions. The proposed AGT resolves conflicts between heterogeneous and even cross-modal data by suppressing disturbing information and enhancing useful information through a soft gate. Additionally, we introduce Tri-Former, a triplet-structured transformer with a spectral-spatial separation design that improves parameter utilization and computational efficiency, enabling easier and more flexible fine-tuning. Comparison experiments on three representative HSI datasets captured by different sensors demonstrate that the proposed Tri-Former achieves better performance than several state-of-the-art methods. Homologous, heterologous, and cross-modal tuning experiments verified the effectiveness of the proposed AGT. Code has been released at: \href{https://github.com/Cecilia-xue/AGT}{https://github.com/Cecilia-xue/AGT}.
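The soft-gating idea described in the abstract — a lightweight auxiliary branch that takes an intermediate backbone feature, scales each channel by a learned sigmoid gate to suppress disturbing information, and then predicts — can be sketched as follows. This is not the authors' released code; it is a minimal NumPy illustration in which the shapes, the single-layer gate, and the single-layer classification head are all illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_gated_branch(feat, gate_w, gate_b, head_w, head_b):
    """feat: (d,) intermediate feature from the frozen basic model.
    Returns class logits from the lightweight auxiliary branch."""
    gate = sigmoid(gate_w @ feat + gate_b)  # per-channel soft gate in (0, 1)
    gated = gate * feat                     # softly suppress or pass channels
    return head_w @ gated + head_b          # auxiliary classifier logits

# Toy usage with random (untrained) parameters.
rng = np.random.default_rng(0)
d, n_classes = 8, 3
feat = rng.normal(size=d)
logits = attention_gated_branch(
    feat,
    gate_w=rng.normal(size=(d, d)) * 0.1, gate_b=np.zeros(d),
    head_w=rng.normal(size=(n_classes, d)) * 0.1, head_b=np.zeros(n_classes),
)
print(logits.shape)
```

Under this reading, only the gate and head parameters are trained, which matches the abstract's claim that no additional parameters are inserted inside the basic model itself.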
Related papers
- Data Augmentation for Traffic Classification [54.92823760790628]
Data Augmentation (DA) is a technique widely adopted in Computer Vision (CV) and Natural Language Processing (NLP) tasks.
DA has struggled to gain traction in networking contexts, particularly in Traffic Classification (TC) tasks.
arXiv Detail & Related papers (2024-01-19T15:25:09Z) - Hint-Aug: Drawing Hints from Foundation Vision Transformers Towards Boosted Few-Shot Parameter-Efficient Tuning [22.0296008705388]
We propose a framework called Hint-based Data Augmentation (Hint-Aug)
It aims to boost foundation vision transformers (FViTs) in few-shot tuning by augmenting the over-fitted parts of tuning samples with the learned features of pretrained FViTs.
Extensive experiments and ablation studies on five datasets and three parameter-efficient tuning techniques consistently validate Hint-Aug's effectiveness.
arXiv Detail & Related papers (2023-04-25T02:22:01Z) - ADS_UNet: A Nested UNet for Histopathology Image Segmentation [1.213915839836187]
We propose ADS_UNet, a stage-wise additive training algorithm that incorporates resource-efficient deep supervision in shallower layers.
We demonstrate that ADS_UNet outperforms state-of-the-art Transformer-based models by 1.08 and 0.6 points on CRAG and BCSS datasets.
arXiv Detail & Related papers (2023-04-10T13:08:48Z) - Hybrid Spectral Denoising Transformer with Guided Attention [34.34075175179669]
We present a Hybrid Spectral Denoising Transformer (HSDT) for hyperspectral image denoising.
Our HSDT significantly outperforms the existing state-of-the-art methods while maintaining low computational overhead.
arXiv Detail & Related papers (2023-03-16T02:24:31Z) - Towards Data-Efficient Detection Transformers [77.43470797296906]
We show that most detection transformers suffer significant performance drops on small datasets.
We empirically analyze the factors that affect data efficiency, through a step-by-step transition from a data-efficient RCNN variant to the representative DETR.
We introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency.
arXiv Detail & Related papers (2022-03-17T17:56:34Z) - Change Detection from Synthetic Aperture Radar Images via Graph-Based Knowledge Supplement Network [36.41983596642354]
We propose a Graph-based Knowledge Supplement Network (GKSNet) for image change detection.
To be more specific, we extract discriminative information from the existing labeled dataset as additional knowledge.
To validate the proposed method, we conducted extensive experiments on four SAR datasets.
arXiv Detail & Related papers (2022-01-22T02:50:50Z) - Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z) - Vision Transformers are Robust Learners [65.91359312429147]
We study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples.
We present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners.
arXiv Detail & Related papers (2021-05-17T02:39:22Z) - Hyperspectral Classification Based on Lightweight 3-D-CNN With Transfer Learning [67.40866334083941]
We propose an end-to-end 3-D lightweight convolutional neural network (CNN) for limited samples-based HSI classification.
Compared with conventional 3-D-CNN models, the proposed 3-D-LWNet has a deeper network structure, fewer parameters, and lower computational cost.
Our model achieves competitive performance for HSI classification compared to several state-of-the-art methods.
arXiv Detail & Related papers (2020-12-07T03:44:35Z) - SADet: Learning An Efficient and Accurate Pedestrian Detector [68.66857832440897]
This paper proposes a series of systematic optimization strategies for the detection pipeline of one-stage detector.
It forms a single shot anchor-based detector (SADet) for efficient and accurate pedestrian detection.
Though structurally simple, it achieves state-of-the-art results at a real-time speed of 20 FPS on VGA-resolution images.
arXiv Detail & Related papers (2020-07-26T12:32:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.