SpectralX: Parameter-efficient Domain Generalization for Spectral Remote Sensing Foundation Models
- URL: http://arxiv.org/abs/2508.01731v1
- Date: Sun, 03 Aug 2025 12:14:38 GMT
- Title: SpectralX: Parameter-efficient Domain Generalization for Spectral Remote Sensing Foundation Models
- Authors: Yuxiang Zhang, Wei Li, Mengmeng Zhang, Jiawei Han, Ran Tao, Shunlin Liang,
- Abstract summary: SpectralX is an innovative parameter-efficient fine-tuning framework that adapt existing RSFMs as backbone.<n>We develop an Attribute-oriented Mixture of Adapter (AoMoA) that dynamically aggregates multi-attribute expert knowledge.<n>By iteratively querying low-level semantic features with high-level representations, the model learns to focus on task-beneficial attributes.
- Score: 27.717648142854696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in Remote Sensing Foundation Models (RSFMs) have led to significant breakthroughs in the field. While many RSFMs have been pretrained with massive optical imagery, more multispectral/hyperspectral data remain lack of the corresponding foundation models. To leverage the advantages of spectral imagery in earth observation, we explore whether existing RSFMs can be effectively adapted to process diverse spectral modalities without requiring extensive spectral pretraining. In response to this challenge, we proposed SpectralX, an innovative parameter-efficient fine-tuning framework that adapt existing RSFMs as backbone while introducing a two-stage training approach to handle various spectral inputs, thereby significantly improving domain generalization performance. In the first stage, we employ a masked-reconstruction task and design a specialized Hyper Tokenizer (HyperT) to extract attribute tokens from both spatial and spectral dimensions. Simultaneously, we develop an Attribute-oriented Mixture of Adapter (AoMoA) that dynamically aggregates multi-attribute expert knowledge while performing layer-wise fine-tuning. With semantic segmentation as downstream task in the second stage, we insert an Attribute-refined Adapter (Are-adapter) into the first stage framework. By iteratively querying low-level semantic features with high-level representations, the model learns to focus on task-beneficial attributes, enabling customized adjustment of RSFMs. Following this two-phase adaptation process, SpectralX is capable of interpreting spectral imagery from new regions or seasons. The codes will be available from the website: https://github.com/YuxiangZhang-BIT.
Related papers
- SpectrumFM: A New Paradigm for Spectrum Cognition [65.65474629224558]
We propose a spectrum foundation model, termed SpectrumFM, which provides a new paradigm for spectrum cognition.<n>An innovative spectrum encoder that exploits the convolutional neural networks is proposed to effectively capture both fine-grained local signal structures and high-level global dependencies in the spectrum data.<n>Two novel self-supervised learning tasks, namely masked reconstruction and next-slot signal prediction, are developed for pre-training SpectrumFM, enabling the model to learn rich and transferable representations.
arXiv Detail & Related papers (2025-08-02T14:40:50Z) - AuxDet: Auxiliary Metadata Matters for Omni-Domain Infrared Small Target Detection [58.67129770371016]
We propose a novel IRSTD framework that reimagines the IRSTD paradigm by incorporating textual metadata for scene-aware optimization.<n>AuxDet consistently outperforms state-of-the-art methods, validating the critical role of auxiliary information in improving robustness and accuracy.
arXiv Detail & Related papers (2025-05-21T07:02:05Z) - Dual-Domain Masked Image Modeling: A Self-Supervised Pretraining Strategy Using Spatial and Frequency Domain Masking for Hyperspectral Data [35.34526230299484]
We propose a self-supervised pretraining strategy for hyperspectral data that utilizes the large portion of unlabeled data.<n>Our method introduces a novel dual-domain masking mechanism that operates in both spatial and frequency domains.<n>We evaluate our approach on three publicly available HSI classification benchmarks and demonstrate that it achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-05-06T06:24:21Z) - SpectrumFM: A Foundation Model for Intelligent Spectrum Management [99.08036558911242]
Existing intelligent spectrum management methods, typically based on small-scale models, suffer from notable limitations in recognition accuracy, convergence speed, and generalization.<n>This paper proposes a novel spectrum foundation model, termed SpectrumFM, establishing a new paradigm for spectrum management.<n>Experiments demonstrate that SpectrumFM achieves superior performance in terms of accuracy, robustness, adaptability, few-shot learning efficiency, and convergence speed.
arXiv Detail & Related papers (2025-05-02T04:06:39Z) - Spectral-Adaptive Modulation Networks for Visual Perception [9.912286808419205]
We use graph spectral analysis to theoretically simulate and compare the frequency responses of 2D convolution and self-attention.<n>Our results corroborate previous empirical findings and reveal that node connectivity, modulated by window size, is a key factor in shaping spectral functions.<n>Based on SPAM, we develop SPANetV2 as a novel vision backbone.
arXiv Detail & Related papers (2025-03-31T10:53:42Z) - SpectralGPT: Spectral Remote Sensing Foundation Model [60.023956954916414]
A universal RS foundation model, named SpectralGPT, is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT)
Compared to existing foundation models, SpectralGPT accommodates input images with varying sizes, resolutions, time series, and regions in a progressive training fashion, enabling full utilization of extensive RS big data.
Our evaluation highlights significant performance improvements with pretrained SpectralGPT models, signifying substantial potential in advancing spectral RS big data applications within the field of geoscience.
arXiv Detail & Related papers (2023-11-13T07:09:30Z) - DiffSpectralNet : Unveiling the Potential of Diffusion Models for
Hyperspectral Image Classification [6.521187080027966]
We propose a new network called DiffSpectralNet, which combines diffusion and transformer techniques.
First, we use an unsupervised learning framework based on the diffusion model to extract both high-level and low-level spectral-spatial features.
The diffusion method is capable of extracting diverse and meaningful spectral-spatial features, leading to improvement in HSI classification.
arXiv Detail & Related papers (2023-10-29T15:26:37Z) - Dynamic Spectrum Mixer for Visual Recognition [17.180863898764194]
We propose a content-adaptive yet computationally efficient structure, dubbed Dynamic Spectrum Mixer (DSM)
DSM represents token interactions in the frequency domain by employing the Cosine Transform.
It can learn long-term spatial dependencies with log-linear complexity.
arXiv Detail & Related papers (2023-09-13T04:51:15Z) - MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral
Reconstruction [148.26195175240923]
We propose a novel Transformer-based method, Multi-stage Spectral-wise Transformer (MST++) for efficient spectral reconstruction.
In the NTIRE 2022 Spectral Reconstruction Challenge, our approach won the First place.
arXiv Detail & Related papers (2022-04-17T02:39:32Z) - Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image
Reconstruction [127.20208645280438]
Hyperspectral image (HSI) reconstruction aims to recover the 3D spatial-spectral signal from a 2D measurement.
Modeling the inter-spectra interactions is beneficial for HSI reconstruction.
Mask-guided Spectral-wise Transformer (MST) proposes a novel framework for HSI reconstruction.
arXiv Detail & Related papers (2021-11-15T16:59:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.