Optimal Transport Adapter Tuning for Bridging Modality Gaps in Few-Shot Remote Sensing Scene Classification
- URL: http://arxiv.org/abs/2503.14938v1
- Date: Wed, 19 Mar 2025 07:04:24 GMT
- Title: Optimal Transport Adapter Tuning for Bridging Modality Gaps in Few-Shot Remote Sensing Scene Classification
- Authors: Zhong Ji, Ci Liu, Jingren Liu, Chen Tang, Yanwei Pang, Xuelong Li
- Abstract summary: Few-Shot Remote Sensing Scene Classification (FS-RSSC) presents the challenge of classifying remote sensing images with limited labeled samples. We propose a novel Optimal Transport Adapter Tuning (OTAT) framework aimed at constructing an ideal Platonic representational space.
- Score: 80.83325513157637
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-Shot Remote Sensing Scene Classification (FS-RSSC) presents the challenge of classifying remote sensing images with limited labeled samples. Existing methods typically emphasize single-modal feature learning, neglecting the potential benefits of optimizing multi-modal representations. To address this limitation, we propose a novel Optimal Transport Adapter Tuning (OTAT) framework aimed at constructing an ideal Platonic representational space through optimal transport (OT) theory. This framework seeks to harmonize rich visual information with less dense textual cues, enabling effective cross-modal information transfer and complementarity. Central to this approach is the Optimal Transport Adapter (OTA), which employs a cross-modal attention mechanism to enrich textual representations and facilitate better subsequent information interaction. By transforming the network optimization into an OT optimization problem, OTA establishes efficient pathways for balanced information exchange between modalities. Moreover, we introduce a sample-level Entropy-Aware Weighted (EAW) loss, which combines difficulty-weighted similarity scores with entropy-based regularization. This loss function provides finer control over the OT optimization process, enhancing its solvability and stability. Our framework offers a scalable and efficient solution for advancing multimodal learning in remote sensing applications. Extensive experiments on benchmark datasets demonstrate that OTAT achieves state-of-the-art performance in FS-RSSC, significantly improving model performance and generalization.
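The abstract frames cross-modal alignment as an optimal transport problem with entropy-based regularization. As a rough illustration of that general idea only, the sketch below runs the standard Sinkhorn iteration on a cosine-distance cost between two feature sets. It is not the authors' OTA module or EAW loss: the function name `sinkhorn`, the feature shapes, the uniform marginals, and the value of `eps` are all assumptions chosen for the example.

```python
import numpy as np


def sinkhorn(cost, a, b, eps=0.5, n_iters=200):
    """Entropy-regularized OT plan between histograms a and b (generic Sinkhorn).

    cost: (n, m) pairwise cost matrix; a: (n,) and b: (m,) marginals summing to 1.
    eps is the entropic regularization strength (illustrative value).
    """
    K = np.exp(-cost / eps)              # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)                  # rescale to match row marginals
        v = b / (K.T @ u)                # rescale to match column marginals
    return u[:, None] * K * v[None, :]   # transport plan of shape (n, m)


# Hypothetical stand-ins for visual and textual embeddings (random data).
rng = np.random.default_rng(0)
visual = rng.normal(size=(6, 16))
text = rng.normal(size=(4, 16))
visual /= np.linalg.norm(visual, axis=1, keepdims=True)
text /= np.linalg.norm(text, axis=1, keepdims=True)

cost = 1.0 - visual @ text.T             # cosine distance between modalities
a = np.full(6, 1.0 / 6)                  # uniform mass over visual features
b = np.full(4, 1.0 / 4)                  # uniform mass over text features
plan = sinkhorn(cost, a, b)
```

Each entry `plan[i, j]` is the mass moved from visual feature `i` to text feature `j`. A larger `eps` gives a smoother, more diffuse plan, while a smaller one approaches the unregularized OT solution at the cost of slower, less stable iterations.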
Related papers
- Communication-Efficient Wireless Federated Fine-Tuning for Large-Scale AI Models [13.742950928229078]
Low-Rank Adaptation (LoRA) addresses these issues by training compact, low-rank matrices instead of fully fine-tuning large models.
This paper introduces a wireless federated LoRA fine-tuning framework that optimizes both learning performance and communication efficiency.
arXiv Detail & Related papers (2025-05-01T06:15:38Z)
- Efficient Federated Split Learning for Large Language Models over Communication Networks [14.461758448289908]
Fine-tuning pre-trained large language models (LLM) in a distributed manner poses significant challenges on resource-constrained edge devices.
We propose FedsLLM, a novel framework that integrates split federated learning with parameter-efficient fine-tuning techniques.
arXiv Detail & Related papers (2025-04-20T16:16:54Z)
- Continual Optimization with Symmetry Teleportation for Multi-Task Learning [73.28772872740744]
Multi-task learning (MTL) enables the simultaneous learning of multiple tasks using a single model.
We propose a novel approach based on Continual Optimization with Symmetry Teleportation (COST).
COST seeks an alternative loss-equivalent point on the loss landscape to reduce conflict gradients.
arXiv Detail & Related papers (2025-03-06T02:58:09Z)
- Joint Optimal Transport and Embedding for Network Alignment [66.49765320358361]
We propose a joint optimal transport and embedding framework for network alignment named JOENA. With a unified objective, the mutual benefits of both methods can be achieved by an alternating optimization schema with guaranteed convergence. Experiments on real-world networks validate the effectiveness and scalability of JOENA, achieving up to 16% improvement in MRR and 20x speedup.
arXiv Detail & Related papers (2025-02-26T17:28:08Z)
- Towards Explainable Evolution Strategies with Large Language Models [0.0]
This paper introduces an approach that integrates self-adaptive Evolution Strategies (ES) with Large Language Models (LLMs).
By employing a self-adaptive ES equipped with a restart mechanism, we effectively navigate the challenging landscapes of benchmark functions.
An LLM is then utilized to process these logs, generating concise, user-friendly summaries.
arXiv Detail & Related papers (2024-07-11T09:28:27Z)
- PRANCE: Joint Token-Optimization and Structural Channel-Pruning for Adaptive ViT Inference [44.77064952091458]
PRANCE is a Vision Transformer compression framework that jointly optimizes the activated channels and reduces tokens, based on the characteristics of inputs.
We introduce a novel "Result-to-Go" training mechanism that models ViTs' inference process as a sequential decision process.
Our framework is shown to be compatible with various token optimization techniques such as pruning, merging, and pruning-merging strategies.
arXiv Detail & Related papers (2024-07-06T09:04:27Z)
- Learning to Rebalance Multi-Modal Optimization by Adaptively Masking Subnetworks [13.065212096469537]
We propose a novel importance sampling-based, element-wise joint optimization method, called Adaptively Mask Subnetworks Considering Modal Significance (AMSS).
Specifically, we incorporate mutual information rates to determine the modal significance and employ non-uniform adaptive sampling to select foreground subnetworks from each modality for parameter updates.
Building upon theoretical insights, we further enhance the multi-modal mask subnetwork strategy using unbiased estimation, referred to as AMSS+.
arXiv Detail & Related papers (2024-04-12T09:22:24Z)
- Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework, the Adversarial Modality Modulation Network (AMMNet).
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z)
- Sample-Driven Federated Learning for Energy-Efficient and Real-Time IoT Sensing [22.968661040226756]
We introduce an online reinforcement learning algorithm named Sample-driven Control for Federated Learning (SCFL) built on the Soft Actor-Critic (SAC) framework.
SCFL enables the agent to dynamically adapt and find the global optima even in changing environments.
arXiv Detail & Related papers (2023-10-11T13:50:28Z)
- A Meta-Learning Based Precoder Optimization Framework for Rate-Splitting Multiple Access [53.191806757701215]
We propose the use of a meta-learning based precoder optimization framework to directly optimize the Rate-Splitting Multiple Access (RSMA) precoders with partial Channel State Information at the Transmitter (CSIT).
By exploiting the overfitting of the compact neural network to maximize the explicit Average Sum-Rate (ASR) expression, we effectively bypass the need for any other training data while minimizing the total running time.
Numerical results reveal that the meta-learning based solution achieves similar ASR performance to conventional precoder optimization in medium-scale scenarios, and significantly outperforms sub-optimal low complexity precoder algorithms in the large-scale
arXiv Detail & Related papers (2023-07-17T20:31:41Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.