Related papers: UniRoute: Unified Routing Mixture-of-Experts for Modality-Adaptive Remote Sensing Change Detection

UniRoute: Unified Routing Mixture-of-Experts for Modality-Adaptive Remote Sensing Change Detection

URL: http://arxiv.org/abs/2601.14797v1
Date: Wed, 21 Jan 2026 09:21:25 GMT
Title: UniRoute: Unified Routing Mixture-of-Experts for Modality-Adaptive Remote Sensing Change Detection
Authors: Qingling Shu, Sibao Chen, Wei Lu, Zhihui You, Chengzhuang Liu,
Abstract summary: UniRoute is a unified framework for modality-adaptive learning.<n>We introduce an Adaptive Receptive Field Routing MoE module to disentangle local spatial details from global semantic context.<n>We also propose a Consistency-Aware Self-Distillation strategy that stabilizes unified training under data-scarce heterogeneous settings.
Score: 6.323154336421137
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Current remote sensing change detection (CD) methods mainly rely on specialized models, which limits the scalability toward modality-adaptive Earth observation. For homogeneous CD, precise boundary delineation relies on fine-grained spatial cues and local pixel interactions, whereas heterogeneous CD instead requires broader contextual information to suppress speckle noise and geometric distortions. Moreover, difference operator (e.g., subtraction) works well for aligned homogeneous images but introduces artifacts in cross-modal or geometrically misaligned scenarios. Across different modality settings, specialized models based on static backbones or fixed difference operations often prove insufficient. To address this challenge, we propose UniRoute, a unified framework for modality-adaptive learning by reformulating feature extraction and fusion as conditional routing problems. We introduce an Adaptive Receptive Field Routing MoE (AR2-MoE) module to disentangle local spatial details from global semantic context, and a Modality-Aware Difference Routing MoE (MDR-MoE) module to adaptively select the most suitable fusion primitive at each pixel. In addition, we propose a Consistency-Aware Self-Distillation (CASD) strategy that stabilizes unified training under data-scarce heterogeneous settings by enforcing multi-level consistency. Extensive experiments on five public datasets demonstrate that UniRoute achieves strong overall performance, with a favorable accuracy-efficiency trade-off under a unified deployment setting.

Related papers

GRAD-Former: Gated Robust Attention-based Differential Transformer for Change Detection [0.7865560760233441]
Change detection (CD) in remote sensing aims to identify semantic differences between satellite images captured at different times.<n>Traditional transformer-based methods suffer from quadratic computational complexity when applied to very high-resolution (VHR) satellite images.<n>We present GRAD-Former, a novel framework that enhances contextual understanding while maintaining efficiency through reduced model size.
arXiv Detail & Related papers (2026-03-01T15:56:42Z)
AdaptOVCD: Training-Free Open-Vocabulary Remote Sensing Change Detection via Adaptive Information Fusion [17.998110109161683]
AdaptOVCD is a training-free Open-Vocabulary Change Detection architecture based on dual-dimensional multi-level information fusion.<n>The framework integrates multi-level information fusion across data, feature, and decision levels vertically while incorporating targeted adaptive designs horizontally.<n>It achieves 84.89% of the fully-supervised performance upper bound in cross-dataset evaluations and exhibits superior generalization capabilities.
arXiv Detail & Related papers (2026-02-06T09:30:23Z)
Beyond Weight Adaptation: Feature-Space Domain Injection for Cross-Modal Ship Re-Identification [3.6907522136316975]
Cross-Modality Ship Re-Identification (CMS Re-ID) is critical for achieving all-day and all-weather maritime target tracking.<n>We explore the potential of Vision Foundation Models (VFMs) in bridging modality gaps.<n>We propose a novel PEFT strategy termed Domain Representation Injection (DRI)
arXiv Detail & Related papers (2025-12-24T02:30:23Z)
Simulating Distribution Dynamics: Liquid Temporal Feature Evolution for Single-Domain Generalized Object Detection [58.25418970608328]
Single-Domain Generalized Object Detection (Single-DGOD) aims to transfer a detector trained on one source domain to multiple unknown domains.<n>Existing methods for Single-DGOD typically rely on discrete data augmentation or static perturbation methods to expand data diversity.<n>We propose a new method, which simulates the progressive evolution of features from the source domain to simulated latent distributions.
arXiv Detail & Related papers (2025-11-13T03:10:39Z)
UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation [104.59740403500132]
Multi-modal image segmentation faces real-world deployment challenges from incomplete/corrupted modalities degrading performance.<n>We propose a unified modality-relax segmentation network (UniMRSeg) through hierarchical self-supervised compensation (HSSC)<n>Our approach hierarchically bridges representation gaps between complete and incomplete modalities across input, feature and output levels.
arXiv Detail & Related papers (2025-09-19T17:29:25Z)
Graph-Based Uncertainty Modeling and Multimodal Fusion for Salient Object Detection [12.743278093269325]
We propose a dynamic uncertainty propagation and multimodal collaborative reasoning network (DUP-MCRNet)<n>DUGC is designed to propagate uncertainty between layers through a sparse graph constructed based on spatial semantic distance.<n>MCF uses learnable modality gating weights to weightedly fuse the attention maps of RGB, depth, and edge features.
arXiv Detail & Related papers (2025-08-28T04:31:48Z)
RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization [50.75654397516163]
We propose RelayFormer, a unified framework that adapts to varying resolutions and modalities.<n> RelayFormer partitions inputs into fixed-size sub-images and introduces Global-Local Relay (GLR) tokens.<n>This enables efficient exchange of global cues, such as semantic or temporal consistency, while preserving fine-grained manipulation artifacts.
arXiv Detail & Related papers (2025-08-13T03:35:28Z)
Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts [29.52183168979229]
We propose SMoEStereo, a novel framework that adapts VFMs for stereo matching through a tailored, scene-specific fusion of Low-Rank Adaptation (LoRA) and Mixture-of-Experts (MoE) modules.<n>Our method exhibits state-of-the-art cross-domain and joint generalization across multiple benchmarks without dataset-specific adaptation.
arXiv Detail & Related papers (2025-07-07T03:19:04Z)
SAMamba: Adaptive State Space Modeling with Hierarchical Vision for Infrared Small Target Detection [12.964308630328688]
Infrared small target detection (ISTD) is vital for long-range surveillance in military, maritime, and early warning applications.<n>ISTD is challenged by targets occupying less than 0.15% of the image and low distinguishability from complex backgrounds.<n>This paper presents SAMamba, a novel framework integrating SAM2's hierarchical feature learning with Mamba's selective sequence modeling.
arXiv Detail & Related papers (2025-05-29T07:55:23Z)
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image (RRSIS) is a new challenge that combines computer vision and natural language processing. Traditional Referring Image (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery. We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
Improving Anomaly Segmentation with Multi-Granularity Cross-Domain Alignment [17.086123737443714]
Anomaly segmentation plays a pivotal role in identifying atypical objects in images, crucial for hazard detection in autonomous driving systems. While existing methods demonstrate noteworthy results on synthetic data, they often fail to consider the disparity between synthetic and real-world data domains. We introduce the Multi-Granularity Cross-Domain Alignment framework, tailored to harmonize features across domains at both the scene and individual sample levels.
arXiv Detail & Related papers (2023-08-16T22:54:49Z)
Adaptive Spot-Guided Transformer for Consistent Local Feature Matching [64.30749838423922]
We propose Adaptive Spot-Guided Transformer (ASTR) for local feature matching. ASTR models the local consistency and scale variations in a unified coarse-to-fine architecture.
arXiv Detail & Related papers (2023-03-29T12:28:01Z)
Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains. We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.