HyRet-Change: A hybrid retentive network for remote sensing change detection
- URL: http://arxiv.org/abs/2506.12836v1
- Date: Sun, 15 Jun 2025 13:14:55 GMT
- Title: HyRet-Change: A hybrid retentive network for remote sensing change detection
- Authors: Mustansar Fiaz, Mubashir Noman, Hiyam Debary, Kamran Ali, Hisham Cholakkal,
- Abstract summary: We propose a Siamese-based framework, called HyRet-Change, which can seamlessly integrate the merits of convolution and retention mechanisms.<n>Specifically, we introduce a novel feature difference module to exploit both convolutions and multi-head retention mechanisms.<n>We perform experiments on three challenging CD datasets and achieve state-of-the-art performance compared to existing methods.
- Score: 14.46707519278272
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently convolution and transformer-based change detection (CD) methods provide promising performance. However, it remains unclear how the local and global dependencies interact to effectively alleviate the pseudo changes. Moreover, directly utilizing standard self-attention presents intrinsic limitations including governing global feature representations limit to capture subtle changes, quadratic complexity, and restricted training parallelism. To address these limitations, we propose a Siamese-based framework, called HyRet-Change, which can seamlessly integrate the merits of convolution and retention mechanisms at multi-scale features to preserve critical information and enhance adaptability in complex scenes. Specifically, we introduce a novel feature difference module to exploit both convolutions and multi-head retention mechanisms in a parallel manner to capture complementary information. Furthermore, we propose an adaptive local-global interactive context awareness mechanism that enables mutual learning and enhances discrimination capability through information exchange. We perform experiments on three challenging CD datasets and achieve state-of-the-art performance compared to existing methods. Our source code is publicly available at https://github.com/mustansarfiaz/HyRect-Change.
Related papers
- Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy [56.424032454461695]
We present Dita, a scalable framework that leverages Transformer architectures to directly denoise continuous action sequences.<n>Dita employs in-context conditioning -- enabling fine-grained alignment between denoised actions and raw visual tokens from historical observations.<n>Dita effectively integrates cross-embodiment datasets across diverse camera perspectives, observation scenes, tasks, and action spaces.
arXiv Detail & Related papers (2025-03-25T15:19:56Z) - Extended Cross-Modality United Learning for Unsupervised Visible-Infrared Person Re-identification [34.93081601924748]
Unsupervised learning aims to learn modality-invariant features from unlabeled cross-modality datasets.<n>Existing methods lack cross-modality clustering or excessively pursue cluster-level association.<n>We propose Extended Cross-Modality United Learning (ECUL) framework, incorporating Extended Modality-Camera Clustering (EMCC) and Two-Step Memory Updating Strategy (TSMem) modules.
arXiv Detail & Related papers (2024-12-26T09:30:26Z) - Online Relational Inference for Evolving Multi-agent Interacting Systems [14.275434303742328]
Online Inference (ORI) is designed to efficiently identify hidden interaction graphs in evolving multi-agent interacting systems.
Unlike traditional offline methods that rely on a fixed training set, ORI employs online backpropagation, updating the model with each new data point.
A key innovation is the use of an adjacency matrix as a trainable parameter, optimized through a new adaptive learning technique called AdaRelation.
arXiv Detail & Related papers (2024-11-03T05:43:55Z) - PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE enhances global feature representation of point cloud masked autoencoders by making them both discriminative and sensitive to transformations.<n>We propose a novel loss that explicitly penalizes invariant collapse, enabling the network to capture richer transformation cues while preserving discriminative representations.
arXiv Detail & Related papers (2024-09-24T07:57:21Z) - ChangeBind: A Hybrid Change Encoder for Remote Sensing Change Detection [16.62779899494721]
Change detection (CD) is a fundamental task in remote sensing (RS) which aims to detect the semantic changes between the same geographical regions at different time stamps.
We propose an effective Siamese-based framework to encode the semantic changes occurring in the bi-temporal RS images.
arXiv Detail & Related papers (2024-04-26T17:47:14Z) - ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection [65.59969454655996]
We propose an efficient change detection framework, ELGC-Net, which leverages rich contextual information to precisely estimate change regions.
Our proposed ELGC-Net sets a new state-of-the-art performance in remote sensing change detection benchmarks.
We also introduce ELGC-Net-LW, a lighter variant with significantly reduced computational complexity, suitable for resource-constrained settings.
arXiv Detail & Related papers (2024-03-26T17:46:25Z) - ICAFusion: Iterative Cross-Attention Guided Feature Fusion for
Multispectral Object Detection [25.66305300362193]
A novel feature fusion framework of dual cross-attention transformers is proposed to model global feature interaction.
This framework enhances the discriminability of object features through the query-guided cross-attention mechanism.
The proposed method achieves superior performance and faster inference, making it suitable for various practical scenarios.
arXiv Detail & Related papers (2023-08-15T00:02:10Z) - Surface EMG-Based Inter-Session/Inter-Subject Gesture Recognition by
Leveraging Lightweight All-ConvNet and Transfer Learning [17.535392299244066]
Gesture recognition using low-resolution instantaneous HD-sEMG images opens up new avenues for the development of more fluid and natural muscle-computer interfaces.
The data variability between inter-session and inter-subject scenarios presents a great challenge.
Existing approaches employed very large and complex deep ConvNet or 2SRNN-based domain adaptation methods to approximate the distribution shift caused by these inter-session and inter-subject data variability.
We propose a lightweight All-ConvNet+TL model that leverages lightweight All-ConvNet and transfer learning (TL) for the enhancement of inter-session and inter-subject gesture recognition
arXiv Detail & Related papers (2023-05-13T21:47:55Z) - Fast and Slow Learning of Recurrent Independent Mechanisms [80.38910637873066]
We propose a training framework in which the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks.
An attention mechanism dynamically selects which modules can be adapted to the current task.
We find that meta-learning the modular aspects of the proposed system greatly helps in achieving faster adaptation in a reinforcement learning setup.
arXiv Detail & Related papers (2021-05-18T17:50:32Z) - Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z) - Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal
Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.