Related papers: Graph Aggregation Prototype Learning for Semantic Change Detection in Remote Sensing

URL: http://arxiv.org/abs/2507.10938v1
Date: Tue, 15 Jul 2025 03:03:29 GMT
Title: Graph Aggregation Prototype Learning for Semantic Change Detection in Remote Sensing
Authors: Zhengyi Xu, Haoran Wu, Wen Jiang, Jie Geng
Abstract summary: We propose graph aggregation prototype learning for semantic change detection in remote sensing. Our method achieves state-of-the-art performance, with significant improvements in accuracy and robustness on the SCD task.
Score: 11.262559117458304
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Semantic change detection (SCD) extends the binary change detection task to provide not only the change locations but also the detailed "from-to" categories in multi-temporal remote sensing data. Such detailed semantic insights into changes offer considerable advantages for a wide array of applications. However, since SCD involves the simultaneous optimization of multiple tasks, the model is prone to negative transfer due to task-specific learning difficulties and conflicting gradient flows. To address this issue, we propose Graph Aggregation Prototype Learning for Semantic Change Detection in remote sensing (GAPL-SCD). In this framework, a multi-task joint optimization method is designed to optimize the primary tasks of semantic segmentation and change detection, along with the auxiliary task of graph aggregation prototype learning. Adaptive weight allocation and gradient rotation methods are used to alleviate the conflict between training tasks and improve multi-task learning capabilities. Specifically, the graph aggregation prototype learning module constructs an interaction graph using high-level features. Prototypes serve as class proxies, enabling category-level domain alignment across time points and reducing interference from irrelevant changes. Additionally, the proposed self-query multi-level feature interaction and bi-temporal feature fusion modules further enhance multi-scale feature representation, improving performance in complex scenes. Experimental results on the SECOND and Landsat-SCD datasets demonstrate that our method achieves state-of-the-art performance, with significant improvements in accuracy and robustness on the SCD task.
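
Illustration: the "from-to" output that SCD adds on top of binary change detection can be pictured with a small sketch. This is not the GAPL-SCD model or any code from the paper; it only shows, under assumed conventions, how two per-date semantic label maps yield both a change mask and a transition code per pixel. The function name `from_to_map`, the encoding `before * num_classes + after`, the `-1` marker for unchanged pixels, and the class names are all hypothetical choices made for this example.

```python
from typing import List


def from_to_map(before: List[List[int]],
                after: List[List[int]],
                num_classes: int) -> List[List[int]]:
    """Encode each changed pixel as before * num_classes + after.

    Unchanged pixels are marked with -1 (an assumed convention).
    """
    result = []
    for row_before, row_after in zip(before, after):
        result.append([
            b * num_classes + a if b != a else -1
            for b, a in zip(row_before, row_after)
        ])
    return result


# Hypothetical 2x3 label maps: 0 = water, 1 = vegetation, 2 = building.
before = [[0, 1, 1],
          [2, 2, 0]]
after = [[0, 2, 1],
         [2, 1, 1]]

transitions = from_to_map(before, after, num_classes=3)

# The binary change mask falls out of the transition codes.
change_mask = [[code != -1 for code in row] for row in transitions]

print(transitions)   # [[-1, 5, -1], [-1, 7, 1]]
print(change_mask)   # [[False, True, False], [False, True, True]]
```

Here code 5 reads as "vegetation to building" (1 * 3 + 2) and code 7 as "building to vegetation" (2 * 3 + 1). A binary change detector would report only the mask in the last line.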

Related papers

DINO-CoDT: Multi-class Collaborative Detection and Tracking with Vision Foundation Models [11.34839442803445]
We propose a multi-class collaborative detection and tracking framework tailored for diverse road users. We first present a detector with a global spatial attention fusion (GSAF) module, enhancing multi-scale feature learning for objects of varying sizes. Next, we introduce a tracklet RE-IDentification (REID) module that leverages visual semantics with a vision foundation model to effectively reduce ID SWitch (IDSW) errors.
arXiv Detail & Related papers (2025-06-09T02:49:10Z)
SChanger: Change Detection from a Semantic Change and Spatial Consistency Perspective [0.6749750044497732]
We develop a fine-tuning strategy called the Semantic Change Network (SCN) to address the data scarcity issue. We observe that the locations of changes between the two images are spatially identical, a concept we refer to as spatial consistency. This enhances the modeling of multi-scale changes and helps capture underlying relationships in change detection semantics.
arXiv Detail & Related papers (2025-03-26T17:15:43Z)
Semi-supervised Semantic Segmentation for Remote Sensing Images via Multi-scale Uncertainty Consistency and Cross-Teacher-Student Attention [59.19580789952102]
This paper proposes a novel semi-supervised Multi-Scale Uncertainty and Cross-Teacher-Student Attention (MUCA) model for RS image semantic segmentation tasks. MUCA constrains the consistency among feature maps at different layers of the network by introducing a multi-scale uncertainty consistency regularization. MUCA utilizes a Cross-Teacher-Student attention mechanism to guide the student network to construct more discriminative feature representations.
arXiv Detail & Related papers (2025-01-18T11:57:20Z)
HANet: A Hierarchical Attention Network for Change Detection With Bitemporal Very-High-Resolution Remote Sensing Images [6.890268321645873]
We propose a progressive foreground-balanced sampling strategy that adds no change information. This strategy helps the model accurately learn the features of the changed pixels during the early training process. We also design a discriminative Siamese network, the hierarchical attention network (HANet), which can integrate multiscale features and refine detailed features.
arXiv Detail & Related papers (2024-04-14T08:01:27Z)
Remote Sensing Image Change Detection with Graph Interaction [1.8579693774597708]
We propose a bitemporal image graph interaction network for remote sensing change detection, namely BGINet-CD. Our model demonstrates superior performance compared to other state-of-the-art (SOTA) methods on the GZ CD dataset.
arXiv Detail & Related papers (2023-07-05T03:32:49Z)
Task-Aware Asynchronous Multi-Task Model with Class Incremental Contrastive Learning for Surgical Scene Understanding [17.80234074699157]
A multi-task learning model is proposed for surgical report generation and tool-tissue interaction prediction. The model consists of a shared feature extractor, a mesh-transformer branch for captioning, and a graph attention branch for tool-tissue interaction prediction. We incorporate a task-aware asynchronous MTL optimization technique to fine-tune the shared weights and converge both tasks optimally.
arXiv Detail & Related papers (2022-11-28T14:08:48Z)
Multi-task Over-the-Air Federated Learning: A Non-Orthogonal Transmission Approach [52.85647632037537]
We propose a multi-task over-the-air federated learning (MOAFL) framework, where multiple learning tasks share edge devices for data collection and learning models under the coordination of an edge server (ES). Both the convergence analysis and numerical results demonstrate that the MOAFL framework can significantly reduce the uplink bandwidth consumption of multiple tasks without causing substantial learning performance degradation.
arXiv Detail & Related papers (2021-06-27T13:09:32Z)
Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting. We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions. Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first-stage Matching-FCOS network and a second-stage Structure-Aware Relation Module. We also propose novel training strategies that effectively improve detection performance. Our method consistently exceeds the state-of-the-art one-shot performance on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
Self-Guided Adaptation: Progressive Representation Alignment for Domain Adaptive Object Detection [86.69077525494106]
Unsupervised domain adaptation (UDA) has achieved unprecedented success in improving the cross-domain robustness of object detection models. Existing UDA methods largely ignore the instantaneous data distribution during model learning, which could deteriorate the feature representation given large domain shift. We propose a Self-Guided Adaptation (SGA) model, which targets aligning feature representations and transferring object detection models across domains.
arXiv Detail & Related papers (2020-03-19T13:30:45Z)
Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer [116.46533207849619]
We study the impact of scale and location mismatch in the few-shot learning scenario. We propose a novel Spatially-aware Matching scheme to effectively perform matching across multiple scales and locations.
arXiv Detail & Related papers (2020-01-06T14:10:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.