CSD: Change Semantic Detection with only Semantic Change Masks for Damage Assessment in Conflict Zones
- URL: http://arxiv.org/abs/2511.19035v1
- Date: Mon, 24 Nov 2025 12:16:21 GMT
- Title: CSD: Change Semantic Detection with only Semantic Change Masks for Damage Assessment in Conflict Zones
- Authors: Kai Zhenga, Zhenkai Wu, Fupeng Wei, Miaolan Zhou, Kai Lie, Haitao Guo, Lei Ding, Wei Zhang, Hang-Cheng Dong
- Abstract summary: We introduce a pre-trained DINOv3 model and propose a multi-scale cross-attention difference siamese network (MC-DiSNet). The powerful visual representation capability of the DINOv3 backbone enables robust and rich feature extraction from bi-temporal remote sensing images. Unlike conventional semantic change detection (SCD), our approach eliminates the need for large-scale semantic annotations of bi-temporal images.
- Score: 3.640978477877182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurately and swiftly assessing damage from conflicts is crucial for humanitarian aid and regional stability. In conflict zones, damaged zones often share similar architectural styles, with damage typically covering small areas and exhibiting blurred boundaries. These characteristics lead to limited data, annotation difficulties, and significant recognition challenges, including high intra-class similarity and ambiguous semantic changes. To address these issues, we introduce a pre-trained DINOv3 model and propose a multi-scale cross-attention difference siamese network (MC-DiSNet). The powerful visual representation capability of the DINOv3 backbone enables robust and rich feature extraction from bi-temporal remote sensing images. We also release a new Gaza-Change dataset containing high-resolution satellite image pairs from 2023-2024 with pixel-level semantic change annotations. It is worth emphasizing that our annotations only include semantic pixels of changed areas. Unlike conventional semantic change detection (SCD), our approach eliminates the need for large-scale semantic annotations of bi-temporal images, instead focusing directly on the changed regions. We term this new task change semantic detection (CSD). The CSD task represents a direct extension of binary change detection (BCD). Due to the limited spatial extent of semantic regions, it presents greater challenges than traditional SCD tasks. We evaluated our method under the CSD framework on both the Gaza-Change and SECOND datasets. Experimental results demonstrate that our proposed approach effectively addresses the CSD task, and its outstanding performance paves the way for practical applications in rapid damage assessment across conflict zones.
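The abstract describes CSD labels as carrying semantic classes only on changed pixels, and calls the task a direct extension of BCD. A minimal sketch of that relationship follows, assuming a label encoding that the abstract does not specify: `0` for unchanged pixels and positive integers for hypothetical change classes. The function names are illustrative and are not taken from the paper.

```python
from typing import List

Mask = List[List[int]]

# Assumption: 0 marks "no change"; positive integers are hypothetical
# semantic classes that are assigned only to changed pixels.
UNCHANGED = 0


def csd_to_bcd(csd_mask: Mask) -> Mask:
    """Collapse a semantic change mask into a binary change mask."""
    return [[int(v != UNCHANGED) for v in row] for row in csd_mask]


def changed_fraction(csd_mask: Mask) -> float:
    """Share of pixels that carry any semantic change label."""
    total = sum(len(row) for row in csd_mask)
    changed = sum(v != UNCHANGED for row in csd_mask for v in row)
    return changed / total


# Toy 4x4 mask: three changed pixels across two hypothetical classes.
csd = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 0],
]
bcd = csd_to_bcd(csd)
fraction = changed_fraction(csd)
```

In this toy mask only 3 of 16 pixels carry a semantic label, which illustrates the limited spatial extent of semantic regions that the abstract cites as the reason CSD is harder than conventional SCD.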
Related papers
- Open-Vocabulary Domain Generalization in Urban-Scene Segmentation [83.15573353963235]
Domain Generalization in Semantic Segmentation (DG-SS) aims to enable segmentation models to perform robustly in unseen environments. Recent progress in Vision-Language Models (VLMs) has advanced Open-Vocabulary Semantic Segmentation (OV-SS) by enabling models to recognize a broader range of concepts. Yet, these models remain sensitive to domain shifts and struggle to maintain robustness when deployed in unseen environments. We propose S2-Corr, a state-space-driven text-image correlation refinement mechanism that produces more consistent text-image correlations under distribution changes.
arXiv Detail & Related papers (2026-02-21T14:32:27Z)
- Referring Change Detection in Remote Sensing Imagery [49.841833753558575]
We introduce Referring Change Detection (RCD), which leverages natural language prompts to detect specific classes of changes in remote sensing images. We propose a two-stage framework consisting of (I) RCDNet, a cross-modal fusion network designed for referring change detection, and (II) RCDGen, a diffusion-based synthetic data generation pipeline.
arXiv Detail & Related papers (2025-12-12T16:57:12Z)
- FoBa: A Foreground-Background co-Guided Method and New Benchmark for Remote Sensing Semantic Change Detection [48.06921153684768]
We present a new benchmark for remote sensing semantic change detection (SCD) called LevirSCD. The dataset covers 16 change categories and 210 specific change types, with more fine-grained class definitions. We propose a foreground-background co-guided SCD (FoBa) method, which leverages foregrounds enriched with contextual information to guide the model. FoBa achieves competitive results compared to current SOTA methods, with improvements of 1.48%, 3.61%, and 2.81% in the SeK metric, respectively.
arXiv Detail & Related papers (2025-09-19T09:19:57Z)
- BEVUDA++: Geometric-aware Unsupervised Domain Adaptation for Multi-View 3D Object Detection [56.477525075806966]
Vision-centric Bird's Eye View (BEV) perception holds considerable promise for autonomous driving. Recent studies have prioritized efficiency or accuracy enhancements, yet the issue of domain shift has been overlooked. We introduce an innovative geometric-aware teacher-student framework, BEVUDA++, to diminish this issue.
arXiv Detail & Related papers (2025-09-17T16:31:40Z)
- CEBSNet: Change-Excited and Background-Suppressed Network with Temporal Dependency Modeling for Bitemporal Change Detection [5.667475728935794]
Change detection is a critical task in remote sensing and computer vision. Current methods overlook temporal dependencies and overemphasize prominent changes. We introduce CEBSNet, a novel change-excited and background-suppressed network for change detection.
arXiv Detail & Related papers (2025-05-21T09:57:30Z)
- Towards Robust and Realistic Human Pose Estimation via WiFi Signals [85.60557095666934]
WiFi-based human pose estimation is a challenging task that bridges discrete and subtle WiFi signals to human skeletons. This paper revisits this problem and reveals two critical yet overlooked issues: 1) cross-domain gap, i.e., due to significant variations between source-target domain pose distributions; and 2) structural fidelity gap, i.e., predicted skeletal poses manifest distorted topology. This paper fills these gaps by reformulating the task into a novel two-phase framework dubbed DT-Pose: Domain-consistent representation learning and Topology-constrained Pose decoding.
arXiv Detail & Related papers (2025-01-16T09:38:22Z)
- ChangeBind: A Hybrid Change Encoder for Remote Sensing Change Detection [16.62779899494721]
Change detection (CD) is a fundamental task in remote sensing (RS) which aims to detect the semantic changes between the same geographical regions at different time stamps.
We propose an effective Siamese-based framework to encode the semantic changes occurring in the bi-temporal RS images.
arXiv Detail & Related papers (2024-04-26T17:47:14Z)
- Hard Region Aware Network for Remote Sensing Change Detection [44.269913858088614]
Change detection (CD) is essential for various real-world applications, such as urban management and disaster assessment.
This paper proposes a novel change detection network, termed as HRANet, which provides accurate change maps via hard region mining.
arXiv Detail & Related papers (2023-05-31T02:52:38Z)
- Hierarchical Disentanglement-Alignment Network for Robust SAR Vehicle Recognition [18.38295403066007]
HDANet integrates feature disentanglement and alignment into a unified framework.
The proposed method demonstrates impressive robustness across nine operating conditions in the MSTAR dataset.
arXiv Detail & Related papers (2023-04-07T09:11:29Z)
- BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR [52.78253400327191]
BDA-SketRet is a novel framework performing a bi-level domain adaptation for aligning the spatial and semantic features of the visual data pairs.
Experimental results on the extended Sketchy, TU-Berlin, and QuickDraw exhibit sharp improvements over the literature.
arXiv Detail & Related papers (2022-01-17T18:45:55Z)
- Domain Adaptive Semantic Segmentation with Regional Contrastive Consistency Regularization [19.279884432843822]
We propose a novel and fully end-to-end trainable approach, called regional contrastive consistency regularization (RCCR) for domain adaptive semantic segmentation.
Our core idea is to pull the similar regional features extracted from the same location of different images to be closer, and meanwhile push the features from the different locations of the two images to be separated.
arXiv Detail & Related papers (2021-10-11T11:45:00Z)
- Semantics-Guided Contrastive Network for Zero-Shot Object detection [67.61512036994458]
Zero-shot object detection (ZSD) is a new challenge in computer vision.
We develop ContrastZSD, a framework that brings a contrastive learning mechanism into the realm of zero-shot detection.
Our method outperforms the previous state-of-the-art on both ZSD and generalized ZSD tasks.
arXiv Detail & Related papers (2021-09-04T03:32:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.