Semantic-CD: Remote Sensing Image Semantic Change Detection towards Open-vocabulary Setting
- URL: http://arxiv.org/abs/2501.06808v1
- Date: Sun, 12 Jan 2025 13:22:11 GMT
- Title: Semantic-CD: Remote Sensing Image Semantic Change Detection towards Open-vocabulary Setting
- Authors: Yongshuo Zhu, Lu Li, Keyan Chen, Chenyang Liu, Fugen Zhou, Zhenwei Shi,
- Abstract summary: Traditional change detection methods often face challenges in generalizing across semantic categories in practical scenarios.
We introduce a novel approach called Semantic-CD, specifically designed for semantic change detection in remote sensing images.
By utilizing CLIP's extensive vocabulary knowledge, our model enhances its ability to generalize across categories.
- Score: 19.663899648983417
- License:
- Abstract: Remote sensing image semantic change detection is a method used to analyze remote sensing images, aiming to identify areas of change as well as categorize these changes within images of the same location taken at different times. Traditional change detection methods often face challenges in generalizing across semantic categories in practical scenarios. To address this issue, we introduce a novel approach called Semantic-CD, specifically designed for semantic change detection in remote sensing images. This method incorporates the open vocabulary semantics from the vision-language foundation model, CLIP. By utilizing CLIP's extensive vocabulary knowledge, our model enhances its ability to generalize across categories and improves segmentation through fully decoupled multi-task learning, which includes both binary change detection and semantic change detection tasks. Semantic-CD consists of four main components: a bi-temporal CLIP visual encoder for extracting features from bi-temporal images, an open semantic prompter for creating semantic cost volume maps with open vocabulary, a binary change detection decoder for generating binary change detection masks, and a semantic change detection decoder for producing semantic labels. Experimental results on the SECOND dataset demonstrate that Semantic-CD achieves more accurate masks and reduces semantic classification errors, illustrating its effectiveness in applying semantic priors from vision-language foundation models to SCD tasks.
Related papers
- Detect Changes like Humans: Incorporating Semantic Priors for Improved Change Detection [41.80924135539708]
We propose a Semantic-Aware Change Detection network, namely SA-CDNet.
Inspired by the human visual paradigm, a novel dual-stream feature decoder is derived to distinguish changes.
We also design a single-temporal semantic pre-training strategy to enhance the semantic understanding of landscapes.
arXiv Detail & Related papers (2024-12-22T08:27:15Z) - Enhancing Perception of Key Changes in Remote Sensing Image Change Captioning [49.24306593078429]
We propose a novel framework for remote sensing image change captioning, guided by Key Change Features and Instruction-tuned (KCFI)
KCFI includes a ViTs encoder for extracting bi-temporal remote sensing image features, a key feature perceiver for identifying critical change areas, and a pixel-level change detection decoder.
To validate the effectiveness of our approach, we compare it against several state-of-the-art change captioning methods on the LEVIR-CC dataset.
arXiv Detail & Related papers (2024-09-19T09:33:33Z) - Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance [19.663899648983417]
We introduce a novel change captioning (CC) method based on the foundational knowledge and semantic guidance.
We validate the proposed method on the LEVIR-CC and LEVIR-CD datasets.
arXiv Detail & Related papers (2024-07-19T05:07:41Z) - Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning [71.14084801851381]
Change captioning aims to succinctly describe the semantic change between a pair of similar images.
Most existing methods directly capture the difference between them, which risk obtaining error-prone difference features.
We propose a distractors-immune representation learning network that correlates the corresponding channels of two image representations.
arXiv Detail & Related papers (2024-07-16T13:00:33Z) - MaskCD: A Remote Sensing Change Detection Network Based on Mask Classification [29.15203530375882]
Change (CD) from remote sensing (RS) images using deep learning has been widely investigated in the literature.
We propose MaskCD to detect changed areas by adaptively generating categorized masks from input image pairs.
It reconstructs the desired changed objects by decoding the pixel-wise representations into learnable mask proposals.
arXiv Detail & Related papers (2024-04-18T11:05:15Z) - Beyond One-to-One: Rethinking the Referring Image Segmentation [117.53010476628029]
Referring image segmentation aims to segment the target object referred by a natural language expression.
We propose a Dual Multi-Modal Interaction (DMMI) Network, which contains two decoder branches.
In the text-to-image decoder, text embedding is utilized to query the visual feature and localize the corresponding target.
Meanwhile, the image-to-text decoder is implemented to reconstruct the erased entity-phrase conditioned on the visual feature.
arXiv Detail & Related papers (2023-08-26T11:39:22Z) - Self-Pair: Synthesizing Changes from Single Source for Object Change
Detection in Remote Sensing Imagery [6.586756080460231]
We train a change detector using two spatially unrelated images with corresponding semantic labels such as building.
We show that manipulating the source image as an after-image is crucial to the performance of change detection.
Our method outperforms existing methods based on single-temporal supervision.
arXiv Detail & Related papers (2022-12-20T13:26:42Z) - Joint Spatio-Temporal Modeling for the Semantic Change Detection in
Remote Sensing Images [22.72105435238235]
We propose a Semantic Change (SCanFormer) to explicitly model the 'from-to' semantic transitions between the bi-temporal RSIss.
Then, we introduce a semantic learning scheme to leverage the Transformer-temporal constraints, which are coherent to the SCD task, to guide the learning of semantic changes.
The resulting network (SCanNet) outperforms the baseline method in terms of both detection of critical semantic changes and semantic consistency in the obtained bi-temporal results.
arXiv Detail & Related papers (2022-12-10T08:49:19Z) - SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical
Semantic Change [58.87961226278285]
This paper describes SChME, a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change.
SChME usesa model ensemble combining signals of distributional models (word embeddings) and wordfrequency models where each model casts a vote indicating the probability that a word sufferedsemantic change according to that feature.
arXiv Detail & Related papers (2020-12-02T23:56:34Z) - A Weakly Supervised Convolutional Network for Change Segmentation and
Classification [91.3755431537592]
We present W-CDNet, a novel weakly supervised change detection network that can be trained with image-level semantic labels.
W-CDNet can be trained with two different types of datasets, either containing changed image pairs only or a mixture of changed and unchanged image pairs.
arXiv Detail & Related papers (2020-11-06T20:20:45Z) - Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets new state-of-the-arts on all these settings, demonstrating well its efficacy and generalizability.
arXiv Detail & Related papers (2020-07-03T21:53:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.