Pixel-Level Change Detection Pseudo-Label Learning for Remote Sensing Change Captioning
- URL: http://arxiv.org/abs/2312.15311v2
- Date: Tue, 21 May 2024 13:28:09 GMT
- Title: Pixel-Level Change Detection Pseudo-Label Learning for Remote Sensing Change Captioning
- Authors: Chenyang Liu, Keyan Chen, Zipeng Qi, Haotian Zhang, Zhengxia Zou, Zhenwei Shi
- Abstract summary: Methods for Remote Sensing Image Change Captioning (RSICC) perform well in simple scenes but exhibit poorer performance in complex scenes.
We believe pixel-level CD is significant for describing the differences between images through language.
Our method achieves state-of-the-art performance, and experiments validate that learning pixel-level CD pseudo-labels significantly contributes to change captioning.
- Score: 28.3763053922823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The existing methods for Remote Sensing Image Change Captioning (RSICC) perform well in simple scenes but exhibit poorer performance in complex scenes. This limitation is primarily attributed to the model's constrained visual ability to distinguish and locate changes. Acknowledging the inherent correlation between change detection (CD) and RSICC tasks, we believe pixel-level CD is significant for describing the differences between images through language. Regrettably, the current RSICC dataset lacks readily available pixel-level CD labels. To address this deficiency, we leverage a model trained on existing CD datasets to derive CD pseudo-labels. We propose an innovative network with an auxiliary CD branch, supervised by pseudo-labels. Furthermore, a semantic fusion augment (SFA) module is proposed to fuse the feature information extracted by the CD branch, thereby facilitating the nuanced description of changes. Experiments demonstrate that our method achieves state-of-the-art performance and validate that learning pixel-level CD pseudo-labels significantly contributes to change captioning. Our code will be available at: https://github.com/Chen-Yang-Liu/Pix4Cap
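The pseudo-label idea described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the pretrained CD model outputs a per-pixel change probability, binarizes that output at a hypothetical threshold of 0.5 to form the pseudo-label, and adds a binary cross-entropy term for the auxiliary CD branch to the captioning loss with a hypothetical weight `cd_weight`. All function names, the loss choice, and the toy values are illustrative.

```python
import math


def make_pseudo_label(change_probs, threshold=0.5):
    """Binarize per-pixel change probabilities from a pretrained CD model.

    The threshold is an assumed value, not one taken from the paper.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in change_probs]


def binary_cross_entropy(pred_probs, labels, eps=1e-7):
    """Mean per-pixel binary cross-entropy against the pseudo-labels."""
    total, count = 0.0, 0
    for pred_row, label_row in zip(pred_probs, labels):
        for p, y in zip(pred_row, label_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            count += 1
    return total / count


def joint_loss(caption_loss, cd_branch_probs, pseudo_label, cd_weight=1.0):
    """Captioning loss plus a weighted auxiliary CD loss (assumed form)."""
    return caption_loss + cd_weight * binary_cross_entropy(cd_branch_probs, pseudo_label)


# Toy 2x2 example: output of a pretrained CD model on one image pair.
teacher_probs = [[0.9, 0.2], [0.6, 0.1]]
pseudo_label = make_pseudo_label(teacher_probs)

# Predictions of the captioning network's auxiliary CD branch.
branch_probs = [[0.8, 0.3], [0.4, 0.2]]
loss = joint_loss(caption_loss=2.0, cd_branch_probs=branch_probs,
                  pseudo_label=pseudo_label)

print(pseudo_label)  # [[1, 0], [1, 0]]
print(loss)
```

Because the pseudo-labels come from a separate model, they are noisy; the weight on the auxiliary term controls how strongly that noise can influence the captioning objective.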
Related papers
- Enhancing Perception of Key Changes in Remote Sensing Image Change Captioning [49.24306593078429]
We propose a novel framework for remote sensing image change captioning, guided by Key Change Features and Instruction-tuned (KCFI).
KCFI includes a ViTs encoder for extracting bi-temporal remote sensing image features, a key feature perceiver for identifying critical change areas, and a pixel-level change detection decoder.
To validate the effectiveness of our approach, we compare it against several state-of-the-art change captioning methods on the LEVIR-CC dataset.
arXiv Detail & Related papers (2024-09-19T09:33:33Z)
- Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance [19.663899648983417]
We introduce a novel change captioning (CC) method based on the foundational knowledge and semantic guidance.
We validate the proposed method on the LEVIR-CC and LEVIR-CD datasets.
arXiv Detail & Related papers (2024-07-19T05:07:41Z)
- MaskCD: A Remote Sensing Change Detection Network Based on Mask Classification [29.15203530375882]
Change detection (CD) from remote sensing (RS) images using deep learning has been widely investigated in the literature.
We propose MaskCD to detect changed areas by adaptively generating categorized masks from input image pairs.
It reconstructs the desired changed objects by decoding the pixel-wise representations into learnable mask proposals.
arXiv Detail & Related papers (2024-04-18T11:05:15Z)
- TransY-Net: Learning Fully Transformer Networks for Change Detection of Remote Sensing Images [64.63004710817239]
We propose a novel Transformer-based learning framework named TransY-Net for remote sensing image CD.
It improves the feature extraction from a global view and combines multi-level visual features in a pyramid manner.
Our proposed method achieves a new state-of-the-art performance on four optical and two SAR image CD benchmarks.
arXiv Detail & Related papers (2023-10-22T07:42:19Z)
- Changes-Aware Transformer: Learning Generalized Changes Representation [56.917000244470174]
We propose a novel Changes-Aware Transformer (CAT) for refining difference features.
The generalized representation of various changes is learned straightforwardly in the difference feature space.
After refinement, the changed pixels in the difference feature space are closer to each other, which facilitates change detection.
arXiv Detail & Related papers (2023-09-24T12:21:57Z)
- Exploring Effective Priors and Efficient Models for Weakly-Supervised Change Detection [9.229278131265124]
Weakly-supervised change detection (WSCD) aims to detect pixel-level changes with only image-level annotations.
We propose two components: a Dilated Prior (DP) decoder and a Label Gated (LG) constraint.
Our proposed TransWCD and TransWCD-DL achieve significant +6.33% and +9.55% F1 score improvements over the state-of-the-art methods on the WHU-CD dataset.
arXiv Detail & Related papers (2023-07-20T13:16:10Z)
- Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation [58.03255076119459]
We address the task of weakly-supervised few-shot image classification and segmentation by leveraging a Vision Transformer (ViT).
Our proposed method takes token representations from the self-supervised ViT and leverages their correlations, via self-attention, to produce classification and segmentation predictions.
Experiments on Pascal-5i and COCO-20i demonstrate significant performance gains in a variety of supervision settings.
arXiv Detail & Related papers (2023-07-07T06:16:43Z)
- Revisiting Consistency Regularization for Semi-supervised Change Detection in Remote Sensing Images [60.89777029184023]
We propose a semi-supervised CD model in which we formulate an unsupervised CD loss in addition to the supervised Cross-Entropy (CE) loss.
Experiments conducted on two publicly available CD datasets show that the proposed semi-supervised CD method can reach closer to the performance of supervised CD.
arXiv Detail & Related papers (2022-04-18T17:59:01Z)
- A Weakly Supervised Convolutional Network for Change Segmentation and Classification [91.3755431537592]
We present W-CDNet, a novel weakly supervised change detection network that can be trained with image-level semantic labels.
W-CDNet can be trained with two different types of datasets, either containing changed image pairs only or a mixture of changed and unchanged image pairs.
arXiv Detail & Related papers (2020-11-06T20:20:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.