Changes-Aware Transformer: Learning Generalized Changes Representation
- URL: http://arxiv.org/abs/2309.13619v1
- Date: Sun, 24 Sep 2023 12:21:57 GMT
- Title: Changes-Aware Transformer: Learning Generalized Changes Representation
- Authors: Dan Wang, Licheng Jiao, Jie Chen, Shuyuan Yang, Fang Liu
- Abstract summary: We propose a novel Changes-Aware Transformer (CAT) for refining difference features.
The generalized representation of various changes is learned straightforwardly in the difference feature space.
After refinement, the changed pixels in the difference feature space are closer to each other, which facilitates change detection.
- Score: 56.917000244470174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Difference features obtained by comparing the images of two periods play an
indispensable role in the change detection (CD) task. However, a pair of
bi-temporal images can exhibit diverse changes, which may cause various
difference features. Identifying changed pixels with different difference features
as the same category is thus a challenge for CD. Most current methods
acquire distinctive difference features in implicit ways like enhancing image
representation or supervision information. Nevertheless, informative image
features only guarantee that object semantics are modeled and cannot guarantee that
changed pixels have similar semantics in the difference feature space and are
distinct from those unchanged ones. In this work, the generalized
representation of various changes is learned straightforwardly in the
difference feature space, and a novel Changes-Aware Transformer (CAT) for
refining difference features is proposed. This generalized representation can
perceive which pixels are changed and which are unchanged and further guide the
update of pixels' difference features. CAT effectively accomplishes this
refinement process through the stacked cosine cross-attention layer and
self-attention layer. After refinement, the changed pixels in the difference
feature space are closer to each other, which facilitates change detection. In
addition, CAT is compatible with various backbone networks and existing CD
methods. Experiments on a remote sensing CD data set and a street scene CD data set
show that our method achieves state-of-the-art performance and has excellent
generalization.
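The abstract names cosine cross-attention as a building block but gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes difference features are element-wise absolute differences of per-pixel features, and it uses a single hypothetical `change_token` as the query; which side supplies queries in CAT, the temperature value, and all names here are illustrative assumptions. Pure standard-library Python is used to keep the example self-contained.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to (approximately) unit length; zero vectors stay zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_cross_attention(queries, keys, values, temperature=0.1):
    """Each query attends over the keys using cosine similarity / temperature.

    Cosine similarity replaces the scaled dot product of standard attention.
    The temperature is an assumed hyperparameter, not a value from the paper.
    """
    q_hat = [l2_normalize(q) for q in queries]
    k_hat = [l2_normalize(k) for k in keys]
    dim = len(values[0])
    outputs = []
    for q in q_hat:
        scores = [sum(a * b for a, b in zip(q, k)) / temperature for k in k_hat]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


# Toy per-pixel features for two acquisition times (3 pixels, 2 channels).
feat_t1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
feat_t2 = [[1.0, 0.0], [0.0, 3.0], [3.0, 1.0]]

# Assumption: difference features are element-wise absolute differences.
diff = [[abs(a - b) for a, b in zip(p1, p2)] for p1, p2 in zip(feat_t1, feat_t2)]

# Hypothetical change token that pools the pixel difference features.
change_token = [[1.0, 1.0]]
pooled = cosine_cross_attention(change_token, diff, diff)
print(pooled)
```

In this toy input the first pixel is unchanged, so its difference feature is the zero vector and it receives almost no attention weight; the pooled vector is dominated by the two changed pixels.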
Related papers
- Enhancing Perception of Key Changes in Remote Sensing Image Change Captioning [49.24306593078429]
We propose a novel framework for remote sensing image change captioning, guided by Key Change Features and Instruction-tuned (KCFI).
KCFI includes a ViT encoder for extracting bi-temporal remote sensing image features, a key feature perceiver for identifying critical change areas, and a pixel-level change detection decoder.
To validate the effectiveness of our approach, we compare it against several state-of-the-art change captioning methods on the LEVIR-CC dataset.
arXiv Detail & Related papers (2024-09-19T09:33:33Z) - Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning [71.14084801851381]
Change captioning aims to succinctly describe the semantic change between a pair of similar images.
Most existing methods directly capture the difference between them, which risks obtaining error-prone difference features.
We propose a distractors-immune representation learning network that correlates the corresponding channels of two image representations.
arXiv Detail & Related papers (2024-07-16T13:00:33Z) - Context-aware Difference Distilling for Multi-change Captioning [106.72151597074098]
Multi-change captioning aims to describe complex and coupled changes within an image pair in natural language.
We propose a novel context-aware difference distilling network to capture all genuine changes for yielding sentences.
arXiv Detail & Related papers (2024-05-31T14:07:39Z) - Pixel-Level Change Detection Pseudo-Label Learning for Remote Sensing Change Captioning [28.3763053922823]
Methods for Remote Sensing Image Change Captioning (RSICC) perform well in simple scenes but exhibit poorer performance in complex scenes.
We believe pixel-level CD is significant for describing the differences between images through language.
Our method achieves state-of-the-art performance and validates that learning pixel-level CD pseudo-labels significantly contributes to change captioning.
arXiv Detail & Related papers (2023-12-23T17:58:48Z) - Align, Perturb and Decouple: Toward Better Leverage of Difference Information for RSI Change Detection [24.249552791014644]
Change detection is a widely adopted technique in remote sensing imagery (RSI) analysis.
We propose a series of operations to fully exploit the difference information: Alignment, Perturbation and Decoupling.
arXiv Detail & Related papers (2023-05-30T03:39:53Z) - Learning Transformations To Reduce the Geometric Shift in Object Detection [60.20931827772482]
We tackle geometric shifts emerging from variations in the image capture process.
We introduce a self-training approach that learns a set of geometric transformations to minimize these shifts.
We evaluate our method on two different shifts, i.e., a camera's field of view (FoV) change and a viewpoint change.
arXiv Detail & Related papers (2023-01-13T11:55:30Z) - IDET: Iterative Difference-Enhanced Transformers for High-Quality Change Detection [16.507124958270694]
Change detection (CD) aims to detect change regions within an image pair captured at different times.
We study the CD from a new perspective, i.e., how to optimize the feature difference to highlight changes and suppress unchanged regions.
We propose a novel module denoted as iterative difference-enhanced transformers (IDET).
Our final CD method outperforms seven state-of-the-art methods on six large-scale datasets.
arXiv Detail & Related papers (2022-07-15T07:40:29Z) - HIPA: Hierarchical Patch Transformer for Single Image Super Resolution [62.7081074931892]
This paper presents HIPA, a novel Transformer architecture that progressively recovers the high resolution image using a hierarchical patch partition.
We build a cascaded model that processes an input image in multiple stages, where we start with tokens with small patch sizes and gradually merge to the full resolution.
Such a hierarchical patch mechanism not only explicitly enables feature aggregation at multiple resolutions but also adaptively learns patch-aware features for different image regions.
arXiv Detail & Related papers (2022-03-19T05:09:34Z) - Two-Phase Object-Based Deep Learning for Multi-temporal SAR Image Change Detection [23.2069257991734]
Change detection is one of the fundamental applications of synthetic aperture radar (SAR) images.
Speckle noise present in SAR images has a strongly negative effect on change detection.
A two-phase object-based deep learning approach is proposed for multi-temporal SAR image change detection.
arXiv Detail & Related papers (2020-01-17T11:51:35Z)