IDET: Iterative Difference-Enhanced Transformers for High-Quality Change
Detection
- URL: http://arxiv.org/abs/2207.09240v1
- Date: Fri, 15 Jul 2022 07:40:29 GMT
- Title: IDET: Iterative Difference-Enhanced Transformers for High-Quality Change
Detection
- Authors: Rui Huang, Ruofei Wang, Qing Guo, Yuxiang Zhang, Wei Fan
- Abstract summary: Change detection (CD) aims to detect change regions within an image pair captured at different times.
We study CD from a new perspective, i.e., how to optimize the feature difference to highlight changes and suppress unchanged regions.
We propose a novel module denoted as iterative difference-enhanced transformers (IDET).
Our final CD method outperforms seven state-of-the-art methods on six large-scale datasets.
- Score: 16.507124958270694
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Change detection (CD) aims to detect change regions within an image
pair captured at different times, playing a significant role in diverse
real-world applications. Nevertheless, most existing works focus on designing
advanced network architectures that map the feature difference to the final
change map while ignoring the influence of the quality of the feature
difference. In this paper, we study CD from a new perspective, i.e., how to
optimize the feature difference to highlight changes and suppress unchanged
regions, and propose a novel module denoted as iterative difference-enhanced
transformers (IDET). IDET contains three transformers: two for extracting
long-range information from the two images and one for enhancing the feature
difference. Unlike the first two transformers, the third takes their outputs
as guidance to enhance the feature difference iteratively. To achieve more
effective refinement, we further propose multi-scale IDET-based change
detection, which uses multi-scale representations of the images for multiple
feature-difference refinements and a coarse-to-fine fusion strategy to combine
them. Our final CD method outperforms seven state-of-the-art methods on six
large-scale datasets under diverse application scenarios, demonstrating the
importance of feature-difference enhancement and the effectiveness of IDET.
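The abstract describes IDET only at a high level, so the following is a minimal PyTorch sketch of the iterative difference-enhancement idea it outlines: two transformers encode long-range context from each image, and a third transformer repeatedly refines the feature difference under their guidance. The token shapes, layer counts, iteration count, and the use of cross-attention as the guidance mechanism are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch of iterative difference enhancement (NOT the authors'
# exact IDET architecture). Features are flattened patch tokens (B, N, C);
# guidance injection and the iteration count K are assumptions.
import torch
import torch.nn as nn

class IDETSketch(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8, iters: int = 4):
        super().__init__()
        make_layer = lambda: nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True)
        # Two transformers extract long-range context, one per image.
        self.enc_t1 = nn.TransformerEncoder(make_layer(), num_layers=2)
        self.enc_t2 = nn.TransformerEncoder(make_layer(), num_layers=2)
        # The third transformer enhances the feature difference, guided by
        # the two encodings (modeled here as cross-attention).
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(dim) for _ in range(3))
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))
        self.iters = iters

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
        # f1, f2: (B, N, C) patch tokens from the bi-temporal image pair.
        c1, c2 = self.enc_t1(f1), self.enc_t2(f2)
        guidance = torch.cat([c1, c2], dim=1)     # (B, 2N, C) guidance tokens
        diff = torch.abs(f1 - f2)                 # initial feature difference
        for _ in range(self.iters):               # iterative refinement
            d = self.norm1(diff + self.self_attn(diff, diff, diff)[0])
            d = self.norm2(d + self.cross_attn(d, guidance, guidance)[0])
            diff = self.norm3(d + self.ffn(d))
        return diff                               # enhanced difference features

# Usage: tokens from a shared backbone at one scale.
f1, f2 = torch.randn(2, 196, 256), torch.randn(2, 196, 256)
change_feat = IDETSketch()(f1, f2)                # (2, 196, 256)
```

In the paper's multi-scale variant, this refinement would run on image representations at several scales, with a coarse-to-fine fusion combining the refined differences before the final change map.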
Related papers
- Enhancing Perception of Key Changes in Remote Sensing Image Change Captioning [49.24306593078429]
We propose KCFI, a novel framework for remote sensing image change captioning guided by key change features and instruction tuning.
KCFI includes a ViTs encoder for extracting bi-temporal remote sensing image features, a key feature perceiver for identifying critical change areas, and a pixel-level change detection decoder.
To validate the effectiveness of our approach, we compare it against several state-of-the-art change captioning methods on the LEVIR-CC dataset.
arXiv Detail & Related papers (2024-09-19T09:33:33Z)
- ChangeViT: Unleashing Plain Vision Transformers for Change Detection [3.582733645632794]
ChangeViT is a framework that adopts a plain ViT backbone to improve the detection of large-scale changes.
The framework achieves state-of-the-art performance on three popular high-resolution datasets.
arXiv Detail & Related papers (2024-06-18T17:59:08Z)
- TransY-Net: Learning Fully Transformer Networks for Change Detection of Remote Sensing Images [64.63004710817239]
We propose a novel Transformer-based learning framework named TransY-Net for remote sensing image CD.
It improves the feature extraction from a global view and combines multi-level visual features in a pyramid manner.
Our proposed method achieves a new state-of-the-art performance on four optical and two SAR image CD benchmarks.
arXiv Detail & Related papers (2023-10-22T07:42:19Z)
- Changes-Aware Transformer: Learning Generalized Changes Representation [56.917000244470174]
We propose a novel Changes-Aware Transformer (CAT) for refining difference features.
The generalized representation of various changes is learned straightforwardly in the difference feature space.
After refinement, the changed pixels in the difference feature space are closer to each other, which facilitates change detection.
arXiv Detail & Related papers (2023-09-24T12:21:57Z)
- Image Deblurring by Exploring In-depth Properties of Transformer [86.7039249037193]
We leverage deep features extracted from a pretrained vision transformer (ViT) to encourage recovered images to be sharp without sacrificing performance on quantitative metrics.
By comparing transformer features between the recovered image and the target, the pretrained transformer provides high-resolution, blur-sensitive semantic information.
One approach regards the features as vectors and computes the Euclidean discrepancy between the representations extracted from the recovered and target images.
arXiv Detail & Related papers (2023-03-24T14:14:25Z)
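As a rough illustration of the feature-space comparison the entry above describes, the sketch below computes a Euclidean (MSE) discrepancy between pretrained-ViT features of a recovered image and its sharp target. The choice of torchvision's vit_b_16 and the use of the final class-token feature are assumptions made for brevity, not the paper's exact recipe.

```python
# Hedged sketch: Euclidean discrepancy between pretrained-ViT features of a
# recovered image and its target (a perceptual-style loss term). vit_b_16
# and the class-token feature are illustrative choices, not the paper's.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

vit = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)
vit.heads = nn.Identity()           # expose the final feature instead of logits
vit.eval()
for p in vit.parameters():          # keep the ViT frozen; gradients flow only
    p.requires_grad_(False)         # through the recovered image

def vit_feature_loss(recovered: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Inputs: (B, 3, 224, 224), preprocessed as the ViT weights expect.
    with torch.no_grad():
        t_feat = vit(target)        # (B, 768) target features, no gradient
    r_feat = vit(recovered)         # gradients reach the deblurring network
    return torch.mean((r_feat - t_feat) ** 2)   # Euclidean (MSE) discrepancy
```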
- Multi-manifold Attention for Vision Transformers [12.862540139118073]
Vision Transformers are widely popular due to their state-of-the-art performance in several computer vision tasks.
This work proposes a novel attention mechanism, called multi-manifold multihead attention, to substitute the vanilla self-attention of a Transformer.
arXiv Detail & Related papers (2022-07-18T12:53:53Z)
- Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z)
- AdaViT: Adaptive Vision Transformers for Efficient Image Recognition [78.07924262215181]
We introduce AdaViT, an adaptive framework that learns to derive usage policies on which patches, self-attention heads and transformer blocks to use.
Our method obtains more than a 2x improvement in efficiency compared to state-of-the-art vision transformers, with only a 0.8% drop in accuracy.
arXiv Detail & Related papers (2021-11-30T18:57:02Z)
- DASNet: Dual attentive fully convolutional siamese networks for change detection of high resolution satellite images [17.839181739760676]
The research objective is to identify the change information of interest and filter out irrelevant changes as interference factors.
Recently, the rise of deep learning has provided new tools for change detection, which have yielded impressive results.
We propose a new method, namely, dual attentive fully convolutional Siamese networks (DASNet) for change detection in high-resolution images.
arXiv Detail & Related papers (2020-03-07T16:57:10Z)
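To make the Siamese change-detection pattern behind the DASNet entry concrete, here is a minimal sketch: a weight-shared convolutional encoder processes both images, a simple channel-attention block stands in for DASNet's dual attention (whose exact form is not given here), and a per-pixel feature distance serves as the change map. Everything below is a generic illustration, not DASNet's architecture.

```python
# Generic Siamese change-detection sketch (NOT DASNet's exact design):
# shared conv encoder + channel attention + per-pixel feature distance.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Squeeze-and-excitation-style reweighting; an illustrative stand-in
    # for DASNet's dual attention.
    def __init__(self, ch: int, r: int = 8):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(),
                                nn.Linear(ch // r, ch), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.fc(x.mean(dim=(2, 3)))           # (B, C) channel weights
        return x * w[:, :, None, None]

class SiameseCD(nn.Module):
    def __init__(self, ch: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(             # shared by both images
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        self.attn = ChannelAttention(ch)

    def forward(self, img1: torch.Tensor, img2: torch.Tensor) -> torch.Tensor:
        f1 = self.attn(self.encoder(img1))
        f2 = self.attn(self.encoder(img2))
        return torch.norm(f1 - f2, dim=1)         # (B, H/2, W/2) distance map

# Usage: threshold the distance map to obtain a binary change mask.
t1, t2 = torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256)
change_map = SiameseCD()(t1, t2)                  # larger values = more change
```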