Dual-Tasks Siamese Transformer Framework for Building Damage Assessment
- URL: http://arxiv.org/abs/2201.10953v1
- Date: Wed, 26 Jan 2022 14:11:16 GMT
- Title: Dual-Tasks Siamese Transformer Framework for Building Damage Assessment
- Authors: Hongruixuan Chen, Edoardo Nemni, Sofia Vallecorsa, Xi Li, Chen Wu,
Lars Bromley
- Abstract summary: We present the first attempt at designing a Transformer-based damage assessment architecture (DamFormer).
To the best of our knowledge, this is the first time such a deep Transformer-based network has been proposed for multitemporal remote sensing interpretation tasks.
- Score: 11.888964682446879
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate and fine-grained information about the extent of damage to buildings
is essential for humanitarian relief and disaster response. However, as the
most commonly used architecture in remote sensing interpretation tasks,
Convolutional Neural Networks (CNNs) have limited ability to model the
non-local relationship between pixels. Recently, Transformer architecture first
proposed for modeling long-range dependency in natural language processing has
shown promising results in computer vision tasks. Considering the frontier
advances of Transformer architecture in the computer vision field, in this
paper, we present the first attempt at designing a Transformer-based damage
assessment architecture (DamFormer). In DamFormer, a siamese Transformer
encoder is first constructed to extract non-local and representative deep
features from input multitemporal image-pairs. Then, a multitemporal fusion
module is designed to fuse information for downstream tasks. Finally, a
lightweight dual-tasks decoder aggregates multi-level features for final
prediction. To the best of our knowledge, this is the first time such a deep
Transformer-based network has been proposed for multitemporal remote sensing
interpretation tasks. The experimental results on the large-scale damage
assessment dataset xBD demonstrate the potential of the Transformer-based
architecture.
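The three-stage pipeline described above (weight-sharing siamese Transformer encoder, multitemporal fusion module, lightweight dual-tasks decoder) can be sketched as a toy forward pass. This is a minimal illustration only: the layer sizes, the concatenation-based fusion, and all class and variable names are assumptions for exposition, not the authors' actual DamFormer implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class SharedEncoder:
    """Weight-sharing ('siamese') encoder: the exact same projection is
    applied to both the pre- and post-disaster inputs."""
    def __init__(self, in_dim, out_dim):
        self.W = rng.standard_normal((in_dim, out_dim)) * 0.02

    def __call__(self, x):                 # x: (tokens, in_dim)
        return np.maximum(x @ self.W, 0)   # ReLU feature map

def fuse(pre_feat, post_feat):
    """Multitemporal fusion: here simply per-token concatenation of the
    two dates' features (the paper's fusion module is more involved)."""
    return np.concatenate([pre_feat, post_feat], axis=-1)

class DualTaskDecoder:
    """Two lightweight heads: building localization (1 logit per token)
    and damage classification (4 logits per token, matching xBD's
    four damage levels)."""
    def __init__(self, dim, n_damage_classes=4):
        self.W_loc = rng.standard_normal((dim, 1)) * 0.02
        self.W_cls = rng.standard_normal((dim, n_damage_classes)) * 0.02

    def __call__(self, f):
        return f @ self.W_loc, f @ self.W_cls

# Toy forward pass: 16 tokens (e.g. flattened image patches), 32-dim input.
enc = SharedEncoder(32, 64)
pre = rng.standard_normal((16, 32))
post = rng.standard_normal((16, 32))
loc_logits, cls_logits = DualTaskDecoder(128)(fuse(enc(pre), enc(post)))
print(loc_logits.shape, cls_logits.shape)   # (16, 1) (16, 4)
```

The point of the shared encoder is that pre- and post-disaster images are embedded into the same feature space, so the fusion module compares like with like.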
Related papers
- SiamixFormer: a fully-transformer Siamese network with temporal Fusion
for accurate building detection and change detection in bi-temporal remote
sensing images [0.0]
Building detection and change detection using remote sensing images can help urban and rescue planning.
Currently, most existing building detection models use only a single (pre-disaster) image to detect buildings.
In this paper, we propose a siamese model, called SiamixFormer, which uses pre- and post-disaster images as input.
arXiv Detail & Related papers (2022-08-01T07:35:45Z) - Defect Transformer: An Efficient Hybrid Transformer Architecture for
Surface Defect Detection [2.0999222360659604]
We propose an efficient hybrid transformer architecture, termed Defect Transformer (DefT), for surface defect detection.
DefT incorporates CNN and transformer into a unified model to capture local and non-local relationships collaboratively.
Experiments on three datasets demonstrate the superiority and efficiency of our method compared with other CNN- and transformer-based networks.
arXiv Detail & Related papers (2022-07-17T23:37:48Z) - Swin-Pose: Swin Transformer Based Human Pose Estimation [16.247836509380026]
Convolutional neural networks (CNNs) have been widely utilized in many computer vision tasks.
CNNs have a fixed receptive field and lack the ability of long-range perception, which is crucial to human pose estimation.
We propose a novel model based on transformer architecture, enhanced with a feature pyramid fusion structure.
arXiv Detail & Related papers (2022-01-19T02:15:26Z) - CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the detailed spatial information captured by CNNs with the global context provided by transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z) - PE-former: Pose Estimation Transformer [0.0]
We investigate the use of a pure transformer architecture for the problem of 2D body pose estimation.
We demonstrate that using an encoder-decoder transformer architecture yields state-of-the-art results on this estimation problem.
arXiv Detail & Related papers (2021-12-09T15:20:23Z) - ViDT: An Efficient and Effective Fully Transformer-based Object Detector [97.71746903042968]
Detection transformers are the first fully end-to-end learning systems for object detection.
Vision transformers are the first fully transformer-based architecture for image classification.
In this paper, we integrate Vision and Detection Transformers (ViDT) to build an effective and efficient object detector.
arXiv Detail & Related papers (2021-10-08T06:32:05Z) - Multi-Exit Vision Transformer for Dynamic Inference [88.17413955380262]
We propose seven different architectures for early exit branches that can be used for dynamic inference in Vision Transformer backbones.
We show that each one of our proposed architectures could prove useful in the trade-off between accuracy and speed.
arXiv Detail & Related papers (2021-06-29T09:01:13Z) - P2T: Pyramid Pooling Transformer for Scene Understanding [62.41912463252468]
Plugged with our pooling-based MHSA, we build a downstream-task-oriented transformer network, dubbed Pyramid Pooling Transformer (P2T).
arXiv Detail & Related papers (2021-06-22T18:28:52Z) - Transformers Solve the Limited Receptive Field for Monocular Depth
Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper that applies transformers to pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z) - Transformers in Vision: A Survey [101.07348618962111]
Transformers enable modeling long dependencies between input sequence elements and support parallel processing of sequences.
Transformers require minimal inductive biases for their design and are naturally suited as set-functions.
This survey aims to provide a comprehensive overview of the Transformer models in the computer vision discipline.
arXiv Detail & Related papers (2021-01-04T18:57:24Z)
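A recurring idea in the related papers above (e.g. P2T's pooling-based MHSA, and the hybrid CNN-transformer designs) is to shrink the key/value token set before self-attention so that the quadratic cost drops to roughly linear in the input length. The single-head sketch below illustrates the mechanism; the pooling ratios, omission of Q/K/V projections, and all names are simplifying assumptions rather than any specific paper's configuration.

```python
import numpy as np

def avg_pool_tokens(x, ratio):
    """Average-pool an (n, d) token sequence by `ratio`
    (n must be divisible by ratio)."""
    n, d = x.shape
    return x.reshape(n // ratio, ratio, d).mean(axis=1)

def pooled_attention(x, ratios=(2, 4, 8)):
    """Single-head self-attention whose keys/values come from a
    multi-ratio pooled 'pyramid' of the input tokens, making the cost
    O(n * m) with m << n rather than O(n^2). Q/K/V projections are
    omitted for brevity."""
    kv = np.concatenate([avg_pool_tokens(x, r) for r in ratios], axis=0)
    scores = x @ kv.T / np.sqrt(x.shape[1])            # (n, m)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)      # softmax over pooled tokens
    return weights @ kv                                # (n, d)

rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))   # 16 tokens, dimension 8
out = pooled_attention(tokens)
print(out.shape)   # (16, 8)
```

With 16 input tokens and ratios (2, 4, 8), the pooled key/value set has only 8 + 4 + 2 = 14 tokens, and the saving grows with sequence length; this is one reason such designs scale to dense, pixel-wise downstream tasks.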
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.