Exploring Consistency in Cross-Domain Transformer for Domain Adaptive
Semantic Segmentation
- URL: http://arxiv.org/abs/2211.14703v2
- Date: Tue, 29 Nov 2022 01:55:17 GMT
- Title: Exploring Consistency in Cross-Domain Transformer for Domain Adaptive
Semantic Segmentation
- Authors: Kaihong Wang and Donghyun Kim and Rogerio Feris and Kate Saenko and
Margrit Betke
- Abstract summary: The domain gap can cause discrepancies in self-attention.
Due to this gap, the transformer attends to spurious regions or pixels, which deteriorates accuracy on the target domain.
We propose adaptation on attention maps with cross-domain attention layers.
- Score: 51.10389829070684
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While transformers have greatly boosted performance in semantic segmentation,
domain adaptive transformers are not yet well explored. We identify that the
domain gap can cause discrepancies in self-attention. Due to this gap, the
transformer attends to spurious regions or pixels, which deteriorates accuracy
on the target domain. We propose to perform adaptation on attention maps with
cross-domain attention layers that share features between the source and the
target domains. Specifically, we impose consistency between predictions from
cross-domain attention and self-attention modules to encourage similar
distribution in the attention and output of the model across domains, i.e.,
attention-level and output-level alignment. We also enforce consistency in
attention maps between different augmented views to further strengthen the
attention-based alignment. Combining these two components, our method mitigates
the discrepancy in attention maps across domains and further boosts the
performance of the transformer under unsupervised domain adaptation settings.
Our model outperforms the existing state-of-the-art baseline model on three
widely used benchmarks, including GTAV-to-Cityscapes by 1.3 percentage points (pp),
Synthia-to-Cityscapes by 0.6 pp, and Cityscapes-to-ACDC by 1.1 pp, on average.
Additionally, we verify the effectiveness and generalizability of our method
through extensive experiments. Our code will be publicly available.
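To make the mechanism above concrete, the sketch below is a minimal, hedged reading of the idea: a shared attention layer that acts as self-attention when the context tokens equal the query tokens and as cross-domain attention when the context tokens come from the other domain, plus a consistency term between the two paths. The module and function names, the single-head formulation, and the specific KL/MSE terms are illustrative assumptions, not the authors' released code.

```python
# Illustrative sketch only: shared (self/cross-domain) attention plus
# attention-level and output-level consistency. Names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedAttention(nn.Module):
    """Single-head attention; pass the same tokens twice for self-attention,
    or tokens from the other domain as context for cross-domain attention."""

    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        self.scale = dim ** -0.5

    def forward(self, x_query, x_context):
        q = self.q(x_query)                            # (B, N, C)
        k, v = self.kv(x_context).chunk(2, dim=-1)     # (B, M, C) each
        attn = (q @ k.transpose(-2, -1)) * self.scale  # (B, N, M)
        attn = attn.softmax(dim=-1)
        return attn @ v, attn


def consistency_loss(logits_self, logits_cross, attn_self, attn_cross):
    """Encourage similar class predictions (output level) and similar
    attention maps (attention level) between the two paths; assumes the
    two paths attend over the same number of tokens."""
    output_term = F.kl_div(logits_cross.log_softmax(dim=1),
                           logits_self.softmax(dim=1),
                           reduction="batchmean")
    attention_term = F.mse_loss(attn_cross, attn_self)
    return output_term + attention_term
```

Under this reading, applying a similar consistency term to the attention maps of two augmented views of the same target image would give the augmentation-based alignment described in the abstract.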
Related papers
- Fine-Grained Unsupervised Cross-Modality Domain Adaptation for
Vestibular Schwannoma Segmentation [3.0081059328558624]
We introduce a fine-grained unsupervised framework for domain adaptation.
We propose using a vector to control the generator so that it synthesizes a fake image with given features.
The dataset can then be augmented in various ways by searching the feature dictionary.
arXiv Detail & Related papers (2023-11-25T18:08:59Z)
- PiPa: Pixel- and Patch-wise Self-supervised Learning for Domain
Adaptative Semantic Segmentation [100.6343963798169]
Unsupervised Domain Adaptation (UDA) aims to enhance the generalization of the learned model to other domains.
We propose a unified pixel- and patch-wise self-supervised learning framework, called PiPa, for domain adaptive semantic segmentation.
arXiv Detail & Related papers (2022-11-14T18:31:24Z)
- Domain Adaptation for Object Detection using SE Adaptors and Center Loss [0.0]
We introduce an unsupervised domain adaptation method built on Faster R-CNN to prevent performance drops caused by domain shift.
We also introduce a family of adaptation layers, called SE Adaptors, that leverage the squeeze-and-excitation mechanism to improve domain attention.
Finally, we incorporate a center loss on the instance- and image-level representations to reduce intra-class variance.
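For reference, a center loss of this kind typically pulls each feature toward a learnable center of its class. The sketch below is a hedged, generic formulation (the class count, feature dimension, and module name are placeholders), not this paper's implementation.

```python
# Hedged sketch of a standard center loss (not the authors' code): features are
# pulled toward a learnable per-class center, shrinking intra-class spread.
import torch
import torch.nn as nn


class CenterLoss(nn.Module):
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feats, labels):
        # feats: (N, feat_dim) instance-level features, labels: (N,) class ids
        return 0.5 * ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()
```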
arXiv Detail & Related papers (2022-05-25T17:18:31Z)
- Cross-Domain Object Detection with Mean-Teacher Transformer [43.486392965014105]
We propose an end-to-end cross-domain detection transformer based on mean-teacher knowledge transfer (MTKT).
We design three levels of source-target feature alignment strategies based on the Transformer architecture: domain query-based feature alignment (DQFA), bi-level graph-based prototype alignment (BGPA), and token-wise image feature alignment (TIFA).
Our method achieves state-of-the-art performance on three domain adaptation scenarios; in particular, the result on the Sim10k-to-Cityscapes scenario improves markedly, from 52.6 mAP to 57.9 mAP.
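For orientation, the mean-teacher component usually means that a teacher model's weights track an exponential moving average (EMA) of the student's weights. The snippet below is a generic, hedged sketch of that update (the momentum value and function name are assumptions), not the MTKT code.

```python
# Hedged sketch of a mean-teacher (EMA) weight update; momentum is illustrative.
import torch


@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               momentum: float = 0.999) -> None:
    # Teacher weights track an exponential moving average of student weights.
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)
```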
arXiv Detail & Related papers (2022-05-03T17:11:55Z)
- Towards Unsupervised Domain Adaptation via Domain-Transformer [0.0]
We propose the Domain-Transformer (DoT) for Unsupervised Domain Adaptation (UDA).
DoT integrates CNN backbones and the core attention mechanism of Transformers from a new perspective.
It achieves local semantic consistency across domains, where domain-level attention and manifold regularization are explored.
arXiv Detail & Related papers (2022-02-24T02:30:15Z)
- Domain Adaptive Semantic Segmentation with Regional Contrastive
Consistency Regularization [19.279884432843822]
We propose a novel and fully end-to-end trainable approach, called regional contrastive consistency regularization (RCCR) for domain adaptive semantic segmentation.
Our core idea is to pull together similar regional features extracted from the same location in different images, while pushing apart features extracted from different locations of the two images.
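Read this way, the objective resembles a region-level contrastive loss. The sketch below is an illustrative InfoNCE-style formulation under the assumption that region features from the two views are index-aligned; the temperature and function name are placeholders, not the RCCR implementation.

```python
# Illustrative region-level contrastive objective (not the RCCR code): regions
# at the same index in two views are positives, all other pairs are negatives.
import torch
import torch.nn.functional as F


def region_contrastive_loss(regions_a, regions_b, temperature=0.1):
    # regions_a, regions_b: (R, C) features for R index-aligned regions
    a = F.normalize(regions_a, dim=1)
    b = F.normalize(regions_b, dim=1)
    logits = a @ b.t() / temperature              # (R, R) cosine similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)       # diagonal = positive pairs
```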
arXiv Detail & Related papers (2021-10-11T11:45:00Z)
- Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training
for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with wide ranges of application potentials.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z)
- PIT: Position-Invariant Transform for Cross-FoV Domain Adaptation [53.428312630479816]
We observe that the Field of View (FoV) gap induces noticeable instance appearance differences between the source and target domains.
Motivated by these observations, we propose the Position-Invariant Transform (PIT) to better align images in different domains.
arXiv Detail & Related papers (2021-08-16T15:16:47Z)
- Channel-wise Alignment for Adaptive Object Detection [66.76486843397267]
Generic object detection has been immensely promoted by the development of deep convolutional neural networks.
Existing methods for this task usually focus on high-level alignment based on the whole image or the object of interest.
In this paper, we realize adaptation from a thoroughly different perspective, i.e., channel-wise alignment.
arXiv Detail & Related papers (2020-09-07T02:42:18Z)
- Bi-Directional Generation for Unsupervised Domain Adaptation [61.73001005378002]
Unsupervised domain adaptation leverages well-established source-domain information to handle an unlabeled target domain.
Conventional methods that forcefully reduce the domain discrepancy in the latent space risk destroying the intrinsic data structure.
We propose a Bi-Directional Generation domain adaptation model with consistent classifiers interpolating two intermediate domains to bridge source and target domains.
arXiv Detail & Related papers (2020-02-12T09:45:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.