Cross-Domain Object Detection with Mean-Teacher Transformer
- URL: http://arxiv.org/abs/2205.01643v1
- Date: Tue, 3 May 2022 17:11:55 GMT
- Title: Cross-Domain Object Detection with Mean-Teacher Transformer
- Authors: Jinze Yu, Jiaming Liu, Xiaobao Wei, Haoyi Zhou, Yohei Nakata, Denis
Gudovskiy, Tomoyuki Okuno, Jianxin Li, Kurt Keutzer, Shanghang Zhang
- Abstract summary: We propose an end-to-end cross-domain detection transformer based on mean teacher knowledge transfer (MTKT).
We design three levels of source-target feature alignment strategies based on the Transformer architecture: domain query-based feature alignment (DQFA), bi-level-graph-based prototype alignment (BGPA), and token-wise image feature alignment (TIFA).
Our method achieves state-of-the-art performance on three domain adaptation scenarios; in particular, the result on the Sim10k-to-Cityscapes scenario improves markedly from 52.6 mAP to 57.9 mAP.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, DEtection TRansformer (DETR), an end-to-end object detection
pipeline, has achieved promising performance. However, it requires large-scale
labeled data and suffers from domain shift, especially when no labeled data is
available in the target domain. To solve this problem, we propose an end-to-end
cross-domain detection transformer based on the mean teacher knowledge transfer
(MTKT), which transfers knowledge between domains via pseudo labels. To improve
the quality of pseudo labels in the target domain, which is a crucial factor
for better domain adaptation, we design three levels of source-target feature
alignment strategies based on the architecture of the Transformer, including
domain query-based feature alignment (DQFA), bi-level-graph-based prototype
alignment (BGPA), and token-wise image feature alignment (TIFA). These three
levels of feature alignment match the global, local, and instance features
between source and target, respectively. With these strategies, more accurate
pseudo labels can be obtained, and knowledge can be better transferred from
source to target, thus improving the cross-domain capability of the detection
transformer. Extensive experiments demonstrate that our proposed method
achieves state-of-the-art performance on three domain adaptation scenarios;
in particular, the result on the Sim10k-to-Cityscapes scenario improves
markedly from 52.6 mAP to 57.9 mAP. Code will be released.
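At the core of the mean-teacher scheme described above, a teacher detector is maintained as an exponential moving average (EMA) of the student's weights, and the teacher's confident predictions on target-domain images serve as pseudo labels for training the student. A minimal, framework-agnostic sketch of that loop is below; the decay rate, the confidence threshold, and the plain-float weights are illustrative assumptions, not values or structures from the paper (in practice the weights would be network tensors, e.g. PyTorch parameters).

```python
# Sketch of the mean-teacher update used for pseudo-label knowledge
# transfer. Weights are represented as a dict of floats for clarity.

def ema_update(teacher, student, decay=0.999):
    """Move each teacher weight toward the student by exponential moving average."""
    return {k: decay * teacher[k] + (1.0 - decay) * student[k] for k in teacher}

def filter_pseudo_labels(detections, conf_thresh=0.7):
    """Keep only teacher detections confident enough to serve as pseudo labels."""
    return [d for d in detections if d["score"] >= conf_thresh]

# Toy example: one EMA step, then filtering the teacher's raw detections.
teacher = {"w": 1.0}
student = {"w": 0.0}
teacher = ema_update(teacher, student, decay=0.9)

dets = [{"box": (0, 0, 10, 10), "score": 0.95},
        {"box": (5, 5, 8, 8), "score": 0.40}]
pseudo = filter_pseudo_labels(dets)
print(teacher["w"], len(pseudo))
```

Because the teacher changes slowly, its target-domain predictions are more stable than the student's, which is what makes confidence-filtered pseudo labels usable at all; the three alignment strategies then reduce the domain gap so that those confidences are better calibrated.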
Related papers
- DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment [7.768332621617199]
We introduce a strong DETR-based detector named Domain Adaptive detection TRansformer (DATR) for unsupervised domain adaptation of object detection.
Our proposed DATR incorporates a mean-teacher based self-training framework, utilizing pseudo-labels generated by the teacher model to further mitigate domain bias.
Experiments demonstrate superior performance and generalization capabilities of our proposed DATR in multiple domain adaptation scenarios.
arXiv Detail & Related papers (2024-05-20T03:48:45Z)
- Exploring Consistency in Cross-Domain Transformer for Domain Adaptive Semantic Segmentation [51.10389829070684]
Domain gap can cause discrepancies in self-attention.
Due to this gap, the transformer attends to spurious regions or pixels, which deteriorates accuracy on the target domain.
We propose adaptation on attention maps with cross-domain attention layers.
arXiv Detail & Related papers (2022-11-27T02:40:33Z)
- CA-UDA: Class-Aware Unsupervised Domain Adaptation with Optimal Assignment and Pseudo-Label Refinement [84.10513481953583]
Unsupervised domain adaptation (UDA) focuses on the selection of good pseudo-labels as surrogates for the missing labels in the target data.
Source-domain bias that deteriorates the pseudo-labels can still exist, since a network shared between the source and target domains is typically used for pseudo-label selection.
We propose CA-UDA to improve the quality of the pseudo-labels and the UDA results with optimal assignment, a pseudo-label refinement strategy, and class-aware domain alignment.
arXiv Detail & Related papers (2022-05-26T18:45:04Z)
- Improving Transferability for Domain Adaptive Detection Transformers [34.61314708197079]
This paper aims to build a simple but effective baseline with a DETR-style detector under domain shift settings.
On one hand, mitigating the domain shift on the backbone and the decoder output features yields favorable results.
On the other, advanced domain alignment methods in both parts further enhance performance.
arXiv Detail & Related papers (2022-04-29T16:27:10Z)
- CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation [44.06904757181245]
Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a labeled source domain to a different unlabeled target domain.
One fundamental problem for category-level UDA is the production of pseudo labels for samples in the target domain.
We design a two-way center-aware labeling algorithm to produce pseudo labels for target samples.
Along with the pseudo labels, a weight-sharing triple-branch transformer framework is proposed to apply self-attention and cross-attention for source/target feature learning and source-target domain alignment.
arXiv Detail & Related papers (2021-09-13T17:59:07Z)
- Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers [141.70707071815653]
We propose a novel Sequence Feature Alignment (SFA) method that is specially designed for the adaptation of detection transformers.
SFA consists of a domain query-based feature alignment (DQFA) module and a token-wise feature alignment (TDA) module.
Experiments on three challenging benchmarks show that SFA outperforms state-of-the-art domain adaptive object detection methods.
arXiv Detail & Related papers (2021-07-27T07:17:12Z)
- Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation [74.71931918541748]
We propose an instance affinity based criterion for source to target transfer during adaptation, called ILA-DA.
We first propose a reliable and efficient method to extract similar and dissimilar samples across source and target, and utilize a multi-sample contrastive loss to drive the domain alignment process.
We verify the effectiveness of ILA-DA by observing consistent improvements in accuracy over popular domain adaptation approaches on a variety of benchmark datasets.
arXiv Detail & Related papers (2021-04-03T01:33:14Z)
- Disentanglement-based Cross-Domain Feature Augmentation for Effective Unsupervised Domain Adaptive Person Re-identification [87.72851934197936]
Unsupervised domain adaptive (UDA) person re-identification (ReID) aims to transfer the knowledge from the labeled source domain to the unlabeled target domain for person matching.
One challenge is how to generate target domain samples with reliable labels for training.
We propose a Disentanglement-based Cross-Domain Feature Augmentation strategy.
arXiv Detail & Related papers (2021-03-25T15:28:41Z)
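Several of the listed methods (e.g. CDTrans and CA-UDA) rest on the idea of labeling target samples by their proximity to per-class feature centers computed on the labeled source domain. A toy, one-directional sketch of that idea follows; it is a simplification for illustration, not the papers' actual algorithms, and the feature vectors, class names, and squared-Euclidean distance are all assumptions of this example.

```python
# Toy sketch of center-based pseudo labeling: average the labeled
# source features per class, then assign each unlabeled target feature
# the class of its nearest center (squared Euclidean distance).

def class_centers(features, labels):
    """Average source feature vectors per class label."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def assign_pseudo_labels(target_features, centers):
    """Label each target feature with the class of its nearest center."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(centers, key=lambda c: sqdist(f, centers[c]))
            for f in target_features]

# Toy example: two source classes in 2-D, two unlabeled target points.
src = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [0.8, 1.0]]
lab = ["cat", "cat", "dog", "dog"]
centers = class_centers(src, lab)
print(assign_pseudo_labels([[0.1, 0.1], [0.9, 0.9]], centers))  # ['cat', 'dog']
```

The two-way refinements in the papers above exist precisely because this naive one-directional assignment inherits source-domain bias; center assignment in both directions, optimal assignment, or refinement steps are used to clean the resulting pseudo labels.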
This list is automatically generated from the titles and abstracts of the papers in this site.