Cross Domain Object Detection via Multi-Granularity Confidence Alignment based Mean Teacher
- URL: http://arxiv.org/abs/2407.07780v1
- Date: Wed, 10 Jul 2024 15:56:24 GMT
- Title: Cross Domain Object Detection via Multi-Granularity Confidence Alignment based Mean Teacher
- Authors: Jiangming Chen, Li Liu, Wanxia Deng, Zhen Liu, Yu Liu, Yingmei Wei, Yongxiang Liu
- Abstract summary: Cross domain object detection learns an object detector for an unlabeled target domain by transferring knowledge from an annotated source domain.
In this study, we find that confidence misalignment of the predictions, including category-level overconfidence, instance-level task confidence inconsistency, and image-level confidence misfocusing, leads to suboptimal performance on the target domain.
- Score: 14.715398100791559
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross domain object detection learns an object detector for an unlabeled target domain by transferring knowledge from an annotated source domain. Promising results have been achieved via Mean Teacher; however, pseudo labeling, which is the bottleneck of mutual learning, remains to be further explored. In this study, we find that confidence misalignment of the predictions, including category-level overconfidence, instance-level task confidence inconsistency, and image-level confidence misfocusing, injects noisy pseudo labels into the training process and leads to suboptimal performance on the target domain. To tackle this issue, we present a novel general framework termed Multi-Granularity Confidence Alignment Mean Teacher (MGCAMT) for cross domain object detection, which alleviates confidence misalignment across the category, instance, and image levels simultaneously to obtain high-quality pseudo supervision for better teacher-student learning. Specifically, to align confidence with accuracy at the category level, we propose Classification Confidence Alignment (CCA), which models category uncertainty based on Evidential Deep Learning (EDL) and filters out labels with incorrect categories via an uncertainty-aware selection strategy. Furthermore, to mitigate the instance-level misalignment between classification and localization, we design Task Confidence Alignment (TCA) to enhance the interaction between the two task branches and allow each classification feature to adaptively locate the optimal feature for regression. Finally, we develop imagery Focusing Confidence Alignment (FCA), which adopts another way of pseudo label learning: we use the original outputs from the Mean Teacher network for supervised learning without label assignment, so as to concentrate on holistic information in the target image. These three procedures benefit from each other from a cooperative learning perspective.
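To make two ingredients named in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the standard Mean Teacher exponential moving average update and the usual EDL formulation, in which non-negative evidence per class parameterizes a Dirichlet distribution and the uncertainty is the number of classes divided by the Dirichlet strength. The function names, the softplus evidence mapping, and both thresholds in `select_pseudo_labels` are hypothetical choices made only for illustration; the abstract does not specify the actual selection rule.

```python
import math


def ema_update(teacher, student, momentum=0.999):
    """Mean Teacher step: teacher <- m * teacher + (1 - m) * student."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def softplus(z):
    """Numerically stable softplus, used here to keep evidence non-negative."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def edl_outputs(logits):
    """Return (expected class probabilities, uncertainty) under a Dirichlet.

    Assumes evidence = softplus(logit), alpha = evidence + 1,
    strength S = sum(alpha), probability = alpha / S, uncertainty = K / S.
    """
    alpha = [softplus(z) + 1.0 for z in logits]
    strength = sum(alpha)
    probs = [a / strength for a in alpha]
    uncertainty = len(alpha) / strength
    return probs, uncertainty


def select_pseudo_labels(batch_logits, max_uncertainty=0.5, min_prob=0.6):
    """Keep teacher predictions that are both confident and low-uncertainty.

    Both thresholds are hypothetical values chosen for this sketch.
    Returns a list of (index, label, probability, uncertainty) tuples.
    """
    kept = []
    for idx, logits in enumerate(batch_logits):
        probs, u = edl_outputs(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        if u <= max_uncertainty and probs[label] >= min_prob:
            kept.append((idx, label, probs[label], u))
    return kept


if __name__ == "__main__":
    teacher_logits = [[10.0, -10.0, -10.0], [0.0, 0.0, 0.0]]
    for idx, label, prob, u in select_pseudo_labels(teacher_logits):
        print(f"box {idx}: class {label}, prob {prob:.3f}, uncertainty {u:.3f}")
```

In this toy input, the first prediction carries strong evidence for one class and is kept, while the second carries almost no evidence, so its uncertainty stays above the threshold and it is dropped. A plain softmax would only report a flat distribution for the second case; the evidential view gives it an explicit uncertainty score to filter on.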
Related papers
- Domain Adaptive Object Detection via Balancing Between Self-Training and Adversarial Learning [19.81071116581342]
Deep learning based object detectors struggle to generalize to a new target domain bearing significant variations in object and background.
Current methods align domains by using image or instance-level adversarial feature alignment.
We propose to leverage the model's predictive uncertainty to strike the right balance between adversarial feature alignment and class-level alignment.
arXiv Detail & Related papers (2023-11-08T16:40:53Z)
- Bi-discriminator Domain Adversarial Neural Networks with Class-Level Gradient Alignment [87.8301166955305]
We propose a novel bi-discriminator domain adversarial neural network with class-level gradient alignment.
BACG resorts to gradient signals and second-order probability estimation for better alignment of domain distributions.
In addition, inspired by contrastive learning, we develop a memory bank-based variant, i.e. Fast-BACG, which can greatly shorten the training process.
arXiv Detail & Related papers (2023-10-21T09:53:17Z)
- Contrastive Mean Teacher for Domain Adaptive Object Detectors [20.06919799819326]
Mean-teacher self-training is a powerful paradigm in unsupervised domain adaptation for object detection, but it struggles with low-quality pseudo-labels.
We propose Contrastive Mean Teacher (CMT) -- a unified, general-purpose framework with the two paradigms naturally integrated to maximize beneficial learning signals.
CMT leads to new state-of-the-art target-domain performance: 51.9% mAP on Foggy Cityscapes, outperforming the previously best by 2.1% mAP.
arXiv Detail & Related papers (2023-05-04T17:55:17Z)
- Predicting Class Distribution Shift for Reliable Domain Adaptive Object Detection [2.5193191501662144]
Unsupervised Domain Adaptive Object Detection (UDA-OD) uses unlabelled data to improve the reliability of robotic vision systems in open-world environments.
Previous approaches to UDA-OD based on self-training have been effective in overcoming changes in the general appearance of images.
We propose a framework for explicitly addressing class distribution shift to improve pseudo-label reliability in self-training.
arXiv Detail & Related papers (2023-02-13T00:46:34Z)
- Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
- Boosting Unsupervised Domain Adaptation with Soft Pseudo-label and Curriculum Learning [19.903568227077763]
Unsupervised domain adaptation (UDA) improves classification performance on an unlabeled target domain by leveraging data from a fully labeled source domain.
We propose a model-agnostic two-stage learning framework, which greatly reduces flawed model predictions using a soft pseudo-label strategy.
At the second stage, we propose a curriculum learning strategy to adaptively control the weighting between losses from the two domains.
arXiv Detail & Related papers (2021-12-03T14:47:32Z)
- UDA-COPE: Unsupervised Domain Adaptation for Category-level Object Pose Estimation [84.16372642822495]
We propose an unsupervised domain adaptation (UDA) method for category-level object pose estimation, called UDA-COPE.
Inspired by the recent multi-modal UDA techniques, the proposed method exploits a teacher-student self-supervised learning scheme to train a pose estimation network without using target domain labels.
arXiv Detail & Related papers (2021-11-24T16:00:48Z)
- Synergizing between Self-Training and Adversarial Learning for Domain Adaptive Object Detection [11.091890625685298]
We study adapting trained object detectors to unseen domains manifesting significant variations of object appearance, viewpoints and backgrounds.
We propose to leverage model predictive uncertainty to strike the right balance between adversarial feature alignment and class-level alignment.
arXiv Detail & Related papers (2021-10-01T08:10:00Z)
- Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation [74.71931918541748]
We propose an instance affinity based criterion for source to target transfer during adaptation, called ILA-DA.
We first propose a reliable and efficient method to extract similar and dissimilar samples across source and target, and utilize a multi-sample contrastive loss to drive the domain alignment process.
We verify the effectiveness of ILA-DA by observing consistent improvements in accuracy over popular domain adaptation approaches on a variety of benchmark datasets.
arXiv Detail & Related papers (2021-04-03T01:33:14Z)
- Your Classifier can Secretly Suffice Multi-Source Domain Adaptation [72.47706604261992]
Multi-Source Domain Adaptation (MSDA) deals with the transfer of task knowledge from multiple labeled source domains to an unlabeled target domain.
We present a different perspective to MSDA wherein deep models are observed to implicitly align the domains under label supervision.
arXiv Detail & Related papers (2021-03-20T12:44:13Z)
- Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification [64.37745443119942]
This paper jointly enforces visual and temporal consistency in the combination of a local one-hot classification and a global multi-class classification.
Experimental results on three large-scale ReID datasets demonstrate the superiority of the proposed method in both unsupervised and unsupervised domain adaptive ReID tasks.
arXiv Detail & Related papers (2020-07-21T14:31:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.