Composite Active Learning: Towards Multi-Domain Active Learning with
Theoretical Guarantees
- URL: http://arxiv.org/abs/2402.02110v1
- Date: Sat, 3 Feb 2024 10:22:18 GMT
- Authors: Guang-Yuan Hao, Hengguan Huang, Haotian Wang, Jie Gao, Hao Wang
- Abstract summary: Active learning (AL) aims to improve model performance within a fixed labeling budget by choosing the most informative data points to label.
We propose the first general method, dubbed composite active learning (CAL), for multi-domain AL.
Our theoretical analysis shows that our method achieves a better error bound compared to current AL methods.
- Score: 12.316113075760743
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Active learning (AL) aims to improve model performance within a fixed
labeling budget by choosing the most informative data points to label. Existing
AL methods focus on the single-domain setting, where all data come from the same
domain (e.g., the same dataset). However, many real-world tasks involve
multiple domains. For example, in visual recognition, it is often desirable to
train an image classifier that works across different environments (e.g.,
different backgrounds), where images from each environment constitute one
domain. Such a multi-domain AL setting is challenging for prior methods because
they (1) ignore the similarity among different domains when assigning the
labeling budget and (2) fail to handle the distribution shift of data across
different domains. In this paper, we propose the first general method, dubbed composite
active learning (CAL), for multi-domain AL. Our approach explicitly considers
the domain-level and instance-level information in the problem; CAL first
assigns domain-level budgets according to domain-level importance, which is
estimated by optimizing an upper error bound that we develop; with the
domain-level budgets, CAL then leverages a certain instance-level query
strategy to select samples to label from each domain. Our theoretical analysis
shows that our method achieves a better error bound than current AL
methods. Our empirical results demonstrate that our approach significantly
outperforms the state-of-the-art AL methods on both synthetic and real-world
multi-domain datasets. Code is available at
https://github.com/Wang-ML-Lab/multi-domain-active-learning.
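To make the two-step structure concrete, here is a minimal sketch of the CAL pipeline. It is not the authors' algorithm (see the linked repository for that): the domain-level importance weights are assumed to be given, whereas the paper estimates them by optimizing its derived upper error bound, and entropy-based uncertainty sampling stands in for the unspecified instance-level query strategy. All function names are illustrative.

```python
# Hypothetical sketch of CAL's two steps: (1) split the labeling budget
# across domains by importance, (2) run an instance-level query per domain.
import numpy as np

def allocate_domain_budgets(importance, total_budget):
    """Split the labeling budget across domains in proportion to
    domain-level importance weights (assumed given here; the paper
    estimates them by optimizing an upper error bound)."""
    weights = np.asarray(importance, dtype=float)
    weights = weights / weights.sum()
    budgets = np.floor(weights * total_budget).astype(int)
    # Hand out any rounding remainder to the most important domains.
    remainder = total_budget - budgets.sum()
    for i in np.argsort(-weights)[:remainder]:
        budgets[i] += 1
    return budgets

def entropy_query(probs, k):
    """Instance-level query stand-in: pick the k most uncertain
    (highest-entropy) unlabeled points from class probabilities (n, c)."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(-entropy)[:k]

# Usage: three domains, a budget of 10 labels, dummy model outputs.
rng = np.random.default_rng(0)
importance = [0.5, 0.3, 0.2]                      # assumed, not estimated
budgets = allocate_domain_budgets(importance, total_budget=10)
for d, k in enumerate(budgets):
    probs = rng.dirichlet(np.ones(4), size=100)   # fake softmax outputs
    picked = entropy_query(probs, k)
    print(f"domain {d}: label {k} points, e.g. indices {picked}")
```

In this sketch the budget split is a simple proportional allocation; the point is only the control flow of domain-level allocation followed by per-domain instance selection.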
Related papers
- Multi-Domain Long-Tailed Learning by Augmenting Disentangled
Representations [80.76164484820818]
Many real-world classification problems suffer from an inescapable long-tailed class-imbalance issue.
We study this multi-domain long-tailed learning problem and aim to produce a model that generalizes well across all classes and domains.
The proposed method, TALLY, builds on a selective balanced sampling strategy and achieves this by mixing the semantic representation of one example with the domain-associated nuisances of another.
arXiv Detail & Related papers (2022-10-25T21:54:26Z)
- MultiMatch: Multi-task Learning for Semi-supervised Domain Generalization [55.06956781674986]
We address the semi-supervised domain generalization (SSDG) task, where only a few labeled samples are available in each source domain.
We propose MultiMatch, which extends FixMatch to a multi-task learning framework to produce high-quality pseudo-labels for SSDG.
A series of experiments validates the effectiveness of the proposed method, which outperforms existing semi-supervised and SSDG methods on several benchmark DG datasets.
arXiv Detail & Related papers (2022-08-11T14:44:33Z)
- Aligning Domain-specific Distribution and Classifier for Cross-domain Classification from Multiple Sources [25.204055330850164]
We propose a new framework with two alignment stages for Unsupervised Domain Adaptation.
Our method achieves strong results on popular benchmark datasets for image classification.
arXiv Detail & Related papers (2022-01-04T06:35:11Z)
- Adaptive Methods for Aggregated Domain Generalization [26.215904177457997]
In many settings, privacy concerns prohibit obtaining domain labels for the training data samples.
We propose a domain-adaptive approach to this problem, which operates in two steps.
Our approach achieves state-of-the-art performance on a variety of domain generalization benchmarks without using domain labels.
arXiv Detail & Related papers (2021-12-09T08:57:01Z)
- Multi-Level Features Contrastive Networks for Unsupervised Domain Adaptation [6.934905764152813]
Unsupervised domain adaptation aims to train a model from the labeled source domain to make predictions on the unlabeled target domain.
Existing methods tend to align the two domains directly at the domain level, or perform class-level domain alignment based on deep features.
In this paper, we build on the class-level alignment approach.
arXiv Detail & Related papers (2021-09-14T09:23:27Z)
- Cross-domain Contrastive Learning for Unsupervised Domain Adaptation [108.63914324182984]
Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a fully-labeled source domain to a different unlabeled target domain.
We build upon contrastive self-supervised learning to align features so as to reduce the domain discrepancy between training and testing sets.
arXiv Detail & Related papers (2021-06-10T06:32:30Z)
- Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation [91.58443042554903]
We propose an end-to-end Prototypical Cross-domain Self-Supervised Learning (PCS) framework for Few-shot Unsupervised Domain Adaptation (FUDA).
PCS not only performs cross-domain low-level feature alignment, but it also encodes and aligns semantic structures in the shared embedding space across domains.
Compared with state-of-the-art methods, PCS improves the mean classification accuracy over different domain pairs on FUDA by 10.5%, 3.5%, 9.0%, and 13.2% on Office, Office-Home, VisDA-2017, and DomainNet, respectively.
arXiv Detail & Related papers (2021-03-31T02:07:42Z)
- Inferring Latent Domains for Unsupervised Deep Domain Adaptation [54.963823285456925]
Unsupervised Domain Adaptation (UDA) refers to the problem of learning a model in a target domain where labeled data are not available.
This paper introduces a novel deep architecture which addresses the problem of UDA by automatically discovering latent domains in visual datasets.
We evaluate our approach on publicly available benchmarks, showing that it outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2021-03-25T14:33:33Z)
- Mind the Gap: Enlarging the Domain Gap in Open Set Domain Adaptation [65.38975706997088]
Open set domain adaptation (OSDA) assumes the presence of unknown classes in the target domain.
We show that existing state-of-the-art methods suffer a considerable performance drop in the presence of larger domain gaps.
We propose a novel framework to specifically address the larger domain gaps.
arXiv Detail & Related papers (2020-03-08T14:20:24Z)