Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from
Mixture-of-Experts
- URL: http://arxiv.org/abs/2210.03885v1
- Date: Sat, 8 Oct 2022 02:28:10 GMT
- Title: Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from
Mixture-of-Experts
- Authors: Tao Zhong, Zhixiang Chi, Li Gu, Yang Wang, Yuanhao Yu, Jin Tang
- Abstract summary: Most existing methods perform training on multiple source domains using a single model.
We propose a novel framework for unsupervised test-time adaptation, which is formulated as a knowledge distillation process.
- Score: 33.21435044949033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we tackle the problem of domain shift. Most existing methods
perform training on multiple source domains using a single model, and the same
trained model is used on all unseen target domains. Such solutions are
sub-optimal, as each target domain exhibits its own characteristics to which
the model is never adapted. Furthermore, expecting a single model to learn
extensive knowledge from multiple source domains is unrealistic: the model
becomes biased toward learning only domain-invariant features, which can
result in negative knowledge transfer. In this work, we propose a novel framework for
unsupervised test-time adaptation, which is formulated as a knowledge
distillation process to address domain shift. Specifically, we incorporate
Mixture-of-Experts (MoE) as teachers, where each expert is trained separately
on a different source domain to maximize its specialization. Given a test-time
target domain, a small set of unlabeled data is sampled to query knowledge
from the MoE. As the source domains are correlated with the target domain, a
transformer-based aggregator then combines the domain knowledge by examining
the interconnection among them. The output is treated as a supervision signal
to adapt a student prediction network toward the target domain. We further
employ meta-learning to train the aggregator to distill positive knowledge
and the student network to adapt quickly. Extensive experiments
demonstrate that the proposed method outperforms the state-of-the-art and
validates the effectiveness of each proposed component. Our code is available
at https://github.com/n3il666/Meta-DMoE.
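The pipeline described in the abstract (query the MoE with a small unlabeled target batch, aggregate the experts' knowledge, distill the result into a student) can be sketched as follows. This is a minimal illustrative NumPy sketch, not the authors' implementation: the deep expert networks, transformer-based aggregator, and meta-learned initialization are replaced here by linear maps, softmax attention with a hypothetical query vector, and plain gradient descent on a distillation (MSE) loss.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_OUT, N_EXPERTS = 8, 4, 3

# Each "expert" is a linear map, standing in for a network trained
# separately on one source domain.
experts = [rng.normal(size=(D_IN, D_OUT)) for _ in range(N_EXPERTS)]

def aggregate(x, query):
    """Combine expert outputs via softmax attention over a (hypothetical)
    domain query, standing in for the transformer-based aggregator."""
    outs = np.stack([x @ W for W in experts])          # (E, B, D_OUT)
    scores = np.array([np.mean(o * query) for o in outs])
    w = np.exp(scores - scores.max())
    w /= w.sum()                                       # attention weights
    return np.tensordot(w, outs, axes=1)               # (B, D_OUT)

def distill_step(W_s, x, teacher_out, lr=0.01):
    """One gradient step moving the student toward the aggregated teacher
    (gradient of the batch-mean squared error, up to a constant factor)."""
    pred = x @ W_s
    grad = 2.0 * x.T @ (pred - teacher_out) / len(x)
    return W_s - lr * grad

# Small unlabeled batch sampled from the test-time target domain.
x = rng.normal(size=(16, D_IN))
query = rng.normal(size=(16, D_OUT))
teacher_out = aggregate(x, query)                      # supervision signal

# Adapt the student toward the target domain via distillation.
W_student = rng.normal(size=(D_IN, D_OUT))
losses = []
for _ in range(50):
    losses.append(float(np.mean((x @ W_student - teacher_out) ** 2)))
    W_student = distill_step(W_student, x, teacher_out)
```

With a small learning rate the distillation loss decreases monotonically on this quadratic objective; the real method additionally meta-learns the aggregator and student initialization so that this adaptation is fast and distills only positive knowledge.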
Related papers
- Adapting to Distribution Shift by Visual Domain Prompt Generation [34.19066857066073]
We adapt a model at test time using a small amount of unlabeled data to address distribution shifts.
We build a knowledge bank to learn the transferable knowledge from source domains.
The proposed method outperforms previous work on 5 large-scale benchmarks including WILDS and DomainNet.
arXiv Detail & Related papers (2024-05-05T02:44:04Z)
- Revisiting the Domain Shift and Sample Uncertainty in Multi-source Active Domain Transfer [69.82229895838577]
Active Domain Adaptation (ADA) aims to maximally boost model adaptation in a new target domain by actively selecting a limited number of target data to annotate.
This setting neglects the more practical scenario where training data are collected from multiple sources.
This motivates us to target a new and challenging setting of knowledge transfer that extends ADA from a single source domain to multiple source domains.
arXiv Detail & Related papers (2023-11-21T13:12:21Z)
- Meta-causal Learning for Single Domain Generalization [102.53303707563612]
Single domain generalization aims to learn a model from a single training domain (source domain) and apply it to multiple unseen test domains (target domains).
Existing methods focus on expanding the distribution of the training domain to cover the target domains, but without estimating the domain shift between the source and target domains.
We propose a new learning paradigm, namely simulate-analyze-reduce, which first simulates the domain shift by building an auxiliary domain as the target domain, then learns to analyze the causes of domain shift, and finally learns to reduce the domain shift for model adaptation.
arXiv Detail & Related papers (2023-04-07T15:46:38Z)
- Bidirectional Domain Mixup for Domain Adaptive Semantic Segmentation [73.3083304858763]
This paper systematically studies the impact of mixup under the domain adaptive semantic segmentation task.
Specifically, we achieve domain mixup in two steps: cut and paste.
We provide extensive ablation experiments to empirically verify our main components of the framework.
arXiv Detail & Related papers (2023-03-17T05:22:44Z)
- Few-shot Image Generation via Adaptation-Aware Kernel Modulation [33.191479192580275]
Few-shot image generation (FSIG) aims to generate new and diverse samples given an extremely limited number of samples from a domain.
Recent work has addressed the problem using transfer learning approach, leveraging a GAN pretrained on a large-scale source domain dataset.
We propose Adaptation-Aware kernel Modulation (AdAM) to address general FSIG under different degrees of source-target domain proximity.
arXiv Detail & Related papers (2022-10-29T10:26:40Z)
- MultiMatch: Multi-task Learning for Semi-supervised Domain Generalization [55.06956781674986]
We address the semi-supervised domain generalization task, where only a small amount of labeled data is available in each source domain.
We propose MultiMatch, which extends FixMatch to a multi-task learning framework to produce high-quality pseudo-labels for SSDG.
A series of experiments validates the effectiveness of the proposed method, which outperforms existing semi-supervised methods and the SSDG method on several benchmark DG datasets.
arXiv Detail & Related papers (2022-08-11T14:44:33Z)
- META: Mimicking Embedding via oThers' Aggregation for Generalizable Person Re-identification [68.39849081353704]
Domain generalizable (DG) person re-identification (ReID) aims to test across unseen domains without access to the target domain data at training time.
This paper presents a new approach called Mimicking Embedding via oThers' Aggregation (META) for DG ReID.
arXiv Detail & Related papers (2021-12-16T08:06:50Z)
- Multilevel Knowledge Transfer for Cross-Domain Object Detection [26.105283273950942]
Domain shift is a well-known problem where a model trained on a particular domain (source) does not perform well when exposed to samples from a different domain (target).
In this work, we address the domain shift problem for the object detection task.
Our approach relies on gradually removing the domain shift between the source and the target domains.
arXiv Detail & Related papers (2021-08-02T15:24:40Z)
- Multi-Target Domain Adaptation with Collaborative Consistency Learning [105.7615147382486]
We propose a collaborative learning framework to achieve unsupervised multi-target domain adaptation.
The proposed method can effectively exploit rich structured information contained in both labeled source domain and multiple unlabeled target domains.
arXiv Detail & Related papers (2021-06-07T08:36:20Z)
- Contradistinguisher: A Vapnik's Imperative to Unsupervised Domain Adaptation [7.538482310185133]
We propose a model, referred to as Contradistinguisher, that learns contrastive features and jointly learns to contradistinguish the unlabeled target domain in an unsupervised way.
We achieve the state-of-the-art on Office-31 and VisDA-2017 datasets in both single-source and multi-source settings.
arXiv Detail & Related papers (2020-05-25T19:54:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.