Knowledge Distillation for BERT Unsupervised Domain Adaptation
- URL: http://arxiv.org/abs/2010.11478v2
- Date: Fri, 23 Oct 2020 02:12:06 GMT
- Title: Knowledge Distillation for BERT Unsupervised Domain Adaptation
- Authors: Minho Ryu and Kichun Lee
- Abstract summary: A pre-trained language model, BERT, has brought significant performance improvements across a range of natural language processing tasks.
We propose a simple but effective unsupervised domain adaptation method, adversarial adaptation with distillation (AAD).
We evaluate our approach in the task of cross-domain sentiment classification on 30 domain pairs.
- Score: 2.969705152497174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A pre-trained language model, BERT, has brought significant performance
improvements across a range of natural language processing tasks. Since the
model is trained on a large corpus of diverse topics, it shows robust
performance for domain shift problems in which data distributions at training
(source data) and testing (target data) differ while sharing similarities.
Despite its great improvements compared to previous models, it still suffers
from performance degradation due to domain shifts. To mitigate such problems,
we propose a simple but effective unsupervised domain adaptation method,
adversarial adaptation with distillation (AAD), which combines the adversarial
discriminative domain adaptation (ADDA) framework with knowledge distillation.
We evaluate our approach in the task of cross-domain sentiment classification
on 30 domain pairs, advancing the state-of-the-art performance for unsupervised
domain adaptation in text sentiment classification.
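To make the objective concrete, here is a minimal PyTorch sketch of the two loss terms AAD combines: an ADDA-style adversarial loss that pushes target features to fool a domain discriminator, and a temperature-scaled distillation loss that keeps the adapting encoder close to the frozen source model. The toy encoders, the temperature `T`, and the equal loss weighting are illustrative assumptions, not the paper's exact BERT configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

hidden, n_classes, T = 128, 2, 10.0  # T: distillation temperature (assumed)

src_encoder = nn.Sequential(nn.Linear(768, hidden), nn.ReLU())  # frozen after source training
tgt_encoder = nn.Sequential(nn.Linear(768, hidden), nn.ReLU())  # initialized from src_encoder
classifier = nn.Linear(hidden, n_classes)                       # frozen after source training
discriminator = nn.Sequential(nn.Linear(hidden, 64), nn.ReLU(), nn.Linear(64, 1))

def aad_step(x_src, x_tgt):
    """One encoder-update step on a source batch and an unlabeled target batch."""
    with torch.no_grad():
        teacher_logits = classifier(src_encoder(x_src))  # soft labels from the source model

    # Adversarial term (ADDA): target features should look source-like to the discriminator.
    d_tgt = discriminator(tgt_encoder(x_tgt))
    loss_adv = F.binary_cross_entropy_with_logits(d_tgt, torch.ones_like(d_tgt))

    # Distillation term: softened teacher predictions supervise the adapting encoder
    # on the same source inputs, discouraging catastrophic drift during adaptation.
    student_logits = classifier(tgt_encoder(x_src))
    loss_kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * T * T
    return loss_adv + loss_kd  # the discriminator's own update is omitted here

loss = aad_step(torch.randn(8, 768), torch.randn(8, 768))
```

In the full ADDA recipe the discriminator and the target encoder are updated in alternation; only the encoder half of that loop is shown above.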
Related papers
- Stratified Domain Adaptation: A Progressive Self-Training Approach for Scene Text Recognition [1.2878987353423252]
Unsupervised domain adaptation (UDA) has become increasingly prevalent in scene text recognition (STR).
We introduce the Stratified Domain Adaptation (StrDA) approach, which exploits the gradual escalation of the domain gap to order the learning process.
We propose a novel method for employing domain discriminators to estimate the out-of-distribution and domain discriminative levels of data samples.
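A minimal sketch of the stratification idea, assuming the domain discriminator's output is a usable proxy for how far each unlabeled sample sits from the source distribution; the quantile split and the stratum count are illustrative.

```python
import numpy as np

def stratify(scores: np.ndarray, n_strata: int = 3):
    """Split sample indices into strata of increasing domain-discriminator score."""
    order = np.argsort(scores)              # near-source (in-distribution) samples first
    return np.array_split(order, n_strata)  # progressive self-training order

scores = np.random.rand(1000)               # stand-in for per-sample discriminator outputs
for i, stratum in enumerate(stratify(scores)):
    # In StrDA's spirit: pseudo-label and retrain on each stratum in turn, easiest first.
    print(f"stratum {i}: {len(stratum)} samples")
```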
arXiv Detail & Related papers (2024-10-13T16:40:48Z)
- Open-Set Domain Adaptation with Visual-Language Foundation Models [51.49854335102149]
Unsupervised domain adaptation (UDA) has proven to be very effective in transferring knowledge from a source domain to a target domain with unlabeled data.
Open-set domain adaptation (ODA) has emerged as a potential solution for identifying target-domain classes unknown to the source domain during the training phase.
arXiv Detail & Related papers (2023-07-30T11:38:46Z)
- SALUDA: Surface-based Automotive Lidar Unsupervised Domain Adaptation [62.889835139583965]
We introduce an unsupervised auxiliary task of learning an implicit underlying surface representation simultaneously on source and target data.
As both domains share the same latent representation, the model is forced to accommodate discrepancies between the two sources of data.
Our experiments demonstrate that our method achieves a better performance than the current state of the art, both in real-to-real and synthetic-to-real scenarios.
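The core idea can be sketched as a shared backbone optimized with a supervised loss on source labels plus a surface-regression auxiliary loss on both domains; the toy point encoder, the signed-distance head, and the loss weighting below are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Linear(3, 64), nn.ReLU())  # stand-in for a real point-cloud encoder
seg_head = nn.Linear(64, 10)                            # semantic classes (source labels only)
sdf_head = nn.Linear(64, 1)                             # implicit surface: signed distance per point

def saluda_style_loss(x_src, y_src, sdf_src, x_tgt, sdf_tgt):
    z_src, z_tgt = backbone(x_src), backbone(x_tgt)
    loss_task = F.cross_entropy(seg_head(z_src), y_src)       # supervised, source only
    loss_aux = F.mse_loss(sdf_head(z_src), sdf_src) + \
               F.mse_loss(sdf_head(z_tgt), sdf_tgt)           # self-supervised, both domains
    return loss_task + loss_aux                               # one latent must serve both

# Surface targets would come from the lidar geometry itself; random stand-ins here.
x_s, y_s = torch.randn(32, 3), torch.randint(0, 10, (32,))
print(saluda_style_loss(x_s, y_s, torch.randn(32, 1), torch.randn(32, 3), torch.randn(32, 1)))
```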
arXiv Detail & Related papers (2023-04-06T17:36:23Z)
- Boosting Cross-Domain Speech Recognition with Self-Supervision [35.01508881708751]
Cross-domain performance of automatic speech recognition (ASR) can be severely hampered by a mismatch between training and testing distributions.
Previous work has shown that self-supervised learning (SSL) and pseudo-labeling (PL) are effective for UDA because they exploit self-supervision from unlabeled data.
This work presents a systematic UDA framework to fully utilize the unlabeled data with self-supervision in the pre-training and fine-tuning paradigm.
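As a hedged sketch of the pseudo-labeling half of such a framework: a source-trained model transcribes unlabeled target audio, and only high-confidence hypotheses are kept for the next fine-tuning round. The model interface, the confidence rule, and the threshold are all assumptions.

```python
import torch
import torch.nn as nn

def make_pseudo_labels(model: nn.Module, target_batches, threshold: float = 0.9):
    """Keep only target utterances the model transcribes with high confidence."""
    pseudo = []
    model.eval()
    with torch.no_grad():
        for audio in target_batches:
            log_probs = model(audio)                              # (frames, vocab)
            conf = log_probs.max(dim=-1).values.mean().exp()      # crude utterance confidence
            if conf > threshold:
                pseudo.append((audio, log_probs.argmax(dim=-1)))  # greedy transcript
    return pseudo  # fine-tune on these (audio, pseudo-transcript) pairs next round

dummy = nn.Sequential(nn.Linear(40, 30), nn.LogSoftmax(dim=-1))   # toy acoustic model
print(len(make_pseudo_labels(dummy, [torch.randn(50, 40)], threshold=0.0)))
```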
arXiv Detail & Related papers (2022-06-20T14:02:53Z)
- Semi-Supervised Adversarial Discriminative Domain Adaptation [18.15464889789663]
Domain adaptation is a potential way to train a powerful deep neural network that copes with the absence of labeled data.
In this paper, we propose an improved adversarial domain adaptation method called Semi-Supervised Adversarial Discriminative Domain Adaptation (SADDA).
arXiv Detail & Related papers (2021-09-27T12:52:50Z)
- Improving Transferability of Domain Adaptation Networks Through Domain Alignment Layers [1.3766148734487902]
Multi-source unsupervised domain adaptation (MSDA) aims at learning a predictor for an unlabeled domain by transferring weak knowledge from a bag of source models.
We propose to embed a Multi-Source version of DomaIn Alignment Layers (MS-DIAL) at different levels of the predictor.
Our approach can improve state-of-the-art MSDA methods, yielding relative gains of up to +30.64% on their classification accuracies.
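Domain alignment layers are commonly realized as domain-specific normalization with all other weights shared; the module below sketches that reading of MS-DIAL and is an illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DomainAlignmentLayer(nn.Module):
    """Per-domain BatchNorm statistics; all other network weights stay shared."""
    def __init__(self, num_features: int, num_domains: int):
        super().__init__()
        self.bns = nn.ModuleList(nn.BatchNorm1d(num_features) for _ in range(num_domains))

    def forward(self, x: torch.Tensor, domain: int) -> torch.Tensor:
        return self.bns[domain](x)  # route through the domain's own statistics

# Embedding "at different levels" means placing one such layer after several feature blocks.
layer = DomainAlignmentLayer(64, num_domains=3)
print(layer(torch.randn(8, 64), domain=1).shape)  # torch.Size([8, 64])
```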
arXiv Detail & Related papers (2021-09-06T18:41:19Z)
- UDALM: Unsupervised Domain Adaptation through Language Modeling [79.73916345178415]
We introduce UDALM, a fine-tuning procedure that uses a mixed classification and Masked Language Model loss.
Our experiments show that the performance of models trained with the mixed loss scales with the amount of available target data, and that the target-domain Masked Language Model loss can be effectively used as a stopping criterion.
Our method is evaluated on twelve domain pairs of the Amazon Reviews Sentiment dataset, yielding 91.74% accuracy, a 1.11% absolute improvement over the state-of-the-art.
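A minimal sketch of the mixed objective, assuming a shared encoder with a classification head for labeled source batches and an MLM head for masked target batches; the toy encoder, mask id 0, and the mixing weight `lam` stand in for the paper's BERT setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, dim, n_classes, lam = 1000, 64, 2, 0.5
encoder = nn.Sequential(nn.Embedding(vocab, dim), nn.Linear(dim, dim), nn.ReLU())
cls_head = nn.Linear(dim, n_classes)   # sentiment classifier (source)
mlm_head = nn.Linear(dim, vocab)       # masked-token predictor (target)

def mixed_loss(src_ids, src_labels, tgt_masked_ids, tgt_ids, tgt_mask):
    loss_cls = F.cross_entropy(cls_head(encoder(src_ids).mean(1)), src_labels)
    logits = mlm_head(encoder(tgt_masked_ids))                       # (batch, seq, vocab)
    loss_mlm = F.cross_entropy(logits[tgt_mask], tgt_ids[tgt_mask])  # masked positions only
    return lam * loss_cls + (1 - lam) * loss_mlm

src = torch.randint(0, vocab, (4, 16))
tgt = torch.randint(0, vocab, (4, 16))
mask = torch.rand(4, 16) < 0.15                   # 15% masking, BERT-style
print(mixed_loss(src, torch.randint(0, n_classes, (4,)), tgt.masked_fill(mask, 0), tgt, mask))
```

The target-side MLM term is also what the entry above describes as a stopping criterion: adaptation can halt once it stops improving.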
arXiv Detail & Related papers (2021-04-14T19:05:01Z)
- Selective Pseudo-Labeling with Reinforcement Learning for Semi-Supervised Domain Adaptation [116.48885692054724]
We propose a reinforcement learning based selective pseudo-labeling method for semi-supervised domain adaptation.
We develop a deep Q-learning model to select both accurate and representative pseudo-labeled instances.
Our proposed method is evaluated on several benchmark datasets for SSDA, and demonstrates superior performance to all the comparison methods.
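A rough sketch of the selection mechanic: a small Q-network scores each candidate pseudo-labeled instance, and instances whose "select" action value beats "skip" are kept. The state features, the two-action design, and the (omitted) reward and Q-learning updates are all assumptions.

```python
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))  # actions: [skip, select]

def select_instances(features: torch.Tensor) -> torch.Tensor:
    """Return indices of pseudo-labeled candidates the Q-network would select."""
    with torch.no_grad():
        q = q_net(features)                                  # (N, 2) action values
    return (q[:, 1] > q[:, 0]).nonzero(as_tuple=True)[0]

kept = select_instances(torch.randn(100, 64))                # 100 candidates, 64-d states
print(f"kept {len(kept)} of 100 candidates")                 # training of q_net omitted here
```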
arXiv Detail & Related papers (2020-12-07T03:37:38Z)
- Adaptively-Accumulated Knowledge Transfer for Partial Domain Adaptation [66.74638960925854]
Partial domain adaptation (PDA) deals with a realistic and challenging problem in which the source domain label space subsumes that of the target domain.
We propose an Adaptively-Accumulated Knowledge Transfer framework (A^2KT) to align the relevant categories across the two domains.
arXiv Detail & Related papers (2020-08-27T00:53:43Z)
- Sequential Domain Adaptation through Elastic Weight Consolidation for Sentiment Analysis [3.1473798197405944]
We propose a model-independent framework, Sequential Domain Adaptation (SDA).
Our experiments show that the proposed framework enables simple architectures such as CNNs to outperform complex state-of-the-art models in domain adaptation for sentiment analysis (SA).
In addition, we observe that a harder-first, Anti-Curriculum ordering of source domains leads to maximum performance.
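SDA builds on Elastic Weight Consolidation, whose penalty is easy to state: while training on a new domain, each parameter is anchored to its previous-domain value in proportion to its estimated Fisher importance. A minimal sketch, with a toy model and a stand-in Fisher estimate:

```python
import torch
import torch.nn as nn

def ewc_penalty(model: nn.Module, old_params: dict, fisher: dict, lam: float = 1.0):
    """0.5 * lam * sum_i F_i * (theta_i - theta*_i)^2, summed over all parameters."""
    loss = 0.0
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

model = nn.Linear(10, 2)
old = {n: p.detach().clone() for n, p in model.named_parameters()}
fish = {n: torch.ones_like(p) for n, p in model.named_parameters()}  # stand-in Fisher values
# Total objective on the next domain: task_loss + ewc_penalty(model, old, fish)
print(ewc_penalty(model, old, fish))  # zero before any further parameter updates
```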
arXiv Detail & Related papers (2020-07-02T15:21:56Z)
- Towards Fair Cross-Domain Adaptation via Generative Learning [50.76694500782927]
Domain Adaptation (DA) aims to adapt a model trained on a well-labeled source domain to an unlabeled target domain drawn from a different distribution.
We develop a novel Generative Few-shot Cross-domain Adaptation (GFCA) algorithm for fair cross-domain classification.
arXiv Detail & Related papers (2020-03-04T23:25:09Z)