DILBERT: Customized Pre-Training for Domain Adaptation with Category Shift, with an Application to Aspect Extraction
- URL: http://arxiv.org/abs/2109.00571v1
- Date: Wed, 1 Sep 2021 18:49:44 GMT
- Title: DILBERT: Customized Pre-Training for Domain Adaptation with Category Shift, with an Application to Aspect Extraction
- Authors: Entony Lekhtman, Yftah Ziser, Roi Reichart
- Abstract summary: A generic approach towards the pre-training procedure can naturally be sub-optimal in some cases.
This paper presents a new fine-tuning scheme for BERT, which aims to address the above challenges.
We name this scheme DILBERT: Domain Invariant Learning with BERT, and customize it for aspect extraction in the unsupervised domain adaptation setting.
- Score: 25.075552473110676
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rise of pre-trained language models has yielded substantial progress in
the vast majority of Natural Language Processing (NLP) tasks. However, a
generic approach towards the pre-training procedure can naturally be
sub-optimal in some cases. In particular, fine-tuning a pre-trained language
model on a source domain and then applying it to a different target domain
results in a sharp performance decline of the eventual classifier for many
source-target domain pairs. Moreover, in some NLP tasks, the output categories
substantially differ between domains, making adaptation even more challenging.
This, for example, happens in the task of aspect extraction, where the aspects
of interest of reviews of, e.g., restaurants or electronic devices may be very
different. This paper presents a new fine-tuning scheme for BERT, which aims to
address the above challenges. We name this scheme DILBERT: Domain Invariant
Learning with BERT, and customize it for aspect extraction in the unsupervised
domain adaptation setting. DILBERT harnesses the categorical information of
both the source and the target domains to guide the pre-training process
towards a more domain- and category-invariant representation, thus closing the
gap between the domains. We show that DILBERT yields substantial improvements
over state-of-the-art baselines while using a fraction of the unlabeled data,
particularly in more challenging domain adaptation setups.
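The abstract does not spell out the mechanics, but one concrete way categorical information could steer pre-training is to bias masked-language-model masking toward tokens that resemble the category names of both domains. The sketch below is an illustrative assumption along those lines (the embedding source, scoring rule, and masking policy are all hypothetical), not the paper's released implementation.

```python
import torch
import torch.nn.functional as F

# Hypothetical category-guided masking sketch (not DILBERT's released code).
# Idea: instead of masking tokens uniformly at random, preferentially mask
# tokens whose embeddings are close to source/target category names, steering
# the MLM objective toward category-relevant words.

def category_guided_mask(token_embs: torch.Tensor,
                         category_embs: torch.Tensor,
                         mask_ratio: float = 0.15) -> torch.Tensor:
    """Return a boolean mask over tokens, biased toward category-like tokens.

    token_embs:    (seq_len, dim) embeddings of the input tokens
    category_embs: (n_categories, dim) embeddings of category names
                   (e.g. "food", "service", "battery") from both domains
    """
    # Cosine similarity of every token to every category name.
    sims = F.cosine_similarity(token_embs.unsqueeze(1),
                               category_embs.unsqueeze(0), dim=-1)
    scores = sims.max(dim=1).values                  # closest-category score
    n_mask = max(1, int(mask_ratio * token_embs.size(0)))
    mask = torch.zeros(token_embs.size(0), dtype=torch.bool)
    mask[scores.topk(n_mask).indices] = True         # mask the top scorers
    return mask

# Toy usage with random stand-in embeddings; in practice these would come
# from a pre-trained encoder such as BERT.
tokens = torch.randn(20, 768)
categories = torch.randn(5, 768)
print(category_guided_mask(tokens, categories))
```

Masking exactly the category-like tokens forces the model to predict the words whose representations most need to transfer across domains, which matches the stated goal of a more category-invariant representation.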
Related papers
- Enhancing Domain Adaptation through Prompt Gradient Alignment [16.618313165111793]
We develop a line of work based on prompt learning to learn both domain-invariant and domain-specific features.
We cast UDA as a multiple-objective optimization problem in which each objective is represented by a domain loss.
Our method consistently surpasses other prompt-based baselines by a large margin on different UDA benchmarks.
arXiv Detail & Related papers (2024-06-13T17:40:15Z)
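To make the multi-objective formulation above concrete: with one loss per domain on a shared prompt, conflicting per-domain gradients can be reconciled before the update. The following is a PCGrad-style sketch under that assumption; the authors' exact alignment rule may differ.

```python
import torch

# Hedged sketch of gradient alignment between two domain losses on a shared
# prompt parameter. This follows the generic "align per-domain gradients"
# idea; it is not the paper's exact update rule.

def aligned_grad(g_src: torch.Tensor, g_tgt: torch.Tensor) -> torch.Tensor:
    """Combine two gradients; if they conflict (negative dot product),
    project the target gradient onto the normal plane of the source one
    (PCGrad-style) before summing."""
    dot = torch.dot(g_src.flatten(), g_tgt.flatten())
    if dot < 0:
        g_tgt = g_tgt - dot / g_src.norm() ** 2 * g_src
    return g_src + g_tgt

# Toy usage: a shared "prompt" vector optimized against two objectives.
prompt = torch.randn(16, requires_grad=True)
loss_src = (prompt - 1.0).pow(2).sum()   # stand-in source-domain loss
loss_tgt = (prompt + 1.0).pow(2).sum()   # stand-in target-domain loss
g_src = torch.autograd.grad(loss_src, prompt, retain_graph=True)[0]
g_tgt = torch.autograd.grad(loss_tgt, prompt)[0]
with torch.no_grad():
    prompt -= 0.01 * aligned_grad(g_src, g_tgt)
```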
- NormAUG: Normalization-guided Augmentation for Domain Generalization [60.159546669021346]
We propose a simple yet effective method called NormAUG (Normalization-guided Augmentation) for deep learning.
Our method introduces diverse information at the feature level and improves the generalization of the main path.
In the test stage, we leverage an ensemble strategy to combine the predictions from the auxiliary path of our model, further boosting performance.
arXiv Detail & Related papers (2023-07-25T13:35:45Z)
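A minimal sketch of the main-path/auxiliary-path idea described above, assuming the auxiliary branches differ only in their normalization statistics; the layer shapes and ensembling rule are illustrative, not the official NormAUG code.

```python
import random
import torch
import torch.nn as nn

# Illustrative NormAUG-style block: features pass through a main
# normalization branch and, during training, also through a randomly chosen
# auxiliary normalization branch, injecting diversity at the feature level.

class NormAugBlock(nn.Module):
    def __init__(self, channels: int, n_aux: int = 3):
        super().__init__()
        self.main_norm = nn.BatchNorm1d(channels)
        self.aux_norms = nn.ModuleList(
            nn.BatchNorm1d(channels) for _ in range(n_aux))

    def forward(self, x: torch.Tensor):
        main = self.main_norm(x)
        if self.training:
            aux = random.choice(self.aux_norms)(x)  # feature-level augmentation
            return main, aux
        # Test time: ensemble the auxiliary branches with the main path.
        aux_mean = torch.stack([n(x) for n in self.aux_norms]).mean(dim=0)
        return (main + aux_mean) / 2

block = NormAugBlock(channels=32)
feats = torch.randn(8, 32)
main_out, aux_out = block(feats)   # training mode (the nn.Module default)
block.eval()
test_out = block(feats)            # ensembled output at test time
```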
- Domain Adaptation from Scratch [24.612696638386623]
We present a new learning setup, "domain adaptation from scratch", which we believe to be crucial for extending the reach of NLP to sensitive domains.
In this setup, we aim to efficiently annotate data from a set of source domains such that the trained model performs well on a sensitive target domain.
Our study compares several approaches for this challenging setup, ranging from data selection and domain adaptation algorithms to active learning paradigms.
arXiv Detail & Related papers (2022-09-02T05:55:09Z)
- From Big to Small: Adaptive Learning to Partial-Set Domains [94.92635970450578]
Domain adaptation aims at knowledge acquisition and dissemination from a labeled source domain to an unlabeled target domain under distribution shift.
Recent advances show that large-scale deep pre-trained models provide rich knowledge for tackling diverse small-scale downstream tasks.
This paper introduces Partial Domain Adaptation (PDA), a learning paradigm that relaxes the identical-class-space assumption, requiring only that the source class space subsume the target class space.
arXiv Detail & Related papers (2022-03-14T07:02:45Z)
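This entry defines a paradigm rather than a single algorithm. A common way to exploit the subsumption assumption, shown below purely for illustration, is to re-weight the source classification loss by the class distribution predicted on unlabeled target data so that source-only classes fade out; this is not necessarily the mechanism of the paper above.

```python
import torch
import torch.nn.functional as F

# Illustrative PDA-style class re-weighting (an assumption, not this paper's
# method): average the model's softmax predictions over target samples and
# use them as per-class weights on the source loss, suppressing classes that
# appear only in the source domain.

def pda_class_weights(target_logits: torch.Tensor) -> torch.Tensor:
    """Average softmax prediction over target samples, normalized so the
    largest class weight is 1. Source-only classes get weights near 0."""
    probs = F.softmax(target_logits, dim=1).mean(dim=0)   # (n_classes,)
    return probs / probs.max()

# Toy usage: 10 source classes, of which the target covers only a subset.
target_logits = torch.randn(64, 10)
source_logits = torch.randn(32, 10)
source_labels = torch.randint(0, 10, (32,))
w = pda_class_weights(target_logits)
loss = F.cross_entropy(source_logits, source_labels, weight=w)
```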
- Domain Adaptation via Prompt Learning [39.97105851723885]
Unsupervised domain adaptation (UDA) aims to adapt models learned from a well-annotated source domain to a target domain.
We introduce a novel prompt learning paradigm for UDA, named Domain Adaptation via Prompt Learning (DAPL).
arXiv Detail & Related papers (2022-02-14T13:25:46Z)
- Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation [74.71931918541748]
We propose an instance-affinity-based criterion for source-to-target transfer during adaptation, called ILA-DA.
We first propose a reliable and efficient method to extract similar and dissimilar samples across source and target, and utilize a multi-sample contrastive loss to drive the domain alignment process.
We verify the effectiveness of ILA-DA by observing consistent improvements in accuracy over popular domain adaptation approaches on a variety of benchmark datasets.
arXiv Detail & Related papers (2021-04-03T01:33:14Z)
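A hedged sketch of what the multi-sample contrastive loss described above could look like, using pseudo-labels as the similar/dissimilar signal; the actual ILA-DA affinity criterion is more elaborate than this.

```python
import torch
import torch.nn.functional as F

# Cross-domain multi-sample contrastive loss in the spirit of ILA-DA
# (details are assumptions): pull together source/target pairs judged
# similar and push apart dissimilar ones.

def cross_domain_contrastive(src_feats, tgt_feats, similar, tau=0.1):
    """src_feats: (S, d), tgt_feats: (T, d)
    similar: (S, T) boolean matrix marking pairs believed to share a class
    (e.g. from pseudo-labels). Returns an InfoNCE-style loss."""
    src = F.normalize(src_feats, dim=1)
    tgt = F.normalize(tgt_feats, dim=1)
    sims = src @ tgt.t() / tau                     # (S, T) scaled cosine sims
    log_prob = F.log_softmax(sims, dim=1)
    # Average log-probability of the similar targets for each source sample.
    pos_counts = similar.sum(dim=1).clamp(min=1)
    loss = -(log_prob * similar.float()).sum(dim=1) / pos_counts
    # Only count source samples that have at least one similar target.
    has_pos = similar.any(dim=1)
    return loss[has_pos].mean()

# Toy usage with pseudo-labels as the similarity signal.
src_f, tgt_f = torch.randn(16, 128), torch.randn(24, 128)
src_y = torch.randint(0, 4, (16,))
tgt_pseudo = torch.randint(0, 4, (24,))
similar = src_y.unsqueeze(1) == tgt_pseudo.unsqueeze(0)
print(cross_domain_contrastive(src_f, tgt_f, similar))
```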
- Curriculum CycleGAN for Textual Sentiment Domain Adaptation with Multiple Sources [68.31273535702256]
We propose a novel instance-level multi-source domain adaptation (MDA) framework, named curriculum cycle-consistent generative adversarial network (C-CycleGAN).
C-CycleGAN consists of three components: (1) a pre-trained text encoder which encodes textual input from different domains into a continuous representation space, (2) an intermediate domain generator with curriculum instance-level adaptation which bridges the gap across source and target domains, and (3) a task classifier trained on the intermediate domain for final sentiment classification.
We conduct extensive experiments on three benchmark datasets and achieve substantial gains over state-of-the-art DA approaches.
arXiv Detail & Related papers (2020-11-17T14:50:55Z)
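The three components listed above suggest a simple pipeline shape. The skeleton below only mirrors that structure with placeholder layers; it is not the paper's architecture.

```python
import torch
import torch.nn as nn

# Structural sketch of the three C-CycleGAN components (shapes and layers
# are illustrative placeholders, not the paper's).

class TextEncoder(nn.Module):               # (1) pre-trained encoder stand-in
    def __init__(self, vocab=10000, dim=256):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab, dim)
    def forward(self, tokens):
        return self.emb(tokens)

class IntermediateGenerator(nn.Module):     # (2) maps source reps toward target
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))
    def forward(self, z):
        return self.net(z)

class SentimentClassifier(nn.Module):       # (3) trained on the intermediate domain
    def __init__(self, dim=256, n_classes=2):
        super().__init__()
        self.head = nn.Linear(dim, n_classes)
    def forward(self, z):
        return self.head(z)

enc, gen, clf = TextEncoder(), IntermediateGenerator(), SentimentClassifier()
src_tokens = torch.randint(0, 10000, (4, 50))   # toy batch of token ids
logits = clf(gen(enc(src_tokens)))              # source -> intermediate -> label
```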
- Knowledge Distillation for BERT Unsupervised Domain Adaptation [2.969705152497174]
A pre-trained language model, BERT, has brought significant performance improvements across a range of natural language processing tasks.
We propose a simple but effective unsupervised domain adaptation method, adversarial adaptation with distillation (AAD).
We evaluate our approach in the task of cross-domain sentiment classification on 30 domain pairs.
arXiv Detail & Related papers (2020-10-22T06:51:24Z)
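A minimal sketch of combining distillation with adversarial adaptation, assuming a fixed source-trained teacher, a student model, and a domain discriminator; the layer sizes, temperature, and loss weighting are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# AAD-style sketch: the student is trained so that (1) its predictions on
# source data stay close to a fixed teacher's (distillation) and (2) a
# discriminator cannot tell its target outputs from source-like ones.

dim, n_classes, T = 128, 2, 2.0
teacher = nn.Linear(dim, n_classes)          # stand-in for a fine-tuned BERT
student = nn.Linear(dim, n_classes)
discriminator = nn.Sequential(nn.Linear(n_classes, 64), nn.ReLU(),
                              nn.Linear(64, 1))

src, tgt = torch.randn(16, dim), torch.randn(16, dim)

# (1) Distillation: soften teacher and student outputs with temperature T.
with torch.no_grad():
    t_logits = teacher(src)
s_logits = student(src)
kd_loss = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                   F.softmax(t_logits / T, dim=1),
                   reduction="batchmean") * T * T

# (2) Adversarial: the student tries to make target outputs look "source"
# (the discriminator sees classifier outputs here for simplicity).
d_on_target = discriminator(student(tgt))
adv_loss = F.binary_cross_entropy_with_logits(d_on_target,
                                              torch.ones_like(d_on_target))

total = kd_loss + adv_loss  # the discriminator has its own opposing update
```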
- Adaptively-Accumulated Knowledge Transfer for Partial Domain Adaptation [66.74638960925854]
Partial domain adaptation (PDA) deals with a realistic and challenging problem in which the source domain label space subsumes the target domain label space.
We propose an Adaptively-Accumulated Knowledge Transfer framework (A²KT) to align the relevant categories across two domains.
arXiv Detail & Related papers (2020-08-27T00:53:43Z)
- Domain Adaptation for Semantic Parsing [68.81787666086554]
We propose a novel semantic parser for domain adaptation, where we have much fewer annotated data in the target domain compared to the source domain.
Our semantic parser benefits from a two-stage coarse-to-fine framework, and thus can provide different and accurate treatments for the two stages.
Experiments on a benchmark dataset show that our method consistently outperforms several popular domain adaptation strategies.
arXiv Detail & Related papers (2020-06-23T14:47:41Z)
- Learning to adapt class-specific features across domains for semantic segmentation [36.36210909649728]
In this thesis, we present a novel architecture which learns to adapt features across domains by taking into account per-class information.
We adopt the recently introduced StarGAN architecture as our image-translation backbone, since it is able to perform translations across multiple domains by means of a single generator network.
arXiv Detail & Related papers (2020-01-22T23:51:30Z)