Domain-oriented Language Pre-training with Adaptive Hybrid Masking and
Optimal Transport Alignment
- URL: http://arxiv.org/abs/2112.03024v1
- Date: Wed, 1 Dec 2021 15:47:01 GMT
- Title: Domain-oriented Language Pre-training with Adaptive Hybrid Masking and
Optimal Transport Alignment
- Authors: Denghui Zhang, Zixuan Yuan, Yanchi Liu, Hao Liu, Fuzhen Zhuang, Hui
Xiong, Haifeng Chen
- Abstract summary: We provide a general domain-oriented approach to adapt pre-trained language models for different application domains.
To preserve phrase knowledge effectively, we build a domain phrase pool as an auxiliary training tool.
We introduce Cross Entity Alignment to leverage entity association as weak supervision to augment the semantic learning of pre-trained models.
- Score: 43.874781718934486
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Motivated by the success of pre-trained language models such as BERT in a
broad range of natural language processing (NLP) tasks, recent research efforts
have been made to adapt these models to different application domains. Along
this line, existing domain-oriented models have primarily followed the vanilla
BERT architecture and made straightforward use of the domain corpus. However,
domain-oriented tasks usually require an accurate understanding of domain
phrases, and such fine-grained phrase-level knowledge is hard to capture with
existing pre-training schemes. Moreover, the semantic learning of pre-trained
models, which is guided by word co-occurrences, can be greatly augmented by
entity-level association knowledge, but doing so risks introducing noise due to
the lack of ground-truth word-level alignment. To address these issues, we
provide a generalized domain-oriented approach that leverages auxiliary domain
knowledge to improve the existing pre-training framework in two respects.
First, to preserve phrase knowledge effectively, we build a domain phrase pool
as an auxiliary training tool and introduce the Adaptive Hybrid Masked Model to
incorporate this knowledge; it integrates two learning modes, word learning and
phrase learning, and allows the model to switch between them. Second, we
introduce Cross Entity Alignment, which leverages entity association as weak
supervision to augment the semantic learning of pre-trained models. To
alleviate the potential noise in this process, we introduce an interpretable
Optimal Transport based approach to guide alignment learning. Experiments on
four domain-oriented tasks demonstrate the superiority of our framework.
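To make the two components concrete, below is a minimal Python sketch of the hybrid masking idea. The phrase pool is assumed to be a set of token tuples, and the boolean `phrase_mode` flag stands in for the paper's adaptive mode selection, whose exact switching rule is not given in the abstract; names such as `adaptive_hybrid_mask` are illustrative, not from the paper.

```python
import random

MASK = "[MASK]"

def adaptive_hybrid_mask(tokens, phrase_pool, phrase_mode, mask_rate=0.15, max_len=5):
    """Sketch of hybrid masking: phrase-level spans vs. word-level tokens."""
    tokens = list(tokens)
    if phrase_mode:
        # Phrase learning: greedily match domain phrases (longest first) and
        # mask each matched span as a whole.
        i = 0
        while i < len(tokens):
            span = next((n for n in range(min(max_len, len(tokens) - i), 0, -1)
                         if tuple(tokens[i:i + n]) in phrase_pool), 0)
            if span:
                tokens[i:i + span] = [MASK] * span
                i += span
            else:
                i += 1
    else:
        # Word learning: standard BERT-style random token masking.
        tokens = [MASK if random.random() < mask_rate else t for t in tokens]
    return tokens

# Example: masks the whole domain phrase "deep learning" in phrase mode.
pool = {("deep", "learning")}
print(adaptive_hybrid_mask(["we", "study", "deep", "learning"], pool, phrase_mode=True))
```

For the Optimal Transport guided alignment, a standard entropy-regularized Sinkhorn solver can produce a soft word-level alignment plan between the token embeddings of two associated entities; the cosine-distance cost and the loss shown here are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Entropy-regularized OT with uniform marginals; returns the transport plan."""
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-cost / eps)                      # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]           # P = diag(u) K diag(v)

def ot_alignment_loss(x, y):
    """x, y: (n, d) and (m, d) token embeddings of two associated entities."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    cost = 1.0 - x @ y.T                         # pairwise cosine distance
    plan = sinkhorn_plan(cost)
    return float((plan * cost).sum())            # OT distance used as a loss signal
```

A sharply matched pair of entity descriptions yields a concentrated plan and a small loss, while a poorly matched pair yields a diffuse plan and a large loss, which is roughly how a transport plan can serve as an interpretable weighting of the weak entity-level supervision.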
Related papers
- Adapt in Contexts: Retrieval-Augmented Domain Adaptation via In-Context
Learning [48.22913073217633]
Large language models (LLMs) have showcased their capability for few-shot inference, known as in-context learning.
In this paper, we study the UDA problem under an in-context learning setting to adapt language models from the source domain to the target domain without any target labels.
We devise different prompting and training strategies, accounting for different LM architectures to learn the target distribution via language modeling.
arXiv Detail & Related papers (2023-11-20T06:06:20Z)
- Adapting a Language Model While Preserving its General Knowledge [22.083108548675494]
Domain-adaptive pre-training (or DA-training for short) aims to train a pre-trained general-purpose language model (LM) using an unlabeled corpus of a particular domain to adapt the LM.
Existing DA-training methods are in some sense blind as they do not explicitly identify what knowledge in the LM should be preserved and what should be changed by the domain corpus.
This paper shows that the existing methods are suboptimal and proposes a novel method to perform a more informed adaptation of the knowledge in the LM.
arXiv Detail & Related papers (2023-01-21T17:57:53Z)
- Prior Knowledge Guided Unsupervised Domain Adaptation [82.9977759320565]
We propose a Knowledge-guided Unsupervised Domain Adaptation (KUDA) setting where prior knowledge about the target class distribution is available.
In particular, we consider two specific types of prior knowledge about the class distribution in the target domain: Unary Bound and Binary Relationship.
We propose a rectification module that uses such prior knowledge to refine model generated pseudo labels.
arXiv Detail & Related papers (2022-07-18T18:41:36Z)
- Domain Adapting Speech Emotion Recognition modals to real-world scenario with Deep Reinforcement Learning [5.40755576668989]
Domain adaptation allows us to transfer knowledge learnt by a model across domains after a phase of training.
We present a deep reinforcement learning-based strategy for adapting a pre-trained model to a newer domain.
arXiv Detail & Related papers (2022-07-07T02:53:39Z)
- KALA: Knowledge-Augmented Language Model Adaptation [65.92457495576141]
We propose a novel domain adaptation framework for pre-trained language models (PLMs).
Knowledge-Augmented Language model Adaptation (KALA) modulates the intermediate hidden representations of PLMs with domain knowledge.
Results show that, despite being computationally efficient, our KALA largely outperforms adaptive pre-training.
arXiv Detail & Related papers (2022-04-22T08:11:59Z)
- Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis [96.53859361560505]
Aspect-based Sentiment Analysis (ABSA) aims to determine the sentiment polarity towards an aspect.
There always exists severe domain shift between the pretraining and downstream ABSA datasets.
We introduce a unified alignment pretraining framework into the vanilla pretrain-finetune pipeline.
arXiv Detail & Related papers (2021-10-26T04:03:45Z)
- Unsupervised Domain Adaptation for Semantic Segmentation via Low-level Edge Information Transfer [27.64947077788111]
Unsupervised domain adaptation for semantic segmentation aims to make models trained on synthetic data adapt to real images.
Previous feature-level adversarial learning methods only consider adapting models on the high-level semantic features.
We present the first attempt at explicitly using low-level edge information, which has a small inter-domain gap, to guide the transfer of semantic information.
arXiv Detail & Related papers (2021-09-18T11:51:31Z)
- Neural Supervised Domain Adaptation by Augmenting Pre-trained Models with Random Units [14.183224769428843]
Neural Transfer Learning (TL) is becoming ubiquitous in Natural Language Processing (NLP).
In this paper, we show through interpretation methods that such a scheme, despite its efficiency, suffers from a major limitation.
We propose to augment the pre-trained model with normalised, weighted and randomly initialised units that foster a better adaptation while maintaining the valuable source knowledge.
arXiv Detail & Related papers (2021-06-09T09:29:11Z)
- Contrastive Learning and Self-Training for Unsupervised Domain Adaptation in Semantic Segmentation [71.77083272602525]
UDA attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain.
We propose a contrastive learning approach that adapts category-wise centroids across domains.
We extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels.
arXiv Detail & Related papers (2021-05-05T11:55:53Z)