Semi-Supervised Disentangled Framework for Transferable Named Entity
Recognition
- URL: http://arxiv.org/abs/2012.11805v1
- Date: Tue, 22 Dec 2020 02:55:04 GMT
- Title: Semi-Supervised Disentangled Framework for Transferable Named Entity
Recognition
- Authors: Zhifeng Hao, Di Lv, Zijian Li, Ruichu Cai, Wen Wen, Boyan Xu
- Abstract summary: We present a semi-supervised framework for transferable NER, which disentangles the domain-invariant latent variables and domain-specific latent variables.
Our model can obtain state-of-the-art performance with cross-domain and cross-lingual NER benchmark data sets.
- Score: 27.472171967604602
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Named entity recognition (NER) for identifying proper nouns in unstructured
text is one of the most important and fundamental tasks in natural language
processing. However, despite the widespread use of NER models, they still
require a large-scale labeled data set, which incurs a heavy burden due to
manual annotation. Domain adaptation is one of the most promising solutions to
this problem, where rich labeled data from a relevant source domain are
utilized to strengthen the generalizability of a model on the target domain.
However, mainstream cross-domain NER models are still affected by the
following two challenges: (1) extracting domain-invariant information, such as
syntactic information, for cross-domain transfer, and (2) integrating
domain-specific information, such as semantic information, into the model to
improve NER performance. In this study, we present a semi-supervised
framework for transferable NER, which disentangles the domain-invariant latent
variables and domain-specific latent variables. In the proposed framework, the
domain-specific information is integrated with the domain-specific latent
variables by using a domain predictor. The domain-specific and domain-invariant
latent variables are disentangled using three mutual information regularization
terms, i.e., maximizing the mutual information between the domain-specific
latent variables and the original embedding, maximizing the mutual information
between the domain-invariant latent variables and the original embedding, and
minimizing the mutual information between the domain-specific and
domain-invariant latent variables. Extensive experiments demonstrated that our
model can obtain state-of-the-art performance with cross-domain and
cross-lingual NER benchmark data sets.
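To make the objective concrete, the sketch below assembles the training loss implied by the abstract: a domain predictor applied to the domain-specific latent variables, plus the three mutual information (MI) regularization terms. The encoder layout, the InfoNCE lower bound used for the two maximization terms, the CLUB-style upper bound used for the minimization term, and all names and hyperparameters are illustrative assumptions, not the authors' released code.
```python
# Minimal PyTorch sketch of the disentanglement objective described in the
# abstract. Estimator choices and sizes are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Disentangler(nn.Module):
    def __init__(self, emb_dim=768, lat_dim=128, n_domains=2):
        super().__init__()
        self.enc_spec = nn.Linear(emb_dim, lat_dim)    # domain-specific z_s
        self.enc_inv = nn.Linear(emb_dim, lat_dim)     # domain-invariant z_i
        self.dom_pred = nn.Linear(lat_dim, n_domains)  # domain predictor on z_s
        # bilinear critics for the InfoNCE lower bounds on I(z_s; x), I(z_i; x)
        self.crit_spec = nn.Linear(lat_dim, emb_dim, bias=False)
        self.crit_inv = nn.Linear(lat_dim, emb_dim, bias=False)
        # variational net for a CLUB-style upper bound on I(z_s; z_i)
        self.club = nn.Linear(lat_dim, lat_dim)

def infonce(critic, z, x):
    """InfoNCE lower bound on I(z; x); raising it raises the MI estimate."""
    scores = critic(z) @ x.t()                        # (B, B) pairwise scores
    labels = torch.arange(x.size(0), device=x.device) # diagonal = true pairs
    return -F.cross_entropy(scores, labels)           # higher = more MI

def club_upper(club, z_s, z_i):
    """Simplified CLUB-style upper bound on I(z_s; z_i); minimize it."""
    mu = club(z_s)
    pos = -((z_i - mu) ** 2).mean()                   # matched pairs
    perm = torch.randperm(z_i.size(0), device=z_i.device)
    neg = -((z_i[perm] - mu) ** 2).mean()             # shuffled pairs
    return pos - neg

def disentangle_loss(model, x, dom_labels, lam=0.1):
    z_s, z_i = model.enc_spec(x), model.enc_inv(x)
    l_dom = F.cross_entropy(model.dom_pred(z_s), dom_labels)
    l_mi = (-infonce(model.crit_spec, z_s, x)    # maximize I(z_s; x)
            - infonce(model.crit_inv, z_i, x)    # maximize I(z_i; x)
            + club_upper(model.club, z_s, z_i))  # minimize I(z_s; z_i)
    return l_dom + lam * l_mi
```
In practice the critics and the CLUB network would be trained jointly with the encoders and the downstream NER tagger; lam trades the task loss against the strength of the disentanglement.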
Related papers
- Compound Domain Generalization via Meta-Knowledge Encoding [55.22920476224671]
We introduce Style-induced Domain-specific Normalization (SDNorm) to re-normalize the multi-modal underlying distributions.
We harness the prototype representations, the centroids of classes, to perform relational modeling in the embedding space.
Experiments on four standard Domain Generalization benchmarks reveal that COMEN exceeds the state-of-the-art performance without the need of domain supervision.
arXiv Detail & Related papers (2022-03-24T11:54:59Z)
- TAL: Two-stream Adaptive Learning for Generalizable Person Re-identification [115.31432027711202]
We argue that both domain-specific and domain-invariant features are crucial for improving the generalization ability of re-id models.
We propose two-stream adaptive learning (TAL) to simultaneously model these two kinds of information.
Our framework can be applied to both single-source and multi-source domain generalization tasks.
arXiv Detail & Related papers (2021-11-29T01:27:42Z)
- Exploiting Both Domain-specific and Invariant Knowledge via a Win-win Transformer for Unsupervised Domain Adaptation [14.623272346517794]
Unsupervised Domain Adaptation (UDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain.
Most existing UDA approaches enable knowledge transfer by learning a domain-invariant representation and sharing one classifier across the two domains.
We propose a Win-Win TRansformer framework (WinTR) that separately explores the domain-specific knowledge for each domain and interchanges cross-domain knowledge.
arXiv Detail & Related papers (2021-11-25T06:45:07Z)
- Exploiting Domain-Specific Features to Enhance Domain Generalization [10.774902700296249]
Domain Generalization (DG) aims to train a model, from multiple observed source domains, in order to perform well on unseen target domains.
Prior DG approaches have focused on extracting domain-invariant information across sources to generalize on target domains.
We propose meta-Domain Specific-Domain Invariant (mDSDI), a novel, theoretically sound framework.
arXiv Detail & Related papers (2021-10-18T15:42:39Z)
- Self-Adversarial Disentangling for Specific Domain Adaptation [52.1935168534351]
Domain adaptation aims to bridge the domain shifts between the source and target domains.
Recent methods typically do not consider explicit prior knowledge on a specific dimension.
arXiv Detail & Related papers (2021-08-08T02:36:45Z)
- Adaptive Domain-Specific Normalization for Generalizable Person Re-Identification [81.30327016286009]
We propose a novel adaptive domain-specific normalization approach (AdsNorm) for generalizable person Re-ID.
arXiv Detail & Related papers (2021-05-07T02:54:55Z)
- Learning Disentangled Semantic Representation for Domain Adaptation [39.055191615410244]
We aim to extract the domain invariant semantic information in the latent disentangled semantic representation of the data.
Under the above assumption, we employ a variational auto-encoder to reconstruct the semantic latent variables and domain latent variables.
We devise a dual adversarial network to disentangle these two sets of reconstructed latent variables (see the sketch after this list).
arXiv Detail & Related papers (2020-12-22T03:03:36Z)
- Domain Adaptation for Semantic Parsing [68.81787666086554]
We propose a novel semantic parser for domain adaptation, where we have much fewer annotated data in the target domain compared to the source domain.
Our semantic parser benefits from a two-stage coarse-to-fine framework and can thus provide different and accurate treatments for the two stages.
Experiments on a benchmark dataset show that our method consistently outperforms several popular domain adaptation strategies.
arXiv Detail & Related papers (2020-06-23T14:47:41Z)
- Bi-Directional Generation for Unsupervised Domain Adaptation [61.73001005378002]
Unsupervised domain adaptation facilitates learning in the unlabeled target domain by relying on well-established source domain information.
Conventional methods that forcefully reduce the domain discrepancy in the latent space can destroy the intrinsic structure of the data.
We propose a Bi-Directional Generation domain adaptation model with consistent classifiers interpolating two intermediate domains to bridge source and target domains.
arXiv Detail & Related papers (2020-02-12T09:45:39Z)
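For the "Learning Disentangled Semantic Representation for Domain Adaptation" entry above, the following is a minimal sketch of the summarized recipe: a variational auto-encoder with separate semantic and domain latent variables, disentangled by a pair of adversarial heads. The gradient-reversal training, layer sizes, and all names are illustrative assumptions rather than that paper's released code.
```python
# Minimal PyTorch sketch: VAE with semantic (z_y) and domain (z_d) latents,
# plus dual adversarial heads trained through gradient reversal. Illustrative
# assumptions only, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer, a common way to train adversarial heads."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, g):
        return -g

class DisentangledVAE(nn.Module):
    def __init__(self, x_dim=256, z_dim=32, n_classes=10, n_domains=2):
        super().__init__()
        self.enc = nn.Linear(x_dim, 4 * z_dim)    # mu/logvar for [z_y, z_d]
        self.dec = nn.Linear(2 * z_dim, x_dim)    # reconstruct the input
        self.cls_y = nn.Linear(z_dim, n_classes)  # label head on z_y
        self.cls_d = nn.Linear(z_dim, n_domains)  # domain head on z_d
        # dual adversaries: z_y should not predict domain, z_d not the label
        self.adv_d = nn.Linear(z_dim, n_domains)
        self.adv_y = nn.Linear(z_dim, n_classes)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        z_y, z_d = z.chunk(2, dim=-1)             # semantic / domain latents
        kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).mean()
        recon = F.mse_loss(self.dec(z), x)
        # adversaries see gradient-reversed latents, pushing information
        # about the wrong factor out of each latent during training
        adv_d_logits = self.adv_d(GradReverse.apply(z_y))
        adv_y_logits = self.adv_y(GradReverse.apply(z_d))
        return z_y, z_d, recon + kl, adv_d_logits, adv_y_logits
```
The adversarial cross-entropy losses on the reversed latents would be added to the reconstruction and KL terms, so that z_y keeps only semantic information and z_d only domain information.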
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.