TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained
Language Models
- URL: http://arxiv.org/abs/2206.07841v1
- Date: Wed, 15 Jun 2022 22:49:14 GMT
- Title: TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained
Language Models
- Authors: Ali Davody, David Ifeoluwa Adelani, Thomas Kleinbauer, Dietrich Klakow
- Abstract summary: We propose a novel few-shot approach to domain adaptation in the context of Named Entity Recognition (NER).
We propose a two-step approach consisting of a variable base module and a template module that leverages the knowledge captured in pre-trained language models with the help of simple descriptive patterns.
Our approach is simple yet versatile and can be applied in few-shot and zero-shot settings.
- Score: 19.26653302753129
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transferring knowledge from one domain to another is of practical importance
for many tasks in natural language processing, especially when the amount of
available data in the target domain is limited. In this work, we propose a
novel few-shot approach to domain adaptation in the context of Named Entity
Recognition (NER). We propose a two-step approach consisting of a variable base
module and a template module that leverages the knowledge captured in
pre-trained language models with the help of simple descriptive patterns. Our
approach is simple yet versatile and can be applied in few-shot and zero-shot
settings. Evaluating our lightweight approach across a number of different
datasets shows that it can boost the performance of state-of-the-art baselines
by 2-5% F1-score.
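The template idea behind the title can be sketched as a cloze query: a candidate token is inserted into a descriptive pattern such as "TOKEN is a [MASK]", a masked language model fills the blank, and the filler word is mapped to an entity type. The sketch below is illustrative only; `predict_mask_filler` is a stand-in for a real masked-LM call and the toy label map is an assumption, not the paper's actual pattern set.

```python
# Hypothetical filler-word -> entity-type map (illustrative, not from the paper).
FILLER_TO_TYPE = {
    "city": "LOC", "country": "LOC",
    "person": "PER", "name": "PER",
    "company": "ORG", "organization": "ORG",
}

def predict_mask_filler(cloze: str) -> str:
    """Stub for a pre-trained masked LM. A real system would return the
    highest-probability word for [MASK]; here we use a toy lookup."""
    toy_lm = {
        "Paris is a [MASK]": "city",
        "Google is a [MASK]": "company",
        "Alice is a [MASK]": "person",
    }
    return toy_lm.get(cloze, "thing")

def classify_token(token: str) -> str:
    # Build the descriptive pattern, query the (stubbed) LM, map the filler.
    cloze = f"{token} is a [MASK]"
    filler = predict_mask_filler(cloze)
    return FILLER_TO_TYPE.get(filler, "O")  # "O" = not an entity
```

In a zero-shot setting no labeled target-domain examples are needed at all: the pattern and the label map alone connect the LM's lexical knowledge to the entity classes.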
Related papers
- Learning to Generalize Unseen Domains via Multi-Source Meta Learning for Text Classification [71.08024880298613]
We study multi-source domain generalization for text classification.
We propose a framework to use multiple seen domains to train a model that can achieve high accuracy in an unseen domain.
arXiv Detail & Related papers (2024-09-20T07:46:21Z)
- CLLMFS: A Contrastive Learning enhanced Large Language Model Framework for Few-Shot Named Entity Recognition [3.695767900907561]
CLLMFS is a Contrastive Learning enhanced Large Language Model framework for Few-Shot Named Entity Recognition.
It integrates Low-Rank Adaptation (LoRA) and contrastive learning mechanisms specifically tailored for few-shot NER.
Our method achieves state-of-the-art results, with F1-score improvements ranging from 2.58% to 97.74% over existing best-performing methods.
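The contrastive component can be illustrated with a supervised contrastive objective over token embeddings, where tokens sharing an entity label are treated as positives. This is a generic sketch of the mechanism, not CLLMFS's exact loss, and the toy embeddings are assumptions:

```python
import math

def cosine(u, v):
    # Cosine similarity between two plain-Python vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def supervised_contrastive_loss(embs, labels, temperature=0.1):
    """Average over anchors of -log(sum_pos e^(sim/t) / sum_all e^(sim/t)).
    Lower loss means same-label embeddings sit closer together."""
    n = len(embs)
    total, count = 0.0, 0
    for i in range(n):
        sims = [math.exp(cosine(embs[i], embs[j]) / temperature)
                for j in range(n) if j != i]
        pos = [math.exp(cosine(embs[i], embs[j]) / temperature)
               for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue  # anchor has no positive pair in the batch
        total += -math.log(sum(pos) / sum(sims))
        count += 1
    return total / count
```

Minimizing this term during LoRA fine-tuning would encourage embeddings of same-type entity mentions to cluster, which is what makes the representation useful with only a few shots.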
arXiv Detail & Related papers (2024-08-23T04:44:05Z)
- Unified Language-driven Zero-shot Domain Adaptation [55.64088594551629]
Unified Language-driven Zero-shot Domain Adaptation (ULDA) is a novel task setting.
It enables a single model to adapt to diverse target domains without explicit domain-ID knowledge.
arXiv Detail & Related papers (2024-04-10T16:44:11Z)
- MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer [50.40191599304911]
We introduce MoSECroT (Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer).
In this paper, we present the first framework that leverages relative representations to construct a common space for the embeddings of a source language PLM and the static word embeddings of a target language.
We show that although our proposed framework is competitive with weak baselines when addressing MoSECroT, it fails to achieve competitive results compared with some strong baselines.
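The relative-representation idea the framework builds on can be shown in a few lines: instead of comparing raw vectors from two independently trained embedding spaces, each vector is re-encoded as its similarities to a shared set of anchor points, making the spaces comparable. The anchors and vectors below are toy data, not MoSECroT's:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def relative_representation(vec, anchors):
    # Re-encode `vec` as its cosine similarities to the anchor set.
    return [cosine(vec, a) for a in anchors]
```

A useful property: if one space is a rotation of the other and the anchors are rotated with it, the relative representation is unchanged, which is why parallel anchor words can bridge a source-language PLM and target-language static embeddings.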
arXiv Detail & Related papers (2024-01-09T21:09:07Z)
- VarMAE: Pre-training of Variational Masked Autoencoder for Domain-adaptive Language Understanding [5.1282202633907]
We propose a novel Transformer-based language model named VarMAE for domain-adaptive language understanding.
Under the masked autoencoding objective, we design a context uncertainty learning module to encode the token's context into a smooth latent distribution.
Experiments on science- and finance-domain NLU tasks demonstrate that VarMAE can be efficiently adapted to new domains with limited resources.
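Encoding a context into a smooth latent distribution typically means mapping it to a Gaussian and sampling with the reparameterization trick, regularized by a KL term toward a standard normal. The following is an assumption about the mechanics of such a module, not VarMAE's code:

```python
import math
import random

def sample_latent(mu, log_var, rng):
    # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1).
    return [m + math.exp(0.5 * lv) * rng.gauss(0, 1)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    # KL( N(mu, sigma^2) || N(0, 1) ) for a diagonal Gaussian.
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))
```

The KL term is zero exactly when the latent matches the prior, so it pulls the per-token context distributions toward a smooth, shared space, which is what supports adaptation with limited target-domain data.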
arXiv Detail & Related papers (2022-11-01T12:51:51Z)
- Effective Transfer Learning for Low-Resource Natural Language Understanding [15.752309656576129]
We focus on developing cross-lingual and cross-domain methods to tackle the low-resource issues.
First, we propose to improve the model's cross-lingual ability by focusing on the task-related keywords.
Second, we present Order-Reduced Modeling methods for the cross-lingual adaptation.
Third, we propose to leverage different levels of domain-related corpora and additional masking of data in the pre-training for the cross-domain adaptation.
arXiv Detail & Related papers (2022-08-19T06:59:00Z)
- CLIN-X: pre-trained language models and a study on cross-task transfer for concept extraction in the clinical domain [22.846469609263416]
We introduce the pre-trained CLIN-X (Clinical XLM-R) language models and show how CLIN-X outperforms other pre-trained transformer models.
Our studies reveal stable model performance despite a lack of annotated data, with improvements of up to 47 F1 points when only 250 labeled sentences are available.
Our results highlight the importance of specialized language models such as CLIN-X for concept extraction in non-standard domains.
arXiv Detail & Related papers (2021-12-16T10:07:39Z)
- Learning from Language Description: Low-shot Named Entity Recognition via Decomposed Framework [23.501276952950366]
We propose a novel NER framework, namely SpanNER, which learns from natural language supervision and enables the identification of never-seen entity classes.
We perform extensive experiments on 5 benchmark datasets and evaluate the proposed method in the few-shot learning, domain transfer and zero-shot learning settings.
The experimental results show that the proposed method can bring 10%, 23% and 26% improvements in average over the best baselines in few-shot learning, domain transfer and zero-shot learning settings respectively.
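A decomposed, span-based pipeline in this spirit has two steps: enumerate candidate spans, then match each span against natural-language class descriptions. The sketch below uses a hypothetical similarity lookup in place of the learned span and description encoders; names and thresholds are illustrative:

```python
def enumerate_spans(tokens, max_len=3):
    # All (start, end) spans up to `max_len` tokens, end exclusive.
    n = len(tokens)
    return [(i, j) for i in range(n)
            for j in range(i + 1, min(i + max_len, n) + 1)]

def classify_span(span_text, class_scores, threshold=0.5):
    """class_scores: hypothetical span -> {class: similarity} lookup,
    standing in for a learned matcher over class descriptions."""
    best = max(class_scores.get(span_text, {"O": 1.0}).items(),
               key=lambda kv: kv[1])
    return best[0] if best[1] >= threshold else "O"
```

Because classes are identified by their descriptions rather than by fixed output indices, never-seen entity classes can be added at inference time by supplying a new description, which is what enables the zero-shot setting.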
arXiv Detail & Related papers (2021-09-11T19:52:09Z)
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)
- Domain Adaptation for Semantic Parsing [68.81787666086554]
We propose a novel semantic parser for domain adaptation, where we have far less annotated data in the target domain than in the source domain.
Our parser benefits from a two-stage coarse-to-fine framework, and can thus provide different and accurate treatments for the two stages.
Experiments on a benchmark dataset show that our method consistently outperforms several popular domain adaptation strategies.
arXiv Detail & Related papers (2020-06-23T14:47:41Z)
- Unsupervised Domain Clusters in Pretrained Language Models [61.832234606157286]
We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision.
We propose domain data selection methods based on such models.
We evaluate our data selection methods for neural machine translation across five diverse domains.
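One simple data-selection scheme consistent with this finding is to embed all candidate sentences, then rank them by distance to the centroid of a few target-domain examples. The toy vectors below stand in for PLM sentence representations, and the procedure is a generic sketch, not the paper's exact method:

```python
import math

def centroid(vectors):
    # Component-wise mean of a list of equal-length vectors.
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def select_nearest(candidates, target_examples, k=2):
    """Return indices of the k candidates closest (Euclidean) to the
    centroid of the target-domain example embeddings."""
    c = centroid(target_examples)
    dist = lambda v: math.sqrt(sum((a - b) ** 2 for a, b in zip(v, c)))
    return sorted(range(len(candidates)), key=lambda i: dist(candidates[i]))[:k]
```

Selecting training data this way exploits the implicit domain clusters directly: sentences that fall in the target domain's cluster are preferred without any domain labels.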
arXiv Detail & Related papers (2020-04-05T06:22:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.