One Model to Recognize Them All: Marginal Distillation from NER Models
with Different Tag Sets
- URL: http://arxiv.org/abs/2004.05140v2
- Date: Fri, 17 Apr 2020 13:55:24 GMT
- Title: One Model to Recognize Them All: Marginal Distillation from NER Models
with Different Tag Sets
- Authors: Keunwoo Peter Yu and Yi Yang
- Abstract summary: Named entity recognition (NER) is a fundamental component in the modern language understanding pipeline.
This paper presents a marginal distillation (MARDI) approach for training a unified NER model from resources with disjoint or heterogeneous tag sets.
- Score: 30.445201832698192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Named entity recognition (NER) is a fundamental component in the modern
language understanding pipeline. Public NER resources such as annotated data
and model services are available in many domains. However, given a particular
downstream application, there is often no single NER resource that supports all
the desired entity types, so users must leverage multiple resources with
different tag sets. This paper presents a marginal distillation (MARDI)
approach for training a unified NER model from resources with disjoint or
heterogeneous tag sets. In contrast to recent works, MARDI merely requires
access to pre-trained models rather than the original training datasets. This
flexibility makes it easier to work with sensitive domains like healthcare and
finance. Furthermore, our approach is general enough to integrate with
different NER architectures, including local models (e.g., BiLSTM) and global
models (e.g., CRF). Experiments on two benchmark datasets show that MARDI
performs on par with a strong marginal CRF baseline, while being more flexible
in the form of required NER resources. MARDI also sets a new state of the art
on the progressive NER task, significantly outperforming the previous
state-of-the-art model.
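To make the idea of marginal distillation concrete, here is a minimal sketch of per-token distillation across tag sets. Everything in it is an illustrative assumption rather than the paper's exact formulation: BIO tags, a single frozen teacher with a smaller tag set, a local (per-token) student over the unified tag set, a hand-built marginalization matrix, and a KL-divergence loss; the global CRF variant and multi-teacher setup described in the abstract are omitted.

```python
# Minimal sketch of marginal distillation for NER with different tag sets.
# Assumptions (not from the paper): BIO tags, per-token (local) teacher and
# student distributions, and a KL-divergence distillation loss.
import torch
import torch.nn.functional as F

def build_marginalization_matrix(student_tags, teacher_tags):
    """M[i, j] = 1 if student tag j collapses onto teacher tag i.

    A student tag maps to itself when the teacher also knows it, and to 'O'
    otherwise (the teacher cannot see that entity type).
    """
    M = torch.zeros(len(teacher_tags), len(student_tags))
    t_index = {t: i for i, t in enumerate(teacher_tags)}
    for j, tag in enumerate(student_tags):
        M[t_index.get(tag, t_index["O"]), j] = 1.0
    return M

def marginal_distillation_loss(student_logits, teacher_probs, M):
    """KL(teacher || marginalized student), averaged over the batch.

    student_logits: (batch, seq_len, |student_tags|)
    teacher_probs:  (batch, seq_len, |teacher_tags|) from a frozen teacher
    """
    student_probs = F.softmax(student_logits, dim=-1)
    # Sum the student's probability mass over all tags the teacher would
    # have collapsed into the same (coarser) tag.
    marginal = torch.einsum("bsj,ij->bsi", student_probs, M).clamp_min(1e-12)
    return F.kl_div(marginal.log(), teacher_probs, reduction="batchmean")

# Hypothetical example: a teacher trained only on PER/LOC supervises a
# student whose unified tag set also covers ORG.
student_tags = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-ORG", "I-ORG"]
teacher_tags = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]
M = build_marginalization_matrix(student_tags, teacher_tags)

student_logits = torch.randn(2, 6, len(student_tags), requires_grad=True)
teacher_probs = F.softmax(torch.randn(2, 6, len(teacher_tags)), dim=-1)
loss = marginal_distillation_loss(student_logits, teacher_probs, M)
loss.backward()
```

In this reading, the student only needs the teacher's output distributions, not its training data, which matches the abstract's claim that access to pre-trained models suffices.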
Related papers
- GEIC: Universal and Multilingual Named Entity Recognition with Large Language Models [7.714969840571947]
We introduce the task of generation-based extraction and in-context classification (GEIC).
We then propose CascadeNER, a universal and multilingual GEIC framework for few-shot and zero-shot NER.
We also introduce AnythingNER, the first NER dataset specifically designed for Large Language Models (LLMs).
arXiv Detail & Related papers (2024-09-17T09:32:12Z)
- Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning [32.62763647036567]
Few-shot named entity recognition can identify new types of named entities based on a few labeled examples.
We propose the Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning (MsFNER).
MsFNER splits the general NER task into two stages: entity-span detection and entity classification (see the sketch after this list).
arXiv Detail & Related papers (2024-04-10T12:31:09Z)
- Named Entity Recognition via Machine Reading Comprehension: A Multi-Task Learning Approach [50.12455129619845]
Named Entity Recognition (NER) aims to extract entity mentions from text and classify them into pre-defined types.
We propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.
arXiv Detail & Related papers (2023-09-20T03:15:05Z)
- Enhancing Few-shot NER with Prompt Ordering based Data Augmentation [59.69108119752584]
We propose a Prompt Ordering based Data Augmentation (PODA) method to improve the training of unified autoregressive generation frameworks.
Experimental results on three public NER datasets and further analyses demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-05-19T16:25:43Z)
- One Model for All Domains: Collaborative Domain-Prefix Tuning for Cross-Domain NER [92.79085995361098]
Cross-domain NER is a challenging task that addresses the low-resource problem in practical scenarios.
Previous solutions mainly obtain a NER model from pre-trained language models (PLMs) using data from a rich-resource domain and then adapt it to the target domain.
We introduce Collaborative Domain-Prefix Tuning for cross-domain NER based on text-to-text generative PLMs.
arXiv Detail & Related papers (2023-01-25T05:16:43Z)
- Simple Questions Generate Named Entity Recognition Datasets [18.743889213075274]
This work introduces an ask-to-generate approach, which automatically generates NER datasets by asking simple natural language questions.
Our models largely outperform previous weakly supervised models on six NER benchmarks across four different domains.
Formulating the needs of NER with natural language also allows us to build NER models for fine-grained entity types such as Award.
arXiv Detail & Related papers (2021-12-16T11:44:38Z)
- NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging [40.57720568571513]
We construct a massive NER corpus of relatively high quality, and we pre-train a NER-BERT model on the created dataset.
Experimental results show that our pre-trained model can significantly outperform BERT as well as other strong baselines in low-resource scenarios.
arXiv Detail & Related papers (2021-12-01T10:45:02Z)
- HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performance compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- Zero-Resource Cross-Domain Named Entity Recognition [68.83177074227598]
Existing models for cross-domain named entity recognition rely on large unlabeled corpora or labeled NER training data in target domains.
We propose a cross-domain NER model that does not use any external resources.
arXiv Detail & Related papers (2020-02-14T09:04:18Z)
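The two-stage decoding mentioned in the MsFNER entry above can be illustrated with a short sketch: first detect type-agnostic entity spans, then classify each span against prototypes built from a few labeled support examples. All names, the simple BIO span detector, and the nearest-prototype classifier below are illustrative assumptions rather than the authors' implementation, which additionally uses entity-aware contrastive learning.

```python
# Minimal sketch of two-stage few-shot NER decoding:
# (1) detect entity spans, (2) classify each span by nearest prototype.
import torch
import torch.nn.functional as F

def detect_spans(bio_tags):
    """Stage 1: turn type-agnostic BIO tags into (start, end) spans."""
    spans, start = [], None
    for i, tag in enumerate(bio_tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(bio_tags)))
    return spans

def classify_span(span_embedding, prototypes):
    """Stage 2: assign the entity type whose prototype is most similar."""
    scores = {label: F.cosine_similarity(span_embedding, proto, dim=0)
              for label, proto in prototypes.items()}
    return max(scores, key=scores.get)

# Hypothetical toy example: prototypes averaged from a few support embeddings.
dim = 8
prototypes = {"DISEASE": torch.randn(dim), "DRUG": torch.randn(dim)}
bio_tags = ["O", "B", "I", "O", "B", "O"]
token_embeddings = torch.randn(len(bio_tags), dim)

for start, end in detect_spans(bio_tags):
    span_emb = token_embeddings[start:end].mean(dim=0)
    print((start, end), classify_span(span_emb, prototypes))
```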
This list is automatically generated from the titles and abstracts of the papers in this site.