NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging
- URL: http://arxiv.org/abs/2112.00405v1
- Date: Wed, 1 Dec 2021 10:45:02 GMT
- Title: NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging
- Authors: Zihan Liu, Feijun Jiang, Yuxiang Hu, Chen Shi, Pascale Fung
- Abstract summary: We construct a massive NER corpus of relatively high quality, and we pre-train a NER-BERT model on the created dataset.
Experimental results show that our pre-trained model significantly outperforms BERT as well as other strong baselines in low-resource scenarios.
- Score: 40.57720568571513
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Named entity recognition (NER) models generally perform poorly when large
training datasets are unavailable for low-resource domains. Recently,
pre-training a large-scale language model has become a promising direction for
coping with the data scarcity issue. However, the underlying discrepancies
between the language modeling and NER tasks can limit the models' performance,
and pre-training for the NER task has rarely been studied, since collected NER
datasets are generally either small, or large but of low quality. In this
paper, we construct a massive NER corpus of relatively high quality, and we
pre-train a NER-BERT model on the created dataset. Experimental results show
that our pre-trained model significantly outperforms BERT as well as other
strong baselines in low-resource scenarios across nine diverse domains.
Moreover, a visualization of entity representations further indicates the
effectiveness of NER-BERT at categorizing a variety of entities.
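The setup the abstract describes, pre-training a BERT-style encoder with a token-classification head on a large weakly labeled NER corpus, can be made concrete with a short example. Below is a minimal sketch of a single pre-training step, assuming the Hugging Face transformers and PyTorch libraries; the tag set, example sentence, and training step are illustrative stand-ins, not the paper's actual corpus, label scheme, or hyperparameters.

```python
# Minimal sketch: one token-classification training step over a weakly
# labeled NER example (illustrative; not the authors' released code).
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

# Hypothetical coarse tag set; the paper's corpus defines its own entity types.
labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]
label2id = {l: i for i, l in enumerate(labels)}

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=len(labels),
    id2label={i: l for l, i in label2id.items()},
    label2id=label2id,
)

# One weakly labeled sentence standing in for the pre-training corpus.
words = ["Pascale", "Fung", "teaches", "at", "HKUST", "."]
word_tags = ["B-PER", "I-PER", "O", "O", "B-ORG", "O"]

enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")

# Align word-level tags to sub-word tokens: label the first sub-token of
# each word, and mask special tokens and continuations with -100.
label_ids, prev = [], None
for wid in enc.word_ids():
    if wid is None or wid == prev:
        label_ids.append(-100)
    else:
        label_ids.append(label2id[word_tags[wid]])
    prev = wid

loss = model(input_ids=enc["input_ids"],
             attention_mask=enc["attention_mask"],
             labels=torch.tensor([label_ids])).loss
loss.backward()  # in practice, loop over the full corpus with an optimizer
```

A checkpoint pre-trained this way can then be fine-tuned on the small in-domain data of each low-resource domain, which is where the paper reports its gains over plain BERT.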
Related papers
- Evaluating Named Entity Recognition Using Few-Shot Prompting with Large Language Models [0.0]
Few-shot prompting, or in-context learning, enables models to recognize entities from minimal examples.
We assess state-of-the-art models like GPT-4 on NER tasks, comparing their few-shot performance to fully supervised benchmarks (a prompt sketch follows this list).
arXiv Detail & Related papers (2024-08-28T13:42:28Z)
- What do we Really Know about State of the Art NER? [0.0]
We perform a broad evaluation of NER using a popular dataset.
We generate six new adversarial test sets through small perturbations in the original test set.
We train and test our models on randomly generated train/dev/test splits, followed by an experiment where the models are trained on a select set of genres but tested on genres not seen in training.
arXiv Detail & Related papers (2022-04-29T18:35:53Z)
- RockNER: A Simple Method to Create Adversarial Examples for Evaluating the Robustness of Named Entity Recognition Models [32.806292167848156]
We propose RockNER to audit the robustness of named entity recognition models.
At the entity level, we replace target entities with other entities of the same semantic class in Wikidata.
At the context level, we use pre-trained language models to generate word substitutions (a sketch of both perturbations follows this list).
arXiv Detail & Related papers (2021-09-12T21:30:21Z)
- Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data.
We propose a noise-robust learning scheme consisting of a new loss function and a noisy label removal step.
Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span selection task format, which is used for QA datasets like QAMR or SQuAD 2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve model generalization in few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- One Model to Recognize Them All: Marginal Distillation from NER Models with Different Tag Sets [30.445201832698192]
Named entity recognition (NER) is a fundamental component in the modern language understanding pipeline.
This paper presents a marginal distillation (MARDI) approach for training a unified NER model from resources with disjoint or heterogeneous tag sets.
arXiv Detail & Related papers (2020-04-10T17:36:27Z)
- Zero-Resource Cross-Domain Named Entity Recognition [68.83177074227598]
Existing models for cross-domain named entity recognition rely on large unlabeled corpora or labeled NER training data in target domains.
We propose a cross-domain NER model that does not use any external resources.
arXiv Detail & Related papers (2020-02-14T09:04:18Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that provides a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
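The first related paper above evaluates few-shot prompting for NER; the sketch referenced there follows. It is a minimal illustration assuming the OpenAI Python client and GPT-4 access; the prompt template, demonstrations, and JSON output format are assumptions made for the example, not the paper's exact protocol.

```python
# Minimal sketch of few-shot NER via prompting (illustrative template,
# not the evaluated paper's exact prompt or parsing code).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Two in-context demonstrations showing the desired output format.
FEW_SHOT_PROMPT = """Extract PERSON and ORGANIZATION entities as JSON.

Sentence: Tim Cook spoke at Apple's launch event.
Entities: {"PERSON": ["Tim Cook"], "ORGANIZATION": ["Apple"]}

Sentence: Pascale Fung leads a research lab at HKUST.
Entities: {"PERSON": ["Pascale Fung"], "ORGANIZATION": ["HKUST"]}

Sentence: %s
Entities:"""

def few_shot_ner(sentence: str) -> str:
    """Tag one sentence using the in-context demonstrations above."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": FEW_SHOT_PROMPT % sentence}],
        temperature=0,  # deterministic output for evaluation
    )
    return response.choices[0].message.content

print(few_shot_ner("Zihan Liu joined Alibaba after graduating."))
```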
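The RockNER entry above promises a sketch of its two perturbation levels; here is a minimal illustration assuming the Hugging Face transformers fill-mask pipeline. The small entity pool stands in for same-class entities mined from Wikidata, and swap_entity / perturb_context are hypothetical helper names.

```python
# Minimal sketch of RockNER-style attacks: same-class entity swaps plus
# masked-LM context substitutions (illustrative, not the authors' code).
import random
from transformers import pipeline

# Hypothetical stand-in for a pool of same-class entities mined from Wikidata.
WIKIDATA_POOL = {
    "PER": ["Marie Curie", "Alan Turing"],
    "ORG": ["UNESCO", "Mozilla Foundation"],
}

def swap_entity(tokens, span, ent_type):
    """Entity-level attack: replace a gold entity span with another
    entity of the same semantic class."""
    start, end = span
    replacement = random.choice(WIKIDATA_POOL[ent_type]).split()
    return tokens[:start] + replacement + tokens[end:]

fill_mask = pipeline("fill-mask", model="bert-base-cased")

def perturb_context(tokens, idx):
    """Context-level attack: substitute one non-entity word using a
    pre-trained masked language model."""
    masked = tokens[:idx] + [fill_mask.tokenizer.mask_token] + tokens[idx + 1:]
    best = fill_mask(" ".join(masked), top_k=1)[0]["token_str"]
    return tokens[:idx] + [best.strip()] + tokens[idx + 1:]

tokens = ["Pascale", "Fung", "teaches", "at", "HKUST"]
print(swap_entity(tokens, (0, 2), "PER"))  # e.g. ['Alan', 'Turing', 'teaches', 'at', 'HKUST']
print(perturb_context(tokens, 2))          # e.g. ['Pascale', 'Fung', 'works', 'at', 'HKUST']
```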