What do we Really Know about State of the Art NER?
- URL: http://arxiv.org/abs/2205.00034v1
- Date: Fri, 29 Apr 2022 18:35:53 GMT
- Title: What do we Really Know about State of the Art NER?
- Authors: Sowmya Vajjala and Ramya Balasubramaniam
- Abstract summary: We perform a broad evaluation of NER using a popular dataset.
We generate six new adversarial test sets through small perturbations in the original test set.
We train and test our models on randomly generated train/dev/test splits, followed by an experiment where the models are trained on a select set of genres but tested on genres not seen in training.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Named Entity Recognition (NER) is a well-researched NLP task and is widely
used in real-world NLP scenarios. NER research typically focuses on the
creation of new ways of training NER, with relatively less emphasis on
resources and evaluation. Further, state-of-the-art (SOTA) NER models, trained
on standard datasets, typically report only a single performance measure
(F-score), and we don't really know how well they do for different entity types
and genres of text, or how robust they are to new, unseen entities. In this
paper, we perform a broad evaluation of NER using a popular dataset that takes
into consideration the various text genres and sources constituting the dataset at
hand. Additionally, we generate six new adversarial test sets through small
perturbations in the original test set, replacing select entities while
retaining the context. We also train and test our models on randomly generated
train/dev/test splits followed by an experiment where the models are trained on
a select set of genres but tested on genres not seen in training. These
comprehensive evaluation strategies were performed using three SOTA NER models.
Based on our results, we recommend some useful reporting practices for NER
researchers that could help provide a better understanding of a SOTA
model's performance in the future.
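The entity-replacement perturbation described in the abstract (swapping select entities for unseen ones while keeping the surrounding context intact) can be sketched roughly as follows. This is a minimal illustration under assumed BIO tagging; the replacement inventory and function names are hypothetical and not taken from the paper's actual implementation.

```python
import random

# Hypothetical inventory of replacement surface forms, assumed to be
# unseen in training (names here are illustrative, not from the paper).
REPLACEMENTS = {
    "PER": [["Amara", "Okafor"], ["Liang", "Wei"]],
    "ORG": [["Northwind", "Labs"], ["Vela", "Systems"]],
}

def perturb(tokens, tags, rng=None):
    """Replace each entity span with an entity of the same type,
    leaving all context tokens and their O tags untouched."""
    rng = rng or random.Random(0)
    out_tokens, out_tags, i = [], [], 0
    while i < len(tags):
        if tags[i].startswith("B-") and tags[i][2:] in REPLACEMENTS:
            etype = tags[i][2:]
            # Consume the full B-/I- span of this entity.
            j = i + 1
            while j < len(tags) and tags[j] == f"I-{etype}":
                j += 1
            new = rng.choice(REPLACEMENTS[etype])
            out_tokens += new
            out_tags += [f"B-{etype}"] + [f"I-{etype}"] * (len(new) - 1)
            i = j
        else:
            out_tokens.append(tokens[i])
            out_tags.append(tags[i])
            i += 1
    return out_tokens, out_tags
```

Applying `perturb` to an entire test set yields an adversarial variant in which only the entity mentions change, so any performance drop can be attributed to the model's reliance on memorized entity strings rather than context.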
Related papers
- Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses under massive real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
- UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition [48.977866466971655]
We show how ChatGPT can be distilled into much smaller UniversalNER models for open NER.
We assemble the largest NER benchmark to date, comprising 43 datasets across 9 diverse domains.
With a tiny fraction of the parameters, UniversalNER not only acquires ChatGPT's ability to recognize arbitrary entity types, but also outperforms its NER accuracy by 7-9 absolute F1 points on average.
arXiv Detail & Related papers (2023-08-07T03:39:52Z)
- Simple Questions Generate Named Entity Recognition Datasets [18.743889213075274]
This work introduces an ask-to-generate approach, which automatically generates NER datasets by asking simple natural language questions.
Our models largely outperform previous weakly supervised models on six NER benchmarks across four different domains.
Formulating the needs of NER with natural language also allows us to build NER models for fine-grained entity types such as Award.
arXiv Detail & Related papers (2021-12-16T11:44:38Z)
- NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging [40.57720568571513]
We construct a massive NER corpus of relatively high quality, and we pre-train a NER-BERT model on the created dataset.
Experimental results show that our pre-trained model can significantly outperform BERT as well as other strong baselines in low-resource scenarios.
arXiv Detail & Related papers (2021-12-01T10:45:02Z)
- RockNER: A Simple Method to Create Adversarial Examples for Evaluating the Robustness of Named Entity Recognition Models [32.806292167848156]
We propose RockNER to audit the robustness of named entity recognition models.
We replace target entities with other entities of the same semantic class in Wikidata.
At the context level, we use pre-trained language models to generate word substitutions.
arXiv Detail & Related papers (2021-09-12T21:30:21Z)
- Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data.
We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step.
Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z)
- Few-NERD: A Few-Shot Named Entity Recognition Dataset [35.669024917327825]
We present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types.
Few-NERD consists of 188,238 sentences from Wikipedia containing 4,601,160 words, each annotated as context or as part of a two-level entity type.
arXiv Detail & Related papers (2021-05-16T15:53:17Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We establish new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- One Model to Recognize Them All: Marginal Distillation from NER Models with Different Tag Sets [30.445201832698192]
Named entity recognition (NER) is a fundamental component in the modern language understanding pipeline.
This paper presents a marginal distillation (MARDI) approach for training a unified NER model from resources with disjoint or heterogeneous tag sets.
arXiv Detail & Related papers (2020-04-10T17:36:27Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.