Annotation Errors and NER: A Study with OntoNotes 5.0
- URL: http://arxiv.org/abs/2406.19172v1
- Date: Thu, 27 Jun 2024 13:48:46 GMT
- Title: Annotation Errors and NER: A Study with OntoNotes 5.0
- Authors: Gabriel Bernier-Colborne, Sowmya Vajjala
- Abstract summary: We employ three simple techniques to detect annotation errors in the OntoNotes 5.0 corpus for English NER.
Our techniques corrected ~10% of the sentences in the train/dev/test data.
We used three NER libraries to train, evaluate and compare the models trained with the original and the re-annotated datasets.
- Score: 2.8544822698499255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Named Entity Recognition (NER) is a well-studied problem in NLP. However, there is much less focus on studying NER datasets, compared to developing new NER models. In this paper, we employed three simple techniques to detect annotation errors in the OntoNotes 5.0 corpus for English NER, which is the largest available NER corpus for English. Our techniques corrected ~10% of the sentences in train/dev/test data. In terms of entity mentions, we corrected the span and/or type of ~8% of mentions in the dataset, while adding/deleting/splitting/merging a few more. These are large numbers of changes, considering the size of OntoNotes. We used three NER libraries to train, evaluate and compare the models trained with the original and the re-annotated datasets, which showed an average improvement of 1.23% in overall F-scores, with large (>10%) improvements for some of the entity types. While our annotation error detection methods are not exhaustive and there is some manual annotation effort involved, they are largely language agnostic and can be employed with other NER datasets, and other sequence labelling tasks.
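The abstract does not spell out the three detection techniques. As a rough, language-agnostic illustration of the kind of simple heuristic involved, the sketch below flags mention strings whose entity type varies across a corpus of annotated sentences, producing candidates for manual review. The data format and thresholds here are assumptions for illustration, not the authors' method.
```python
from collections import Counter, defaultdict

def collect_mention_labels(sentences):
    """Map each mention string to a Counter of the entity types it receives.

    `sentences` is a list of (tokens, spans) pairs, where each span is
    (start, end, entity_type) over token indices (end exclusive).
    This input format is an assumption for this sketch.
    """
    labels = defaultdict(Counter)
    for tokens, spans in sentences:
        for start, end, etype in spans:
            mention = " ".join(tokens[start:end]).lower()
            labels[mention][etype] += 1
    return labels

def flag_inconsistent_mentions(sentences, min_count=5, majority=0.9):
    """Flag mention strings whose majority label covers < `majority` of uses.

    These are candidate annotation errors to hand off for manual review;
    the thresholds are illustrative, not taken from the paper.
    """
    candidates = []
    for mention, counter in collect_mention_labels(sentences).items():
        total = sum(counter.values())
        top_type, top_count = counter.most_common(1)[0]
        if total >= min_count and top_count / total < majority:
            candidates.append((mention, dict(counter)))
    return candidates

# Example: 'washington' annotated as both GPE and PERSON across sentences.
corpus = [
    (["washington", "won"], [(0, 1, "GPE")]),
    (["washington", "spoke"], [(0, 1, "PERSON")]),
] * 3
print(flag_inconsistent_mentions(corpus, min_count=2, majority=0.9))
```
A heuristic like this is what makes the approach largely language agnostic: it inspects only the annotations themselves, not language-specific resources.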
Related papers
- ELLEN: Extremely Lightly Supervised Learning For Efficient Named Entity Recognition [18.884124657093405]
We introduce ELLEN, a simple, fully modular, neuro-symbolic method that blends fine-tuned language models with linguistic rules.
ELLEN achieves very strong performance on the CoNLL-2003 dataset.
In a zero-shot setting, ELLEN also achieves over 75% of the performance of a strong, fully supervised model trained on gold data.
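As a loose illustration of the neuro-symbolic blending the summary describes, the sketch below lets a toy rule resource (a gazetteer) backfill tags where a placeholder model abstains. `model_predict` and `GAZETTEER` are hypothetical stand-ins, not ELLEN's actual modules.
```python
# Hypothetical stand-ins: a fine-tuned model's tagger and a rule resource.
GAZETTEER = {"london": "B-LOC", "acme": "B-ORG"}  # toy linguistic rules

def model_predict(tokens):
    """Placeholder for a fine-tuned language model tagger."""
    return ["O"] * len(tokens)

def neuro_symbolic_tag(tokens):
    """Blend model tags with rule-based lookups, the rules applying only
    where the model abstains ('O')."""
    tags = model_predict(tokens)
    return [GAZETTEER.get(tok.lower(), tag) if tag == "O" else tag
            for tok, tag in zip(tokens, tags)]

print(neuro_symbolic_tag(["ACME", "opened", "in", "London"]))
# -> ['B-ORG', 'O', 'O', 'B-LOC']
```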
arXiv Detail & Related papers (2024-03-26T05:11:51Z)
- NERetrieve: Dataset for Next Generation Named Entity Recognition and Retrieval [49.827932299460514]
We argue that capabilities provided by large language models are not the end of NER research, but rather an exciting beginning.
We present three variants of the NER task, together with a dataset to support them.
We provide a large, silver-annotated corpus of 4 million paragraphs covering 500 entity types.
arXiv Detail & Related papers (2023-10-22T12:23:00Z)
- A Multilingual Evaluation of NER Robustness to Adversarial Inputs [0.0]
Adversarial evaluations of language models typically focus on English alone.
In this paper, we performed a multilingual evaluation of Named Entity Recognition (NER) in terms of its robustness to small perturbations in the input.
We explored whether it is possible to improve the existing NER models using a part of the generated adversarial data sets as augmented training data to train a new NER model or as fine-tuning data to adapt an existing NER model.
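As one concrete example of the "small perturbations" mentioned above, the sketch below generates typo-style adversarial inputs by swapping adjacent characters. The perturbation choice, and the decision to leave entity tokens intact so gold spans stay valid, are assumptions, not the paper's exact generator.
```python
import random

def swap_perturb(tokens, tags, rate=0.3, seed=0):
    """Introduce typo-style adjacent-character swaps into context tokens,
    leaving entity tokens (non-'O' tags) untouched so gold spans stay valid."""
    rng = random.Random(seed)
    out = []
    for tok, tag in zip(tokens, tags):
        if tag == "O" and len(tok) > 3 and rng.random() < rate:
            i = rng.randrange(len(tok) - 1)
            tok = tok[:i] + tok[i + 1] + tok[i] + tok[i + 2:]
        out.append(tok)
    return out

tokens = ["Reports", "from", "Geneva", "arrived", "yesterday"]
tags = ["O", "O", "B-LOC", "O", "O"]
print(swap_perturb(tokens, tags))
```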
arXiv Detail & Related papers (2023-05-30T10:50:49Z)
- An Embarrassingly Easy but Strong Baseline for Nested Named Entity Recognition [55.080101447586635]
We propose using a Convolutional Neural Network (CNN) to model spatial relations in the score matrix.
Our model surpasses several recently proposed methods with the same pre-trained encoders.
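A minimal PyTorch sketch of the general idea, assuming a biaffine-style score matrix where cell (i, j) scores the span from token i to token j; the layer sizes and residual design are illustrative, not the paper's configuration.
```python
import torch
import torch.nn as nn

class SpanScoreCNN(nn.Module):
    """Refine a (num_types, seq_len, seq_len) span score matrix with a CNN,
    so each span's score sees its spatial neighbours (e.g. nested spans)."""

    def __init__(self, num_types, hidden=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(num_types, hidden, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv2d(hidden, num_types, kernel_size=3, padding=1),
        )

    def forward(self, scores):
        # Residual connection keeps the original span scores intact.
        return scores + self.conv(scores)

scores = torch.randn(1, 5, 20, 20)  # batch, entity types, seq_len, seq_len
refined = SpanScoreCNN(num_types=5)(scores)
print(refined.shape)  # torch.Size([1, 5, 20, 20])
```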
arXiv Detail & Related papers (2022-08-09T04:33:46Z)
- Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future [63.99570204416711]
We reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets.
We define a uniform evaluation setup including a new formalization of the annotation error detection task.
We release our datasets and implementations in an easy-to-use and open source software package.
arXiv Detail & Related papers (2022-06-05T22:31:45Z)
- Efficient Few-Shot Fine-Tuning for Opinion Summarization [83.76460801568092]
Abstractive summarization models are typically pre-trained on large amounts of generic texts, then fine-tuned on tens or hundreds of thousands of annotated samples.
We show that a few-shot method based on adapters can easily store in-domain knowledge.
We show that this self-supervised adapter pre-training improves summary quality over standard fine-tuning by 2.0 and 1.3 ROUGE-L points on the Amazon and Yelp datasets.
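A minimal sketch of a standard bottleneck adapter in PyTorch, the kind of small module such few-shot methods train while the pre-trained backbone stays frozen; the dimensions are illustrative, and the paper's adapter design may differ.
```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual.
    Only these few parameters are updated during few-shot fine-tuning."""

    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.ReLU()

    def forward(self, hidden):
        return hidden + self.up(self.act(self.down(hidden)))

hidden = torch.randn(2, 16, 768)  # batch, tokens, hidden size
print(Adapter()(hidden).shape)    # torch.Size([2, 16, 768])
```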
arXiv Detail & Related papers (2022-05-04T16:38:37Z)
- HiNER: A Large Hindi Named Entity Recognition Dataset [29.300418937509317]
This paper releases a standard-abiding Hindi NER dataset containing 109,146 sentences and 2,220,856 tokens, annotated with 11 tags.
The statistics of tag-set in our dataset show a healthy per-tag distribution, especially for prominent classes like Person, Location and Organisation.
Our dataset helps achieve a weighted F1 score of 88.78 with all the tags and 92.22 when we collapse the tag-set, as discussed in the paper.
arXiv Detail & Related papers (2022-04-28T19:14:21Z)
- Few-NERD: A Few-Shot Named Entity Recognition Dataset [35.669024917327825]
We present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types.
Few-NERD consists of 188,238 sentences from Wikipedia containing 4,601,160 words, each annotated as context or as part of a two-level entity type.
arXiv Detail & Related papers (2021-05-16T15:53:17Z)
- Temporal Calibrated Regularization for Robust Noisy Label Learning [60.90967240168525]
Deep neural networks (DNNs) exhibit great success on many tasks with the help of large-scale well annotated datasets.
However, labeling large-scale data can be very costly and error-prone so that it is difficult to guarantee the annotation quality.
We propose a Temporal Calibrated Regularization (TCR) in which we utilize the original labels and the predictions in the previous epoch together.
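A rough PyTorch sketch of the stated idea, mixing the original (possibly noisy) hard labels with the model's own predictions saved from the previous epoch; the mixing weight and the KL-divergence form are assumptions, not the paper's exact formulation.
```python
import torch
import torch.nn.functional as F

def tcr_style_loss(logits, labels, prev_probs, alpha=0.7):
    """Blend supervision from the original (possibly noisy) hard labels
    with the model's previous-epoch predictions, in the spirit of TCR.

    logits:     (batch, num_classes) current-epoch outputs
    labels:     (batch,) original label ids
    prev_probs: (batch, num_classes) softmax outputs saved last epoch
    """
    ce = F.cross_entropy(logits, labels)
    # KL between the current prediction and last epoch's soft self-target.
    kl = F.kl_div(F.log_softmax(logits, dim=-1), prev_probs,
                  reduction="batchmean")
    return alpha * ce + (1 - alpha) * kl

logits = torch.randn(8, 5, requires_grad=True)
labels = torch.randint(0, 5, (8,))
prev_probs = torch.softmax(torch.randn(8, 5), dim=-1)
print(tcr_style_loss(logits, labels, prev_probs))
```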
arXiv Detail & Related papers (2020-07-01T04:48:49Z)
- Active Learning for Coreference Resolution using Discrete Annotation [76.36423696634584]
We improve upon pairwise annotation for active learning in coreference resolution.
We ask annotators to identify mention antecedents if a presented mention pair is deemed not coreferent.
In experiments with existing benchmark coreference datasets, we show that the signal from this additional question leads to significant performance gains per human-annotation hour.
arXiv Detail & Related papers (2020-04-28T17:17:11Z)