Few-NERD: A Few-Shot Named Entity Recognition Dataset
- URL: http://arxiv.org/abs/2105.07464v2
- Date: Wed, 19 May 2021 08:50:14 GMT
- Title: Few-NERD: A Few-Shot Named Entity Recognition Dataset
- Authors: Ning Ding, Guangwei Xu, Yulin Chen, Xiaobin Wang, Xu Han, Pengjun Xie,
Hai-Tao Zheng, Zhiyuan Liu
- Abstract summary: We present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types.
Few-NERD consists of 188,238 sentences from Wikipedia, 4,601,160 words are included and each is annotated as context or a part of a two-level entity type.
- Score: 35.669024917327825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, considerable literature has grown up around the theme of few-shot
named entity recognition (NER), but little published benchmark data
specifically focused on the practical and challenging task. Current approaches
collect existing supervised NER datasets and re-organize them to the few-shot
setting for empirical study. These strategies conventionally aim to recognize
coarse-grained entity types with few examples, while in practice, most unseen
entity types are fine-grained. In this paper, we present Few-NERD, a
large-scale human-annotated few-shot NER dataset with a hierarchy of 8
coarse-grained and 66 fine-grained entity types. Few-NERD consists of 188,238
sentences from Wikipedia, 4,601,160 words are included and each is annotated as
context or a part of a two-level entity type. To the best of our knowledge,
this is the first few-shot NER dataset and the largest human-crafted NER
dataset. We construct benchmark tasks with different emphases to
comprehensively assess the generalization capability of models. Extensive
empirical results and analysis show that Few-NERD is challenging and the
problem requires further research. We make Few-NERD public at
https://ningding97.github.io/fewnerd/.
Related papers
- Robust Few-Shot Named Entity Recognition with Boundary Discrimination
and Correlation Purification [14.998158107063848]
Few-shot named entity recognition (NER) aims to recognize novel named entities in low-resource domains utilizing existing knowledge.
We propose a robust two-stage few-shot NER method with Boundary Discrimination and Correlation Purification (BDCP)
In the span detection stage, the entity boundary discriminative module is introduced to provide a highly distinguishing boundary representation space to detect entity spans.
In the entity typing stage, the correlations between entities and contexts are purified by minimizing the interference information.
arXiv Detail & Related papers (2023-12-13T08:17:00Z) - NERetrieve: Dataset for Next Generation Named Entity Recognition and
Retrieval [49.827932299460514]
We argue that capabilities provided by large language models are not the end of NER research, but rather an exciting beginning.
We present three variants of the NER task, together with a dataset to support them.
We provide a large, silver-annotated corpus of 4 million paragraphs covering 500 entity types.
arXiv Detail & Related papers (2023-10-22T12:23:00Z) - Named Entity Recognition via Machine Reading Comprehension: A Multi-Task
Learning Approach [50.12455129619845]
Named Entity Recognition (NER) aims to extract and classify entity mentions in the text into pre-defined types.
We propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.
arXiv Detail & Related papers (2023-09-20T03:15:05Z) - Dynamic Named Entity Recognition [5.9401550252715865]
We introduce a new task: Dynamic Named Entity Recognition (DNER)
DNER provides a framework to better evaluate the ability of algorithms to extract entities by exploiting the context.
We evaluate baseline models and present experiments reflecting issues and research axes related to this novel task.
arXiv Detail & Related papers (2023-02-16T15:50:02Z) - What do we Really Know about State of the Art NER? [0.0]
We perform a broad evaluation of NER using a popular dataset.
We generate six new adversarial test sets through small perturbations in the original test set.
We train and test our models on randomly generated train/dev/test splits followed by an experiment where the models are trained on a select set of genres but tested genres not seen in training.
arXiv Detail & Related papers (2022-04-29T18:35:53Z) - Nested Named Entity Recognition as Holistic Structure Parsing [92.8397338250383]
This work models the full nested NEs in a sentence as a holistic structure, then we propose a holistic structure parsing algorithm to disclose the entire NEs once for all.
Experiments show that our model yields promising results on widely-used benchmarks which approach or even achieve state-of-the-art.
arXiv Detail & Related papers (2022-04-17T12:48:20Z) - MINER: Improving Out-of-Vocabulary Named Entity Recognition from an
Information Theoretic Perspective [57.19660234992812]
NER model has achieved promising performance on standard NER benchmarks.
Recent studies show that previous approaches may over-rely on entity mention information, resulting in poor performance on out-of-vocabulary (OOV) entity recognition.
We propose MINER, a novel NER learning framework, to remedy this issue from an information-theoretic perspective.
arXiv Detail & Related papers (2022-04-09T05:18:20Z) - NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging [40.57720568571513]
We construct a massive NER corpus with a relatively high quality, and we pre-train a NER-BERT model based on the created dataset.
Experimental results show that our pre-trained model can significantly outperform BERT as well as other strong baselines in low-resource scenarios.
arXiv Detail & Related papers (2021-12-01T10:45:02Z) - Rethinking Generalization of Neural Models: A Named Entity Recognition
Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.