NEREL-BIO: A Dataset of Biomedical Abstracts Annotated with Nested Named
Entities
- URL: http://arxiv.org/abs/2210.11913v1
- Date: Fri, 21 Oct 2022 12:28:43 GMT
- Title: NEREL-BIO: A Dataset of Biomedical Abstracts Annotated with Nested Named
Entities
- Authors: Natalia Loukachevitch, Suresh Manandhar, Elina Baral, Igor Rozhkov,
Pavel Braslavski, Vladimir Ivanov, Tatiana Batura, and Elena Tutubalina
- Abstract summary: This paper describes NEREL-BIO -- an annotation scheme and corpus of PubMed abstracts in Russian and a smaller number of abstracts in English.
NEREL-BIO extends the general domain dataset NEREL by introducing domain-specific entity types.
NEREL-BIO provides annotation for nested named entities as an extension of the scheme employed for NEREL.
- Score: 7.713462279125201
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes NEREL-BIO -- an annotation scheme and corpus of PubMed
abstracts in Russian and a smaller number of abstracts in English. NEREL-BIO
extends the general domain dataset NEREL by introducing domain-specific entity
types. The NEREL-BIO annotation scheme covers both general and biomedical domains,
making it suitable for domain transfer experiments. NEREL-BIO provides
annotation for nested named entities as an extension of the scheme employed for
NEREL. Nested named entities may cross entity boundaries to connect to shorter
entities nested within longer entities, making them harder to detect.
NEREL-BIO contains annotations for 700+ Russian and 100+ English abstracts.
All English PubMed annotations have corresponding Russian counterparts. Thus,
NEREL-BIO has two distinguishing features: it annotates nested named entities,
and it can serve as a benchmark for cross-domain (NEREL -> NEREL-BIO) and
cross-language (English -> Russian) transfer. We experiment with both
transformer-based sequence models and machine reading comprehension (MRC)
models and report their results.
The dataset is freely available at https://github.com/nerel-ds/NEREL-BIO.
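To make the notion of nested named entities concrete, here is a minimal sketch of how nested spans could be represented and detected. It assumes entities are given as character-offset spans. The `Span` type, the `nested_pairs` helper, the example phrase, and the label names are illustrative assumptions, not the dataset's actual file format or label inventory.

```python
from typing import List, NamedTuple, Tuple


class Span(NamedTuple):
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # hypothetical entity type name


def nested_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return (outer, inner) pairs where inner lies inside a longer outer span."""
    pairs = []
    for outer in spans:
        for inner in spans:
            if inner is outer:
                continue
            contained = outer.start <= inner.start and inner.end <= outer.end
            shorter = (inner.end - inner.start) < (outer.end - outer.start)
            if contained and shorter:
                pairs.append((outer, inner))
    return pairs


# Illustrative example only: a longer entity that contains a shorter one.
text = "chronic kidney disease"
spans = [
    Span(0, 22, "DISO"),     # "chronic kidney disease" (assumed label)
    Span(8, 14, "ANATOMY"),  # "kidney" (assumed label)
]

pairs = nested_pairs(spans)
for outer, inner in pairs:
    print(
        f"{text[inner.start:inner.end]!r} ({inner.label}) is nested in "
        f"{text[outer.start:outer.end]!r} ({outer.label})"
    )
```

A flat tagger that assigns one label per token cannot emit both spans above at once, which is one reason nested annotation is harder to predict.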
Related papers
- Named Entity Recognition via Machine Reading Comprehension: A Multi-Task
Learning Approach [50.12455129619845]
Named Entity Recognition (NER) aims to extract and classify entity mentions in the text into pre-defined types.
We propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.
arXiv Detail & Related papers (2023-09-20T03:15:05Z)
- Partial Annotation Learning for Biomedical Entity Recognition [0.19336815376402716]
We show that partial annotation learning methods can effectively learn from biomedical corpora with missing entity annotations.
Our proposed model outperforms alternatives and, specifically, the PubMedBERT tagger by 38% in F1-score under high missing entity rates.
arXiv Detail & Related papers (2023-05-22T15:18:38Z)
- From Zero to Hero: Harnessing Transformers for Biomedical Named Entity Recognition in Zero- and Few-shot Contexts [0.0]
This paper proposes a method for zero- and few-shot NER in the biomedical domain.
We have achieved average F1 scores of 35.44% for zero-shot NER, 50.10% for one-shot NER, 69.94% for 10-shot NER, and 79.51% for 100-shot NER on 9 diverse evaluated biomedical entities.
arXiv Detail & Related papers (2023-05-05T12:14:22Z)
- Enhancing Label Consistency on Document-level Named Entity Recognition [19.249781091058605]
Named entity recognition (NER) is a fundamental part of extracting information from documents in biomedical applications.
We present our method, ConNER, which enhances the label dependency of modifiers (e.g., adjectives and prepositions) to achieve higher label agreement.
The effectiveness of our method is demonstrated on four popular biomedical NER datasets.
arXiv Detail & Related papers (2022-10-24T04:45:17Z)
- Optimizing Bi-Encoder for Named Entity Recognition via Contrastive
Learning [80.36076044023581]
We present an efficient bi-encoder framework for named entity recognition (NER).
We frame NER as a metric learning problem that maximizes the similarity between the vector representations of an entity mention and its type.
A major challenge to this bi-encoder formulation for NER lies in separating non-entity spans from entity mentions.
arXiv Detail & Related papers (2022-08-30T23:19:04Z)
- Nested Named Entity Recognition as Holistic Structure Parsing [92.8397338250383]
This work models all nested NEs in a sentence as a holistic structure and proposes a holistic structure parsing algorithm that recovers all NEs at once.
Experiments show that our model yields promising results on widely-used benchmarks which approach or even achieve state-of-the-art.
arXiv Detail & Related papers (2022-04-17T12:48:20Z)
- NEREL: A Russian Dataset with Nested Named Entities and Relations [55.69103749079697]
We present NEREL, a Russian dataset for named entity recognition and relation extraction.
It contains 56K annotated named entities and 39K annotated relations.
arXiv Detail & Related papers (2021-08-30T10:40:20Z)
- MobIE: A German Dataset for Named Entity Recognition, Entity Linking and
Relation Extraction in the Mobility Domain [76.21775236904185]
The dataset consists of 3,232 social media texts and traffic reports with 91K tokens, and contains 20.5K annotated entities.
A subset of the dataset is human-annotated with seven mobility-related, n-ary relation types.
To the best of our knowledge, this is the first German-language dataset that combines annotations for NER, EL and RE.
arXiv Detail & Related papers (2021-08-16T08:21:50Z)
- BioALBERT: A Simple and Effective Pre-trained Language Model for
Biomedical Named Entity Recognition [9.05154470433578]
Existing BioNER approaches often neglect these issues and directly adopt the state-of-the-art (SOTA) models.
We propose biomedical ALBERT, an effective domain-specific language model trained on large-scale biomedical corpora.
arXiv Detail & Related papers (2020-09-19T12:58:47Z)
- Bipartite Flat-Graph Network for Nested Named Entity Recognition [94.91507634620133]
We propose a novel bipartite flat-graph network (BiFlaG) for nested named entity recognition (NER).
arXiv Detail & Related papers (2020-05-01T15:14:22Z)