Partial Annotation Learning for Biomedical Entity Recognition
- URL: http://arxiv.org/abs/2305.13120v1
- Date: Mon, 22 May 2023 15:18:38 GMT
- Title: Partial Annotation Learning for Biomedical Entity Recognition
- Authors: Liangping Ding, Giovanni Colavizza, Zhixiong Zhang
- Abstract summary: We show that partial annotation learning methods can effectively learn from biomedical corpora with missing entity annotations.
Our proposed model outperforms alternatives and, specifically, the PubMedBERT tagger by 38% in F1-score under high missing entity rates.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Motivation: Named Entity Recognition (NER) is a key task to support
biomedical research. In Biomedical Named Entity Recognition (BioNER), obtaining
high-quality expert annotated data is laborious and expensive, leading to the
development of automatic approaches such as distant supervision. However,
manually and automatically generated data often suffer from the unlabeled
entity problem, whereby many entity annotations are missing, degrading the
performance of full annotation NER models. Results: To address this problem, we
systematically study the effectiveness of partial annotation learning methods
for biomedical entity recognition over different simulated scenarios of missing
entity annotations. Furthermore, we propose a TS-PubMedBERT-Partial-CRF partial
annotation learning model. We harmonize 15 biomedical NER corpora encompassing
five entity types to serve as a gold standard and compare against two commonly
used partial annotation learning models, BiLSTM-Partial-CRF and EER-PubMedBERT,
and the state-of-the-art full annotation learning BioNER model PubMedBERT
tagger. Results show that partial annotation learning-based methods can
effectively learn from biomedical corpora with missing entity annotations. Our
proposed model outperforms alternatives and, specifically, the PubMedBERT
tagger by 38% in F1-score under high missing entity rates. The recall of entity
mentions in our model is also competitive with the upper bound on the fully
annotated dataset.
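The core idea of partial annotation learning with a CRF, as used in the proposed TS-PubMedBERT-Partial-CRF model, is that tokens with missing gold tags are marginalized over all possible labels instead of being forced to the non-entity tag. The sketch below is a toy, brute-force illustration of that loss (hypothetical function names, enumeration over all tag paths), not the paper's implementation:

```python
import math
from itertools import product

def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def crf_partial_nll(emissions, transitions, partial_labels):
    """Negative log-likelihood of a linear-chain CRF under partial annotation.

    emissions: per-token score lists, shape [T][K]
    transitions: K x K scores, transitions[i][j] = score of tag i -> tag j
    partial_labels: length-T list; an int fixes the tag at that position,
    None means the tag is unobserved and is marginalized over -- the key
    idea of partial annotation learning.
    """
    T, K = len(emissions), len(emissions[0])

    def path_score(tags):
        s = sum(emissions[t][tags[t]] for t in range(T))
        s += sum(transitions[tags[t - 1]][tags[t]] for t in range(1, T))
        return s

    all_paths = list(product(range(K), repeat=T))
    log_Z = logsumexp([path_score(p) for p in all_paths])

    # Sum only over label paths consistent with the observed (partial) tags.
    consistent = [p for p in all_paths
                  if all(y is None or p[t] == y
                         for t, y in enumerate(partial_labels))]
    log_Z_obs = logsumexp([path_score(p) for p in consistent])
    return log_Z - log_Z_obs  # NLL of the observed partial annotation
```

When every tag is unobserved the constrained sum equals the full partition function and the loss is zero; fixing more tags can only shrink the consistent set, so a fully observed sentence has the largest loss. Real implementations replace the enumeration with a constrained forward algorithm.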
Related papers
- EMBRE: Entity-aware Masking for Biomedical Relation Extraction
We introduce the Entity-aware Masking for Biomedical Relation Extraction (EMBRE) method for relation extraction.
Specifically, we integrate entity knowledge into a deep neural network by pretraining the backbone model with an entity masking objective.
arXiv Detail & Related papers (2024-01-15T18:12:01Z)
- From Zero to Hero: Harnessing Transformers for Biomedical Named Entity Recognition in Zero- and Few-shot Contexts
This paper proposes a method for zero- and few-shot NER in the biomedical domain.
We have achieved average F1 scores of 35.44% for zero-shot NER, 50.10% for one-shot NER, 69.94% for 10-shot NER, and 79.51% for 100-shot NER across 9 diverse biomedical entities evaluated.
arXiv Detail & Related papers (2023-05-05T12:14:22Z)
- AIONER: All-in-one scheme-based biomedical named entity recognition using deep learning
We present AIONER, a general-purpose BioNER tool based on cutting-edge deep learning and our AIO schema.
AIONER is effective, robust, and compares favorably to other state-of-the-art approaches such as multi-task learning.
arXiv Detail & Related papers (2022-11-30T12:35:00Z)
- Nested Named Entity Recognition from Medical Texts: An Adaptive Shared Network Architecture with Attentive CRF
We propose a novel method, referred to as ASAC, to address the challenge posed by nested entities.
The proposed method contains two key modules: the adaptive shared (AS) part and the attentive conditional random field (ACRF) module.
Our model could learn better entity representations by capturing the implicit distinctions and relationships between different categories of entities.
arXiv Detail & Related papers (2022-11-09T09:23:56Z)
- Knowledge-Rich Self-Supervised Entity Linking
Knowledge-RIch Self-Supervision (KRISSBERT) is a universal entity linker for four million UMLS entities, trained without using any labeled information.
Our approach subsumes zero-shot and few-shot methods, and can easily incorporate entity descriptions and gold mention labels if available.
arXiv Detail & Related papers (2021-12-15T05:05:12Z)
- Biomedical Interpretable Entity Representations
Pre-trained language models induce dense entity representations that offer strong performance on entity-centric NLP tasks, but such representations are not interpretable.
This can be a barrier to model uptake in important domains such as biomedicine.
We create a new entity type system and training set from a large corpus of biomedical texts.
arXiv Detail & Related papers (2021-06-17T13:52:10Z)
- Fast and Effective Biomedical Entity Linking Using a Dual Encoder
We propose a BERT-based dual encoder model that resolves multiple mentions in a document in one shot.
We show that our proposed model is multiple times faster than existing BERT-based models while being competitive in accuracy for biomedical entity linking.
arXiv Detail & Related papers (2021-03-08T19:32:28Z)
- A Teacher-Student Framework for Semi-supervised Medical Image Segmentation From Mixed Supervision
We develop a semi-supervised learning framework based on a teacher-student fashion for organ and lesion segmentation.
We show our model is robust to the quality of bounding boxes and achieves performance comparable to fully supervised learning methods.
arXiv Detail & Related papers (2020-10-23T07:58:20Z)
- Dual-Teacher: Integrating Intra-domain and Inter-domain Teachers for Annotation-efficient Cardiac Segmentation
We propose a novel semi-supervised domain adaptation approach, namely Dual-Teacher.
The student model learns from both unlabeled target data and labeled source data via two teacher models.
We demonstrate that our approach is able to concurrently utilize unlabeled data and cross-modality data with superior performance.
arXiv Detail & Related papers (2020-07-13T10:00:44Z)
- ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Medical Image Segmentation
We propose ATSO, an asynchronous version of teacher-student optimization.
ATSO partitions the unlabeled data into two subsets and alternately uses one subset to fine-tune the model while updating the labels on the other subset.
We evaluate ATSO on two popular medical image segmentation datasets and show its superior performance in various semi-supervised settings.
arXiv Detail & Related papers (2020-06-24T04:05:12Z)
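The alternating scheme described for ATSO above can be sketched as a simple loop. This is an illustrative sketch with hypothetical `train` and `pseudo_label` interfaces, not the paper's implementation:

```python
def atso_style_loop(labeled, unlabeled, train, pseudo_label, rounds=4):
    """Sketch of an asynchronous teacher-student loop in the spirit of ATSO.

    Assumed interfaces (hypothetical): `train(pairs) -> model` and
    `pseudo_label(model, xs) -> pairs`. The unlabeled pool is split into
    two halves; each round fine-tunes on one half's pseudo-labels while
    the freshly trained model relabels the other half, so a subset's
    labels are never consumed in the same round they were produced.
    """
    half = len(unlabeled) // 2
    subsets = [list(unlabeled[:half]), list(unlabeled[half:])]
    pseudo = [[], []]
    model = train(labeled)                    # warm start on labeled data
    pseudo[0] = pseudo_label(model, subsets[0])
    for r in range(rounds):
        use, refresh = r % 2, 1 - r % 2
        model = train(labeled + pseudo[use])               # fine-tune on one subset
        pseudo[refresh] = pseudo_label(model, subsets[refresh])  # relabel the other
    return model
```

Any trainer and labeler with these shapes can be plugged in; for example, a toy threshold classifier over 1-D inputs already exercises the alternation.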
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.