Clustering-based Inference for Biomedical Entity Linking
- URL: http://arxiv.org/abs/2010.11253v2
- Date: Thu, 8 Apr 2021 19:21:58 GMT
- Title: Clustering-based Inference for Biomedical Entity Linking
- Authors: Rico Angell, Nicholas Monath, Sunil Mohan, Nishant Yadav and Andrew
McCallum
- Abstract summary: We introduce a model in which linking decisions can be made not merely by linking to a knowledge base entity but also by grouping multiple mentions together via clustering and jointly making linking predictions.
In experiments on the largest publicly available biomedical dataset, we improve the best independent prediction for entity linking by 3.0 points of accuracy.
- Score: 40.78384867437563
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the large number of entities in biomedical knowledge bases, only a
small fraction of entities have corresponding labelled training data. This
necessitates entity linking models which are able to link mentions of unseen
entities using learned representations of entities. Previous approaches link
each mention independently, ignoring the relationships within and across
documents between the entity mentions. These relations can be very useful for
linking mentions in biomedical text, where linking decisions are often
difficult due to mentions having a generic or a highly specialized form. In
this paper, we introduce a model in which linking decisions can be made not
merely by linking to a knowledge base entity but also by grouping multiple
mentions together via clustering and jointly making linking predictions. In
experiments on the largest publicly available biomedical dataset, we improve
the best independent prediction for entity linking by 3.0 points of accuracy,
and our clustering-based inference model further improves entity linking by
2.3 points.
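The abstract's core idea, linking clusters of mentions jointly rather than each mention in isolation, can be illustrated with a minimal sketch. The embeddings, threshold, and greedy single-link clustering below are toy stand-ins, not the paper's actual model or affinity functions:

```python
# Toy sketch of clustering-based inference for entity linking.
# Mention/entity vectors and the threshold are illustrative only.

def cos(a, b):
    # Cosine similarity between two vectors.
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den

mentions = {
    "m1": [1.0, 0.1],  # easy, generic surface form
    "m2": [0.9, 0.2],  # harder, specialized form of the same concept
    "m3": [0.0, 1.0],  # unrelated mention
}
entities = {
    "E_diabetes": [1.0, 0.0],
    "E_insulin": [0.0, 1.0],
}

# Step 1: greedy single-link clustering of mentions above an affinity threshold.
THRESH = 0.8
clusters = []
for m, v in mentions.items():
    for c in clusters:
        if any(cos(v, mentions[o]) >= THRESH for o in c):
            c.append(m)
            break
    else:
        clusters.append([m])

# Step 2: link each cluster jointly; every mention in a cluster inherits the
# entity with the highest affinity to any member, so hard mentions benefit
# from easier ones grouped with them.
links = {}
for c in clusters:
    best = max(
        entities,
        key=lambda e: max(cos(mentions[m], entities[e]) for m in c),
    )
    for m in c:
        links[m] = best

print(links)  # m1 and m2 share a decision; m3 is linked on its own
```

Here m2 would be ambiguous on its own, but because it clusters with m1 it receives the same (correct) link, which is the intuition behind the reported accuracy gain.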
Related papers
- OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting [49.655711022673046]
OneNet is an innovative framework that utilizes the few-shot learning capabilities of Large Language Models (LLMs) without the need for fine-tuning.
OneNet is structured around three key components prompted by LLMs: (1) an entity reduction processor that simplifies inputs by summarizing and filtering out irrelevant entities, (2) a dual-perspective entity linker that combines contextual cues and prior knowledge for precise entity linking, and (3) an entity consensus judger that employs a unique consistency algorithm to alleviate the hallucination in the entity linking reasoning.
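The three-stage structure described above can be sketched as a simple pipeline. The stage functions below are hypothetical toy heuristics standing in for the LLM-prompted components; names and scoring rules are illustrative assumptions, not OneNet's actual prompts:

```python
# Hypothetical sketch of a three-stage, fine-tuning-free pipeline in the
# spirit of OneNet. Each function is a toy stand-in for an LLM call.

def reduce_entities(mention, candidates):
    # Stage 1 (entity reduction): filter clearly irrelevant candidates.
    return [c for c in candidates if c[0].lower() == mention[0].lower()]

def link_entity(mention, context, candidates):
    # Stage 2 (dual-perspective linker): toy score mixing contextual cues.
    def score(c):
        return sum(w in context for w in c.lower().split())
    return max(candidates, key=score)

def consensus(predictions):
    # Stage 3 (consensus judger): accept only when perspectives agree,
    # mitigating hallucinated links.
    return predictions[0] if len(set(predictions)) == 1 else None

candidates = ["Aspirin", "Asparagine", "Ibuprofen"]
ctx = "the patient took aspirin for pain"
pool = reduce_entities("aspirin", candidates)  # drops Ibuprofen
pred = link_entity("aspirin", ctx, pool)
final = consensus([pred, pred])
print(final)
```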
arXiv Detail & Related papers (2024-10-10T02:45:23Z) - Biomedical Entity Linking as Multiple Choice Question Answering [48.74212158495695]
We present BioELQA, a novel model that treats Biomedical Entity Linking as Multiple Choice Question Answering.
BioELQA first obtains candidate entities with a fast retriever, jointly presents the mention and candidate entities to a generator, and then outputs the predicted symbol associated with its chosen entity.
To improve generalization for long-tailed entities, we retrieve similar labeled training instances as clues and augment the input with the retrieved instances for the generator.
arXiv Detail & Related papers (2024-02-23T08:40:38Z) - Learning Relation-Specific Representations for Few-shot Knowledge Graph
Completion [24.880078645503417]
We propose a Relation-Specific Context Learning framework, which exploits graph contexts of triples to capture semantic information of relations and entities simultaneously.
Experimental results on two public datasets demonstrate that RSCL outperforms state-of-the-art FKGC methods.
arXiv Detail & Related papers (2022-03-22T11:45:48Z) - Knowledge-Rich Self-Supervised Entity Linking [58.838404666183656]
Knowledge-RIch Self-Supervision (KRISSBERT) is a universal entity linker for four million UMLS entities.
Our approach subsumes zero-shot and few-shot methods, and can easily incorporate entity descriptions and gold mention labels if available.
Without using any labeled information, our method produces KRISSBERT, a universal entity linker for four million UMLS entities.
arXiv Detail & Related papers (2021-12-15T05:05:12Z) - Learning to Select the Next Reasonable Mention for Entity Linking [39.112602039647896]
We propose a novel model, called DyMen, to dynamically adjust the subsequent linking target based on the previously linked entities.
We sample mentions via a sliding window to reduce the action sampling space of reinforcement learning and maintain the semantic coherence of mentions.
arXiv Detail & Related papers (2021-12-08T04:12:50Z) - Entity Linking and Discovery via Arborescence-based Supervised
Clustering [35.93568319872986]
We present novel training and inference procedures that fully utilize mention-to-mention affinities.
We show that this method gracefully extends to entity discovery.
We evaluate our approach on the Zero-Shot Entity Linking dataset and MedMentions, the largest publicly available biomedical dataset.
arXiv Detail & Related papers (2021-09-02T23:05:58Z) - Fast and Effective Biomedical Entity Linking Using a Dual Encoder [48.86736921025866]
We propose a BERT-based dual encoder model that resolves multiple mentions in a document in one shot.
We show that our proposed model is multiple times faster than existing BERT-based models while being competitive in accuracy for biomedical entity linking.
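The speed of a dual encoder comes from encoding mentions and entities independently, so resolving a document reduces to cheap similarity lookups against precomputed entity vectors. A minimal sketch, with illustrative vectors standing in for BERT encoder outputs:

```python
# Minimal sketch of dual-encoder entity linking. The vectors below are toy
# stand-ins for the outputs of separate mention and entity encoders.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Entity vectors are computed once, offline.
entity_vecs = {"E_aspirin": [1.0, 0.0], "E_ibuprofen": [0.0, 1.0]}

# All mentions in a document are encoded and resolved in one pass:
# each mention simply takes the argmax-scoring entity.
mention_vecs = {"m1": [0.9, 0.1], "m2": [0.2, 0.8]}
predictions = {
    m: max(entity_vecs, key=lambda e: dot(v, entity_vecs[e]))
    for m, v in mention_vecs.items()
}
print(predictions)
```

Because entity vectors are fixed, inference cost per document is one encoder pass plus dot products, which is the design choice behind the claimed speedup over cross-encoder BERT models that must rescore every mention-entity pair jointly.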
arXiv Detail & Related papers (2021-03-08T19:32:28Z) - Learning Relation Prototype from Unlabeled Texts for Long-tail Relation
Extraction [84.64435075778988]
We propose a general approach to learn relation prototypes from unlabeled texts.
We learn relation prototypes as an implicit factor between entities.
We conduct experiments on two publicly available datasets: New York Times and Google Distant Supervision.
arXiv Detail & Related papers (2020-11-27T06:21:12Z) - Learning Informative Representations of Biomedical Relations with Latent
Variable Models [2.4366811507669115]
We propose a latent variable model with an arbitrarily flexible distribution to represent the relation between an entity pair.
We demonstrate that our model achieves results competitive with strong baselines for both tasks while having fewer parameters and being significantly faster to train.
arXiv Detail & Related papers (2020-11-20T08:56:31Z) - Improving Broad-Coverage Medical Entity Linking with Semantic Type
Prediction and Large-Scale Datasets [12.131050765159145]
MedType is a fully modular system that prunes out irrelevant candidate concepts based on the predicted semantic type of an entity mention.
We present WikiMed and PubMedDS, two large-scale medical entity linking datasets, and demonstrate that pre-training MedType on these datasets further improves entity linking performance.
arXiv Detail & Related papers (2020-05-01T15:55:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.