Entity Linking and Discovery via Arborescence-based Supervised
Clustering
- URL: http://arxiv.org/abs/2109.01242v1
- Date: Thu, 2 Sep 2021 23:05:58 GMT
- Title: Entity Linking and Discovery via Arborescence-based Supervised
Clustering
- Authors: Dhruv Agarwal, Rico Angell, Nicholas Monath, Andrew McCallum
- Abstract summary: We present novel training and inference procedures that fully utilize mention-to-mention affinities.
We show that this method gracefully extends to entity discovery.
We evaluate our approach on the Zero-Shot Entity Linking dataset and MedMentions, the largest publicly available biomedical dataset.
- Score: 35.93568319872986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work has shown promising results in performing entity linking by
measuring not only the affinities between mentions and entities but also those
amongst mentions. In this paper, we present novel training and inference
procedures that fully utilize mention-to-mention affinities by building minimum
arborescences (i.e., directed spanning trees) over mentions and entities across
documents in order to make linking decisions. We also show that this method
gracefully extends to entity discovery, enabling the clustering of mentions
that do not have an associated entity in the knowledge base. We evaluate our
approach on the Zero-Shot Entity Linking dataset and MedMentions, the largest
publicly available biomedical dataset, and show significant improvements in
performance for both entity linking and discovery compared to identically
parameterized models. We further show significant efficiency improvements with
only a small loss in accuracy compared to previous work, which uses more
computationally expensive models.
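The abstract's core construction, a minimum arborescence over a joint graph of entities and mentions, can be illustrated with a small self-contained sketch. This is a toy example under stated assumptions (node names, affinity values, and the 1 - affinity cost transform are all hypothetical), not the authors' training or inference code; it only shows how a minimum-cost directed spanning tree turns pairwise affinities into joint linking decisions, with mention-to-mention edges letting one mention be linked through another.

```python
# Minimal sketch (assumptions throughout): mentions and entities become nodes,
# affinities become edge costs, and a minimum spanning arborescence rooted at a
# pseudo-root yields linking decisions.
import networkx as nx

# Hypothetical affinity scores in [0, 1]; higher means more similar.
entity_mention_affinity = {
    ("e:aspirin", "m1"): 0.9,
    ("e:ibuprofen", "m2"): 0.7,
}
mention_mention_affinity = {
    ("m1", "m3"): 0.8,  # m3 can be linked transitively through m1
    ("m2", "m3"): 0.2,
}

root = "ROOT"
entities = {"e:aspirin", "e:ibuprofen"}
mentions = {"m1", "m2", "m3"}

G = nx.DiGraph()
# Root -> entity edges are free; they only anchor the arborescence.
for e in entities:
    G.add_edge(root, e, weight=0.0)
# Convert affinities to costs so the *minimum* arborescence prefers
# high-affinity edges.
for (u, v), a in {**entity_mention_affinity, **mention_mention_affinity}.items():
    G.add_edge(u, v, weight=1.0 - a)

tree = nx.minimum_spanning_arborescence(G)

def linked_entity(tree, mention):
    """A mention links to whichever entity lies on its path back to the root."""
    node = mention
    while node not in entities:
        node = next(tree.predecessors(node))  # unique parent in an arborescence
        if node == root:
            return None  # stand-in for a discovered (NIL) entity cluster
    return node

for m in sorted(mentions):
    print(m, "->", linked_entity(tree, m))
```

In the discovery setting described in the abstract, mentions whose tree path never reaches a knowledge-base entity would form a cluster for a new entity; the `None` branch above is only a placeholder for that case, not the paper's exact procedure.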
Related papers
- Entity Disambiguation via Fusion Entity Decoding [68.77265315142296]
We propose an encoder-decoder model to disambiguate entities with more detailed entity descriptions.
We observe a +1.5% improvement in end-to-end entity linking on the GERBIL benchmark compared with EntQA.
arXiv Detail & Related papers (2024-04-02T04:27:54Z)
- Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
arXiv Detail & Related papers (2024-02-06T22:15:09Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Effective Few-Shot Named Entity Linking by Meta-Learning [34.70028855572534]
We propose a novel weak supervision strategy to generate non-trivial synthetic entity-mention pairs.
We also design a meta-learning mechanism to assign different weights to each synthetic entity-mention pair automatically.
Experiments on real-world datasets show that the proposed method substantially improves over state-of-the-art few-shot entity linking models.
arXiv Detail & Related papers (2022-07-12T03:23:02Z)
- PIE: a Parameter and Inference Efficient Solution for Large Scale Knowledge Graph Embedding Reasoning [24.29409958504209]
We propose PIE, a parameter- and inference-efficient solution.
Inspired by tensor decomposition methods, we find that decomposing the entity embedding matrix into low-rank matrices can reduce the parameter count by more than half (a minimal factorization sketch appears after this list).
To accelerate model inference, we propose a self-supervised auxiliary task, which can be seen as fine-grained entity typing.
arXiv Detail & Related papers (2022-04-29T09:06:56Z)
- Learning to Select the Next Reasonable Mention for Entity Linking [39.112602039647896]
We propose a novel model, called DyMen, to dynamically adjust the subsequent linking target based on the previously linked entities.
We sample mentions with a sliding window to reduce the action sampling space of reinforcement learning and to maintain the semantic coherence of mentions.
arXiv Detail & Related papers (2021-12-08T04:12:50Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Clustering-based Inference for Biomedical Entity Linking [40.78384867437563]
We introduce a model in which linking decisions can be made not merely by linking to a knowledge base entity but also by grouping multiple mentions together via clustering and jointly making linking predictions.
In experiments on the largest publicly available biomedical dataset, we improve the best independent prediction for entity linking by 3.0 points of accuracy.
arXiv Detail & Related papers (2020-10-21T19:16:27Z)
- Cross-Supervised Joint-Event-Extraction with Heterogeneous Information Networks [61.950353376870154]
Joint-event-extraction is a sequence-to-sequence labeling task whose tag set combines trigger tags and entity tags.
We propose a Cross-Supervised Mechanism (CSM) to alternately supervise the extraction of triggers or entities.
Our approach outperforms the state-of-the-art methods in both entity and trigger extraction.
arXiv Detail & Related papers (2020-10-13T11:51:17Z)
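The low-rank factorization idea referenced in the PIE entry above can be made concrete with a short sketch. The matrix sizes, variable names, and plain two-factor decomposition below are assumptions for illustration only; PIE's actual decomposition and training objective are described in that paper.

```python
# Minimal sketch (assumed sizes, not PIE's implementation): an N x d entity
# embedding table is replaced by N x r and r x d factors, cutting parameters
# when r << d.
import numpy as np

num_entities, dim, rank = 100_000, 512, 64

full_params = num_entities * dim                    # 51,200,000
low_rank_params = num_entities * rank + rank * dim  # ~6,430,000

# Reconstruct a full embedding on demand instead of storing it.
A = np.random.randn(num_entities, rank).astype(np.float32)
B = np.random.randn(rank, dim).astype(np.float32)

def entity_embedding(idx: int) -> np.ndarray:
    """Return the (approximate) d-dimensional embedding of entity idx."""
    return A[idx] @ B

print(f"full: {full_params:,} params, low-rank: {low_rank_params:,} params")
```

With these hypothetical sizes the factorized table uses roughly an eighth of the parameters, which is consistent with the "more than half" reduction claimed in the PIE summary.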