Reveal the Unknown: Out-of-Knowledge-Base Mention Discovery with Entity
Linking
- URL: http://arxiv.org/abs/2302.07189v4
- Date: Fri, 1 Sep 2023 20:32:05 GMT
- Authors: Hang Dong, Jiaoyan Chen, Yuan He, Yinan Liu, Ian Horrocks
- Abstract summary: We propose a new BERT-based Entity Linking (EL) method which can identify mentions that do not have corresponding KB entities by matching them to a NIL entity.
Results on five datasets show the advantages of BLINKout over existing methods to identify out-of-KB mentions.
- Score: 23.01938139604297
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Discovering entity mentions that are out of a Knowledge Base (KB) from texts
plays a critical role in KB maintenance, but has not yet been fully explored.
The current methods are mostly limited to the simple threshold-based approach
and feature-based classification, and the datasets for evaluation are
relatively rare. We propose BLINKout, a new BERT-based Entity Linking (EL)
method which can identify mentions that do not have corresponding KB entities
by matching them to a special NIL entity. To better utilize BERT, we propose
new techniques including NIL entity representation and classification, with
synonym enhancement. We also apply KB Pruning and Versioning strategies to
automatically construct out-of-KB datasets from common in-KB EL datasets.
Results on five datasets of clinical notes, biomedical publications, and
Wikipedia articles in various domains show the advantages of BLINKout over
existing methods to identify out-of-KB mentions for the medical ontologies,
UMLS, SNOMED CT, and the general KB, WikiData.
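Two ideas named in the abstract can be sketched in a few lines: building an out-of-KB dataset by pruning a KB, and the simple threshold-based baseline that the abstract contrasts with BLINKout. This is a minimal sketch under stated assumptions, not the authors' implementation. The names `prune_kb`, `link_with_threshold`, and `NIL`, the random choice of which entities to remove, and the toy entity IDs are all hypothetical. BLINKout itself instead matches mentions to a special NIL entity using BERT, which is not shown here.

```python
import random
from typing import Dict, List, Set, Tuple

NIL = "NIL"  # hypothetical label for out-of-KB mentions


def prune_kb(
    kb_ids: Set[str],
    mentions: List[Tuple[str, str]],
    ratio: float,
    seed: int = 0,
) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Remove a fraction of entities from the KB and relabel their mentions as NIL.

    Assumption: entities to remove are sampled uniformly at random.
    """
    rng = random.Random(seed)
    n_remove = int(len(kb_ids) * ratio)
    removed = set(rng.sample(sorted(kb_ids), n_remove))
    pruned_kb = kb_ids - removed
    # A mention whose gold entity is no longer in the KB becomes out-of-KB.
    relabeled = [
        (text, gold if gold in pruned_kb else NIL) for text, gold in mentions
    ]
    return pruned_kb, relabeled


def link_with_threshold(scores: Dict[str, float], threshold: float) -> str:
    """Threshold baseline: return the top candidate, or NIL if its score is too low."""
    if not scores:
        return NIL
    best_id, best_score = max(scores.items(), key=lambda kv: kv[1])
    return best_id if best_score >= threshold else NIL


if __name__ == "__main__":
    kb = {"C1", "C2", "C3", "C4"}  # toy entity IDs, not real UMLS codes
    data = [("fever", "C1"), ("cough", "C2"), ("asthma", "C3"), ("rash", "C4")]
    pruned, relabeled = prune_kb(kb, data, ratio=0.5)
    print(sorted(pruned), relabeled)
    print(link_with_threshold({"C1": 0.9, "C2": 0.4}, threshold=0.5))
```

The threshold baseline depends on a single cut-off over candidate scores, which is one reason a learned NIL representation may be attractive when score distributions vary across mentions.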
Related papers
- UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
- Mapping and Cleaning Open Commonsense Knowledge Bases with Generative Translation [14.678465723838599]
In particular, open information extraction (OpenIE) is often used to induce structure from a text.
OpenIE tuples contain an open-ended, non-canonicalized set of relations, making downstream exploitation of the extracted knowledge harder.
We propose approaching the problem by generative translation, i.e., by training a language model to generate fixed-schema assertions from open ones.
arXiv Detail & Related papers (2023-06-22T09:42:54Z)
- Exploring Partial Knowledge Base Inference in Biomedical Entity Linking [0.4798394926736971]
We name this scenario partial knowledge base inference.
We construct benchmarks and witness a catastrophic degradation in EL performance due to a dramatic drop in precision.
We propose two simple-and-effective redemption methods to combat the NIL issue with little computational overhead.
arXiv Detail & Related papers (2023-03-18T04:31:07Z)
- QA Is the New KR: Question-Answer Pairs as Knowledge Bases [105.692569000534]
We argue that the proposed type of KB has many of the key advantages of a traditional symbolic KB.
Unlike a traditional KB, this information store is well-aligned with common user information needs.
arXiv Detail & Related papers (2022-07-01T19:09:08Z)
- Named Entity Linking on Namesakes [10.609815608017065]
We represent each knowledge base (KB) entity by a set of embeddings.
We show that representations of entities in the knowledge base (KB) can be adjusted using only KB data, and the adjustment improves NEL performance.
arXiv Detail & Related papers (2022-05-21T03:31:25Z)
- Knowledge-Rich Self-Supervised Entity Linking [58.838404666183656]
Knowledge-RIch Self-Supervision (KRISSBERT) is a universal entity linker for four million UMLS entities.
Our approach subsumes zero-shot and few-shot methods, and can easily incorporate entity descriptions and gold mention labels if available.
Without using any labeled information, our method produces KRISSBERT, a universal entity linker for four million UMLS entities.
arXiv Detail & Related papers (2021-12-15T05:05:12Z)
- Reasoning Over Virtual Knowledge Bases With Open Predicate Relations [85.19305347984515]
We present the Open Predicate Query Language (OPQL).
OPQL is a method for constructing a virtual Knowledge Base (VKB) trained entirely from text.
We demonstrate that OPQL outperforms prior VKB methods on two different KB reasoning tasks.
arXiv Detail & Related papers (2021-02-14T01:29:54Z)
- Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion [59.549664231655726]
A case-based reasoning (CBR) system solves a new problem by retrieving "cases" that are similar to the given problem.
In this paper, we demonstrate that such a system is achievable for reasoning in knowledge bases (KBs).
Our approach predicts attributes for an entity by gathering reasoning paths from similar entities in the KB.
arXiv Detail & Related papers (2020-10-07T17:48:12Z)
- Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems [79.02430277138801]
The knowledge base (KB) plays an essential role in fulfilling user requests.
End-to-end systems use the KB directly as input, but they cannot scale when the KB is larger than a few hundred entries.
We propose a method to embed the KB, of any size, directly into the model parameters.
arXiv Detail & Related papers (2020-09-28T22:13:54Z)
- Distantly-Supervised Neural Relation Extraction with Side Information using BERT [2.0946724304757955]
Relation extraction (RE) consists in categorizing the relationship between entities in a sentence.
One of the methods that adopt this strategy is the RESIDE model, which proposes a distantly-supervised neural relation extraction using side information from Knowledge Bases.
Considering that this method outperformed state-of-the-art baselines, in this paper, we propose a related approach to RESIDE also using additional side information, but simplifying the sentence encoding with BERT embeddings.
arXiv Detail & Related papers (2020-04-29T19:29:10Z)
- Novel Entity Discovery from Web Tables [21.16349961050804]
We leverage tables on the Web to discover new entities, properties, and relationships.
Our method identifies not only out-of-KB ("novel") information but also novel aliases for in-KB ("known") entities.
arXiv Detail & Related papers (2020-02-01T13:24:03Z)