A Practical Entity Linking System for Tables in Scientific Literature
- URL: http://arxiv.org/abs/2306.10044v1
- Date: Mon, 12 Jun 2023 01:40:57 GMT
- Title: A Practical Entity Linking System for Tables in Scientific Literature
- Authors: Varish Mulwad, Tim Finin, Vijay S. Kumar, Jenny Weisenberg Williams,
Sharad Dixit, and Anupam Joshi
- Abstract summary: This paper introduces a general-purpose system for linking entities to items in the Wikidata knowledge base.
It describes how we adapt this system for linking domain-specific entities, especially for those entities embedded within tables drawn from COVID-19-related scientific literature.
- Score: 2.093510158982825
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Entity linking is an important step towards constructing knowledge graphs
that facilitate advanced question answering over scientific documents,
including the retrieval of relevant information included in tables within these
documents. This paper introduces a general-purpose system for linking entities
to items in the Wikidata knowledge base. It describes how we adapt this system
for linking domain-specific entities, especially for those entities embedded
within tables drawn from COVID-19-related scientific literature. We describe
the setup of an efficient offline instance of the system that enables our
entity-linking approach to be more feasible in practice. As part of a broader
approach to infer the semantic meaning of scientific tables, we leverage the
structural and semantic characteristics of the tables to improve overall entity
linking performance.
Related papers
- Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval [49.42043077545341]
We propose a knowledge-aware query expansion framework, augmenting LLMs with structured document relations from knowledge graph (KG)
We leverage document texts as rich KG node representations and use document-based relation filtering for our Knowledge-Aware Retrieval (KAR)
arXiv Detail & Related papers (2024-10-17T17:03:23Z) - A Semantic Mention Graph Augmented Model for Document-Level Event Argument Extraction [12.286432133599355]
Document-level Event Argument Extraction (DEAE) aims to identify arguments and their specific roles from an unstructured document.
advanced approaches on DEAE utilize prompt-based methods to guide pre-trained language models (PLMs) in extracting arguments from input documents.
We propose a semantic mention Graph Augmented Model (GAM) to address these two problems in this paper.
arXiv Detail & Related papers (2024-03-12T08:58:07Z) - Improving Retrieval in Theme-specific Applications using a Corpus
Topical Taxonomy [52.426623750562335]
We introduce ToTER (Topical taxonomy Enhanced Retrieval) framework.
ToTER identifies the central topics of queries and documents with the guidance of the taxonomy, and exploits their topical relatedness to supplement missing contexts.
As a plug-and-play framework, ToTER can be flexibly employed to enhance various PLM-based retrievers.
arXiv Detail & Related papers (2024-03-07T02:34:54Z) - Knowledge Graphs and Pre-trained Language Models enhanced Representation Learning for Conversational Recommender Systems [58.561904356651276]
We introduce the Knowledge-Enhanced Entity Representation Learning (KERL) framework to improve the semantic understanding of entities for Conversational recommender systems.
KERL uses a knowledge graph and a pre-trained language model to improve the semantic understanding of entities.
KERL achieves state-of-the-art results in both recommendation and response generation tasks.
arXiv Detail & Related papers (2023-12-18T06:41:23Z) - DiscoverPath: A Knowledge Refinement and Retrieval System for
Interdisciplinarity on Biomedical Research [96.10765714077208]
Traditional keyword-based search engines fall short in assisting users who may not be familiar with specific terminologies.
We present a knowledge graph-based paper search engine for biomedical research to enhance the user experience.
The system, dubbed DiscoverPath, employs Named Entity Recognition (NER) and part-of-speech (POS) tagging to extract terminologies and relationships from article abstracts to create a KG.
arXiv Detail & Related papers (2023-09-04T20:52:33Z) - DocTr: Document Transformer for Structured Information Extraction in
Documents [36.1145541816468]
We present a new formulation for structured information extraction from visually rich documents.
It aims to address the limitations of existing IOB tagging or graph-based formulations.
We represent an entity as an anchor word and a bounding box, and represent entity linking as the association between anchor words.
arXiv Detail & Related papers (2023-07-16T02:59:30Z) - CateCom: a practical data-centric approach to categorization of
computational models [77.34726150561087]
We present an effort aimed at organizing the landscape of physics-based and data-driven computational models.
We apply object-oriented design concepts and outline the foundations of an open-source collaborative framework.
arXiv Detail & Related papers (2021-09-28T02:59:40Z) - Tab2Know: Building a Knowledge Base from Tables in Scientific Papers [6.514665180383298]
We present Tab2Know, a new end-to-end system to build a Knowledge Base from tables in scientific papers.
We propose a pipeline that employs both statistical-based classifiers and logic-based reasoning.
An empirical evaluation of our approach using a corpus of papers in the Computer Science domain has returned satisfactory performance.
arXiv Detail & Related papers (2021-07-28T11:56:53Z) - End-to-End Hierarchical Relation Extraction for Generic Form
Understanding [0.6299766708197884]
We present a novel deep neural network to jointly perform both entity detection and link prediction.
Our model extends the Multi-stage Attentional U-Net architecture with the Part-Intensity Fields and Part-Association Fields for link prediction.
We demonstrate the effectiveness of the model on the Form Understanding in Noisy Scanned Documents dataset.
arXiv Detail & Related papers (2021-06-02T06:51:35Z) - A Graph Representation of Semi-structured Data for Web Question
Answering [96.46484690047491]
We propose a novel graph representation of Web tables and lists based on a systematic categorization of the components in semi-structured data as well as their relations.
Our method improves F1 score by 3.90 points over the state-of-the-art baselines.
arXiv Detail & Related papers (2020-10-14T04:01:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.