Related papers: Relevant Entity Selection: Knowledge Graph Bootstrapping via Zero-Shot Analogical Pruning

URL: http://arxiv.org/abs/2306.16296v2
Date: Wed, 16 Aug 2023 09:28:17 GMT
Title: Relevant Entity Selection: Knowledge Graph Bootstrapping via Zero-Shot Analogical Pruning
Authors: Lucas Jarnac, Miguel Couceiro, Pierre Monnin
Abstract summary: We propose an analogy-based approach that starts from seed entities of interest in a generic KG, and keeps or prunes their neighboring entities. We evaluate our approach on Wikidata through two manually labeled datasets that contain either domain-homogeneous or -heterogeneous seed entities.
Score: 4.281723404774889
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Knowledge Graph Construction (KGC) can be seen as an iterative process starting from a high quality nucleus that is refined by knowledge extraction approaches in a virtuous loop. Such a nucleus can be obtained from knowledge existing in an open KG like Wikidata. However, due to the size of such generic KGs, integrating them as a whole may entail irrelevant content and scalability issues. We propose an analogy-based approach that starts from seed entities of interest in a generic KG, and keeps or prunes their neighboring entities. We evaluate our approach on Wikidata through two manually labeled datasets that contain either domain-homogeneous or -heterogeneous seed entities. We empirically show that our analogy-based approach outperforms LSTM, Random Forest, SVM, and MLP, with a drastically lower number of parameters. We also evaluate its generalization potential in a transfer learning setting. These results advocate for the further integration of analogy-based inference in tasks related to the KG lifecycle.
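
The keep-or-prune traversal described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `keep` predicate is a hypothetical stand-in for the paper's analogy-based classifier (not reproduced here), the `max_depth` limit is an assumption, and the toy adjacency dictionary with made-up identifiers stands in for Wikidata.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set

# Toy adjacency list: entity -> neighboring entities (assumption, not Wikidata).
Graph = Dict[str, List[str]]


def select_relevant_entities(
    graph: Graph,
    seeds: Iterable[str],
    keep: Callable[[str, str], bool],
    max_depth: int = 2,
) -> Set[str]:
    """Expand breadth-first from the seeds, keeping or pruning each neighbor.

    `keep(entity, neighbor)` is a hypothetical decision function; in the paper
    this decision is made by an analogy-based model.
    """
    selected = set(seeds)
    queue = deque((seed, 0) for seed in sorted(selected))
    while queue:
        entity, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(entity, []):
            if neighbor in selected:
                continue
            if keep(entity, neighbor):
                selected.add(neighbor)
                queue.append((neighbor, depth + 1))
            # A pruned neighbor is neither selected nor expanded.
    return selected


# Made-up identifiers for illustration only.
toy_graph = {
    "Q_seed": ["Q_a", "Q_b"],
    "Q_a": ["Q_c"],
    "Q_b": ["Q_d"],
}
relevant = {"Q_a", "Q_c", "Q_d"}


def toy_keep(entity: str, neighbor: str) -> bool:
    # Stand-in decision rule: keep a neighbor only if it is labeled relevant.
    return neighbor in relevant


result = select_relevant_entities(toy_graph, ["Q_seed"], toy_keep)
shallow = select_relevant_entities(toy_graph, ["Q_seed"], toy_keep, max_depth=1)
```

In this toy run, `Q_d` is labeled relevant but is never reached, because it is only connected through the pruned entity `Q_b`. This shows how pruning a neighbor also cuts off everything behind it.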

Related papers

Relation Extraction Across Entire Books to Reconstruct Community Networks: The AffilKG Datasets [3.9244082434642555]
AffilKG is a collection of six datasets that are the first to pair complete book scans with large, labeled knowledge graphs. Each dataset features affiliation graphs, which are simple KGs that capture Member relationships between Person and Organization entities.
arXiv Detail & Related papers (2025-05-16T02:24:32Z)
Ontology-grounded Automatic Knowledge Graph Construction by LLM under Wikidata schema [60.42231674887294]
We propose an ontology-grounded approach to Knowledge Graph (KG) construction using Large Language Models (LLMs) on a knowledge base. We ground generation of the KG with the authored ontology based on extracted relations to ensure consistency and interpretability. Our work presents a promising direction for a scalable KG construction pipeline with minimal human intervention that yields high-quality, human-interpretable KGs.
arXiv Detail & Related papers (2024-12-30T13:36:05Z)
Distill-SynthKG: Distilling Knowledge Graph Synthesis Workflow for Improved Coverage and Efficiency [59.6772484292295]
Knowledge graphs (KGs) generated by large language models (LLMs) are increasingly valuable for Retrieval-Augmented Generation (RAG) applications. Existing KG extraction methods rely on prompt-based approaches, which are inefficient for processing large-scale corpora. We propose SynthKG, a multi-step, document-level synthesis KG workflow based on LLMs. We also design a novel graph-based retrieval framework for RAG.
arXiv Detail & Related papers (2024-10-22T00:47:54Z)
KGPrune: a Web Application to Extract Subgraphs of Interest from Wikidata with Analogical Pruning [3.250579305400297]
We introduce KGPrune, a Web Application that extracts subgraphs of interest from Wikidata. KGPrune relies on a frugal pruning algorithm based on analogical reasoning to only keep relevant neighbors while pruning irrelevant ones. The interest of KGPrune is illustrated by two concrete applications, namely, bootstrapping an enterprise KG and extracting knowledge related to looted artworks.
arXiv Detail & Related papers (2024-08-26T21:47:49Z)
Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs [72.89652710634051]
Knowledge graphs (KGs) complement Large Language Models (LLMs) by providing reliable, structured, domain-specific, and up-to-date external knowledge. We introduce Tree-of-Traversals, a novel zero-shot reasoning algorithm that enables augmentation of black-box LLMs with one or more KGs.
arXiv Detail & Related papers (2024-07-31T06:01:24Z)
Wiki Entity Summarization Benchmark [9.25319552487389]
Entity summarization aims to compute concise summaries for entities in knowledge graphs. Existing datasets and benchmarks are often limited to a few hundred entities. We propose WikES, a comprehensive benchmark comprising entities, their summaries, and their connections.
arXiv Detail & Related papers (2024-06-12T17:22:00Z)
Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering [87.67177556994525]
We propose a training-free method called Generate-on-Graph (GoG) to generate new factual triples while exploring Knowledge Graphs (KGs). GoG performs reasoning through a Thinking-Searching-Generating framework, which treats the LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering (IKGQA).
arXiv Detail & Related papers (2024-04-23T04:47:22Z)
Natural Language Processing for Drug Discovery Knowledge Graphs: promises and pitfalls [0.0]
Building and analysing knowledge graphs (KGs) to aid drug discovery is a topical area of research. We discuss promises and pitfalls of using natural language processing (NLP) to mine unstructured text as a data source for KGs.
arXiv Detail & Related papers (2023-10-24T07:35:24Z)
PyGraft: Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips [3.5923669681271257]
PyGraft is a Python-based tool that generates customized, domain-agnostic schemas and KGs. We aim to empower the generation of a more diverse array of KGs for benchmarking novel approaches in areas such as graph-based machine learning (ML). In ML, this should foster a more holistic evaluation of model performance and generalization capability, thereby going beyond the limited collection of available benchmarks.
arXiv Detail & Related papers (2023-09-07T13:00:09Z)
Interactive Segmentation as Gaussian Process Classification [58.44673380545409]
Click-based interactive segmentation (IS) aims to extract the target objects under user interaction. Most of the current deep learning (DL)-based methods mainly follow the general pipelines of semantic segmentation. We propose to formulate the IS task as a Gaussian process (GP)-based pixel-wise binary classification model on each image.
arXiv Detail & Related papers (2023-02-28T14:01:01Z)
BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models [65.51390418485207]
We propose a new approach for harvesting massive KGs of arbitrary relations from pretrained LMs. With minimal input of a relation definition, the approach efficiently searches the vast entity pair space to extract diverse, accurate knowledge. We deploy the approach to harvest KGs of over 400 new relations from different LMs.
arXiv Detail & Related papers (2022-06-28T19:46:29Z)
OntoMerger: An Ontology Integration Library for Deduplicating and Connecting Knowledge Graph Nodes [2.6553713413568913]
OntoMerger is a Python integration library whose functionality is to deduplicate KG nodes. Our approach takes a set of KG nodes, mappings, and disconnected hierarchies and generates a set of merged nodes together with a connected hierarchy. OntoMerger can be applied to a wide variety of KGs.
arXiv Detail & Related papers (2022-06-05T18:52:26Z)
Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering [50.72473345911147]
This paper augments a general commonsense QA framework with a knowledgeable path generator. By extrapolating over existing paths in a KG with a state-of-the-art language model, our generator learns to connect a pair of entities in text with a dynamic, and potentially novel, multi-hop relational path.
arXiv Detail & Related papers (2020-05-02T03:53:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.