Explore-Construct-Filter: An Automated Framework for Rich and Reliable API Knowledge Graph Construction
- URL: http://arxiv.org/abs/2502.13412v1
- Date: Wed, 19 Feb 2025 03:51:31 GMT
- Title: Explore-Construct-Filter: An Automated Framework for Rich and Reliable API Knowledge Graph Construction
- Authors: Yanbang Sun, Qing Huang, Xiaoxue Ren, Zhenchang Xing, Xiaohong Li, Junjie Wang
- Abstract summary: The API Knowledge Graph (API KG) is a structured network that models API entities and their relations. We propose an automated approach for API KG construction based on large language models (LLMs). Our method surpasses the state-of-the-art method, achieving a 25.2% improvement in F1 score.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The API Knowledge Graph (API KG) is a structured network that models API entities and their relations, providing essential semantic insights for tasks such as API recommendation, code generation, and API misuse detection. However, constructing a knowledge-rich and reliable API KG presents several challenges. Existing schema-based methods rely heavily on manual annotations to design KG schemas, leading to excessive manual overhead. On the other hand, schema-free methods, due to the lack of schema guidance, are prone to introducing noise, reducing the KG's reliability. To address these issues, we propose the Explore-Construct-Filter framework, an automated approach for API KG construction based on large language models (LLMs). This framework consists of three key modules: 1) KG exploration: LLMs simulate the workflow of annotators to automatically design a schema with comprehensive type triples, minimizing human intervention; 2) KG construction: Guided by the schema, LLMs extract instance triples to construct a rich yet unreliable API KG; 3) KG filtering: Removing invalid type triples and suspicious instance triples to construct a rich and reliable API KG. Experimental results demonstrate that our method surpasses the state-of-the-art method, achieving a 25.2% improvement in F1 score. Moreover, the Explore-Construct-Filter framework proves effective, with the KG exploration module increasing KG richness by 133.6% and the KG filtering module improving reliability by 26.6%. Finally, cross-model experiments confirm the generalizability of our framework.
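The three-module pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `llm` stub, the function names, and the filtering heuristic (dropping type triples with unresolved tail types and instance triples whose tail entity is unknown) are all assumptions made for demonstration.

```python
# Hedged sketch of the Explore-Construct-Filter pipeline: explore a schema,
# construct instance triples guided by it, then filter out suspicious triples.

def llm(prompt: str) -> list[tuple[str, str, str]]:
    """Stand-in for a real LLM call; returns canned triples for demonstration."""
    if prompt.startswith("Design"):
        # Type triples: (head type, relation, tail type); one is deliberately invalid.
        return [("API", "belongs_to", "Library"), ("API", "invalid_rel", "???")]
    # Instance triples: one plausible, one suspicious.
    return [("requests.get", "belongs_to", "requests"),
            ("foo.bar", "belongs_to", "unknown_lib")]

def explore() -> list:
    """KG exploration: the LLM drafts a schema of type triples."""
    return llm("Design a schema of type triples for an API KG.")

def construct(schema: list) -> list:
    """KG construction: the LLM extracts instance triples guided by the schema."""
    return llm(f"Extract instance triples conforming to {schema}.")

def filter_kg(schema: list, instances: list, known_entities: set):
    """KG filtering: drop invalid type triples and suspicious instance triples."""
    valid_schema = [t for t in schema if "?" not in t[2]]
    valid_relations = {rel for _, rel, _ in valid_schema}
    kept = [(h, r, t) for h, r, t in instances
            if r in valid_relations and t in known_entities]
    return valid_schema, kept

schema = explore()
instances = construct(schema)
schema, kg = filter_kg(schema, instances, known_entities={"requests"})
print(kg)  # only the triple grounded in a known entity survives
```

The key design point the abstract emphasizes is the ordering: exploration deliberately over-generates a rich schema, construction over-generates instance triples, and filtering restores reliability afterward, rather than constraining generation up front.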
Related papers
- Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models [10.130178524819536]
Knowledge graphs (KGs) provide structured, verifiable grounding for large language models (LLMs). Current LLM-based systems commonly use KGs as auxiliary structures for text retrieval, leaving their intrinsic quality underexplored. We propose Wikontic, a multi-stage pipeline that constructs KGs from open-domain text by extracting candidate triplets with qualifiers.
arXiv Detail & Related papers (2025-11-29T18:44:25Z) - Can Knowledge-Graph-based Retrieval Augmented Generation Really Retrieve What You Need? [57.28763506780752]
GraphFlow is a framework that efficiently retrieves accurate and diverse knowledge required for real-world queries from text-rich KGs. It outperforms strong KG-RAG baselines, including GPT-4o, by 10% on average in hit rate and recall. It also shows strong generalization to unseen KGs, demonstrating its effectiveness and robustness.
arXiv Detail & Related papers (2025-10-18T17:06:49Z) - Efficient and Transferable Agentic Knowledge Graph RAG via Reinforcement Learning [18.9814789695716]
Knowledge-graph retrieval-augmented generation (KG-RAG) couples large language models (LLMs) with structured, verifiable knowledge graphs (KGs) to reduce hallucinations and expose reasoning traces. We introduce KG-R1, an agentic KG-RAG framework trained through reinforcement learning (RL). KG-R1 uses a single agent that interacts with KGs as its environment, learning to retrieve at each step and incorporating the retrieved information into its reasoning and generation.
arXiv Detail & Related papers (2025-09-30T15:14:24Z) - Enrich-on-Graph: Query-Graph Alignment for Complex Reasoning with LLM Enriching [61.824094419641575]
Large Language Models (LLMs) struggle with hallucinations and factual errors in knowledge-intensive scenarios such as knowledge graph question answering (KGQA). We attribute this to the semantic gap between structured knowledge graphs (KGs) and unstructured queries, caused by inherent differences in their focuses and structures. Existing methods usually employ resource-intensive, non-scalable reasoning on vanilla KGs but overlook this gap. We propose a flexible framework, Enrich-on-Graph (EoG), which leverages LLMs' prior knowledge to enrich KGs, bridging the semantic gap between graphs and queries.
arXiv Detail & Related papers (2025-09-25T06:48:52Z) - CORE-KG: An LLM-Driven Knowledge Graph Construction Framework for Human Smuggling Networks [9.68109098750283]
CORE-KG is a modular framework for building interpretable knowledge graphs from legal texts. It reduces node duplication by 33.28% and legal noise by 38.37% compared to a GraphRAG-based baseline.
arXiv Detail & Related papers (2025-06-20T11:58:00Z) - KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs [66.35046942874737]
KG-Infused RAG is a framework that integrates KGs into RAG systems to implement spreading activation. It retrieves KG facts, expands the query accordingly, and enhances generation by combining corpus passages with structured facts.
arXiv Detail & Related papers (2025-06-11T09:20:02Z) - KG2QA: Knowledge Graph-enhanced Retrieval-augmented Generation for Communication Standards Question Answering [7.079181644378029]
KG2QA is a question answering framework that integrates fine-tuned large language models (LLMs) with a domain-specific knowledge graph (KG). We construct a high-quality dataset of 6,587 QA pairs from ITU-T recommendations and fine-tune Qwen2.5-7B-Instruct. In our KG-RAG pipeline, the fine-tuned LLM first retrieves relevant knowledge from the KG, enabling more accurate and factually grounded responses.
arXiv Detail & Related papers (2025-06-08T08:07:22Z) - Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models [17.88134311726175]
We propose a framework to learn and apply quantized codes for each entity, aiming for the seamless integration of Knowledge Graphs with Large Language Models. Experimental results demonstrate that SSQR outperforms existing unsupervised quantization methods, producing more distinguishable codes. The fine-tuned LLaMA2 and LLaMA3.1 models also achieve superior performance on KG link prediction and triple classification tasks.
arXiv Detail & Related papers (2025-01-30T03:40:20Z) - KG-CF: Knowledge Graph Completion with Context Filtering under the Guidance of Large Language Models [55.39134076436266]
KG-CF is a framework tailored for ranking-based knowledge graph completion tasks. It leverages LLMs' reasoning abilities to filter out irrelevant contexts, achieving superior results on real-world datasets.
arXiv Detail & Related papers (2025-01-06T01:52:15Z) - Efficient Relational Context Perception for Knowledge Graph Completion [25.903926643251076]
Knowledge Graphs (KGs) provide a structured representation of knowledge but often suffer from incompleteness. Previous knowledge graph embedding models are limited in their ability to capture expressive features. We propose a Triple Receptance Perception architecture to model sequential information, enabling the learning of dynamic context.
arXiv Detail & Related papers (2024-12-31T11:25:58Z) - Can LLMs be Good Graph Judger for Knowledge Graph Construction? [33.958327252291]
In this paper, we propose GraphJudger, a knowledge graph construction framework to address the aforementioned challenges.
We introduce three innovative modules in our method: entity-centric iterative text denoising, knowledge-aware instruction tuning, and graph judgement.
Experiments conducted on two general text-graph pair datasets and one domain-specific text-graph pair dataset show superior performance compared to baseline methods.
arXiv Detail & Related papers (2024-11-26T12:46:57Z) - Retrieval, Reasoning, Re-ranking: A Context-Enriched Framework for Knowledge Graph Completion [36.664300900246424]
Existing embedding-based methods rely solely on triples in the Knowledge Graph.
We propose KGR3, a context-enriched framework for KGC.
Experiments on widely used datasets demonstrate that KGR3 consistently improves various KGC methods.
arXiv Detail & Related papers (2024-11-12T20:15:58Z) - Graphusion: A RAG Framework for Knowledge Graph Construction with a Global Perspective [13.905336639352404]
This work introduces Graphusion, a zero-shot Knowledge Graph construction framework from free text. It contains three steps: in Step 1, we extract a list of seed entities using topic modeling to ensure the final KG includes the most relevant entities; in Step 2, we conduct candidate triplet extraction using LLMs; in Step 3, we design a novel fusion module that provides a global view of the extracted knowledge.
arXiv Detail & Related papers (2024-10-23T06:54:03Z) - Distill-SynthKG: Distilling Knowledge Graph Synthesis Workflow for Improved Coverage and Efficiency [59.6772484292295]
Knowledge graphs (KGs) generated by large language models (LLMs) are increasingly valuable for Retrieval-Augmented Generation (RAG) applications.
Existing KG extraction methods rely on prompt-based approaches, which are inefficient for processing large-scale corpora.
We propose SynthKG, a multi-step, document-level synthesis KG workflow based on LLMs.
We also design a novel graph-based retrieval framework for RAG.
arXiv Detail & Related papers (2024-10-22T00:47:54Z) - Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering [87.67177556994525]
We propose a training-free method called Generate-on-Graph (GoG) to generate new factual triples while exploring Knowledge Graphs (KGs).
GoG performs reasoning through a Thinking-Searching-Generating framework, which treats LLM as both Agent and KG in IKGQA.
arXiv Detail & Related papers (2024-04-23T04:47:22Z) - KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph [134.8631016845467]
We propose an autonomous LLM-based agent framework, called KG-Agent.
In KG-Agent, we integrate the LLM, multifunctional toolbox, KG-based executor, and knowledge memory.
To guarantee the effectiveness, we leverage program language to formulate the multi-hop reasoning process over the KG.
arXiv Detail & Related papers (2024-02-17T02:07:49Z) - Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z) - BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models [65.51390418485207]
We propose a new approach of harvesting massive KGs of arbitrary relations from pretrained LMs.
With minimal input of a relation definition, the approach efficiently searches in the vast entity pair space to extract diverse accurate knowledge.
We deploy the approach to harvest KGs of over 400 new relations from different LMs.
arXiv Detail & Related papers (2022-06-28T19:46:29Z) - Toward Subgraph-Guided Knowledge Graph Question Generation with Graph Neural Networks [53.58077686470096]
Knowledge graph (KG) question generation (QG) aims to generate natural language questions from KGs and target answers.
In this work, we focus on a more realistic setting where we aim to generate questions from a KG subgraph and target answers.
arXiv Detail & Related papers (2020-04-13T15:43:22Z)