Schema Matching on Graph: Iterative Graph Exploration for Efficient and Explainable Data Integration
- URL: http://arxiv.org/abs/2511.20285v2
- Date: Mon, 01 Dec 2025 06:02:11 GMT
- Title: Schema Matching on Graph: Iterative Graph Exploration for Efficient and Explainable Data Integration
- Authors: Mingyu Jeon, Jaeyoung Suh, Suwan Cho,
- Abstract summary: Large Language Models (LLMs) have shown promise in schema matching, but suffer from hallucination and lack of up-to-date domain knowledge.<n>This paper introduces SMoG ( Matching on Graph), a novel framework that leverages iterative execution of simple 1-hop SPARQL queries.<n> Experimental results on real-world medical datasets demonstrate that SMoG achieves performance comparable to state-of-the-art baselines.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Schema matching is a critical task in data integration, particularly in the medical domain where disparate Electronic Health Record (EHR) systems must be aligned to standard models like OMOP CDM. While Large Language Models (LLMs) have shown promise in schema matching, they suffer from hallucination and lack of up-to-date domain knowledge. Knowledge Graphs (KGs) offer a solution by providing structured, verifiable knowledge. However, existing KG-augmented LLM approaches often rely on inefficient complex multi-hop queries or storage-intensive vector-based retrieval methods. This paper introduces SMoG (Schema Matching on Graph), a novel framework that leverages iterative execution of simple 1-hop SPARQL queries, inspired by successful strategies in Knowledge Graph Question Answering (KGQA). SMoG enhances explainability and reliability by generating human-verifiable query paths while significantly reducing storage requirements by directly querying SPARQL endpoints. Experimental results on real-world medical datasets demonstrate that SMoG achieves performance comparable to state-of-the-art baselines, validating its effectiveness and efficiency in KG-augmented schema matching.
Related papers
- Graph-Anchored Knowledge Indexing for Retrieval-Augmented Generation [53.42323544075114]
We propose GraphAnchor, a novel Graph-Anchored Knowledge Indexing approach.<n> Experiments on four multi-hop question answering benchmarks demonstrate the effectiveness of GraphAnchor.
arXiv Detail & Related papers (2026-01-23T05:41:05Z) - GRIP: In-Parameter Graph Reasoning through Fine-Tuning Large Language Models [40.5886835740214]
Large Language Models (LLMs) have demonstrated remarkable capabilities in modeling sequential textual data.<n>We propose GRIP, a novel framework that equips LLMs with the ability to internalize complex relational information from graphs.<n>This knowledge is efficiently stored within lightweight LoRA parameters, enabling the fine-tuned LLM to perform a wide range of graph-related tasks.
arXiv Detail & Related papers (2025-11-06T21:56:58Z) - Search-on-Graph: Iterative Informed Navigation for Large Language Model Reasoning on Knowledge Graphs [26.0585592684229]
Large language models (LLMs) have demonstrated impressive reasoning abilities yet remain unreliable on knowledge-intensive, multi-hop questions.<n>We propose Search-on-Graph (SoG), a simple yet effective framework that enables LLMs to perform iterative informed graph navigation.<n>We demonstrate particularly strong gains on Wikidata benchmarks (+16% improvement over previous best methods) alongside consistent improvements on Freebase benchmarks.
arXiv Detail & Related papers (2025-10-09T21:20:16Z) - G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge [88.82814893945077]
Large language models (LLMs) excel at complex reasoning but remain limited by static and incomplete parametric knowledge.<n>Recent graph-enhanced RAG (GraphRAG) attempts to bridge this gap by constructing tailored graphs and enabling LLMs to reason on them.<n>G-reasoner is a unified framework that integrates graph and language foundation models for reasoning over diverse graph-structured knowledge.
arXiv Detail & Related papers (2025-09-29T04:38:12Z) - Enrich-on-Graph: Query-Graph Alignment for Complex Reasoning with LLM Enriching [61.824094419641575]
Large Language Models (LLMs) struggle with hallucinations and factual errors in knowledge-intensive scenarios like knowledge graph question answering (KGQA)<n>We attribute this to the semantic gap between structured knowledge graphs (KGs) and unstructured queries, caused by inherent differences in their focuses and structures.<n>Existing methods usually employ resource-intensive, non-scalable reasoning on vanilla KGs, but overlook this gap.<n>We propose a flexible framework, Enrich-on-Graph (EoG), which leverages LLMs' prior knowledge to enrich KGs, bridge the semantic gap between graphs and queries.
arXiv Detail & Related papers (2025-09-25T06:48:52Z) - GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models [59.72897499248909]
We propose a novel graph retriever trained end-to-end with Large Language Models (LLMs)<n>Within the extracted subgraph, structural knowledge and semantic features are encoded via soft tokens and the verbalized graph, respectively, which are infused into the LLM together.<n>Our approach consistently achieves state-of-the-art performance, validating the strength of joint graph-LLM optimization for complex reasoning tasks.
arXiv Detail & Related papers (2025-09-20T02:38:00Z) - Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering [75.12322966980003]
Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains.<n>Most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning.<n>Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering.<n>We propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA.
arXiv Detail & Related papers (2025-06-11T12:03:52Z) - Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning [62.640169289390535]
SPLIT-RAG is a multi-agent RAG framework that addresses the limitations with question-driven semantic graph partitioning and collaborative subgraph retrieval.<n>The innovative framework first create Semantic Partitioning of Linked Information, then use the Type-Specialized knowledge base to achieve Multi-Agent RAG.<n>The attribute-aware graph segmentation manages to divide knowledge graphs into semantically coherent subgraphs, ensuring subgraphs align with different query types.<n>A hierarchical merging module resolves inconsistencies across subgraph-derived answers through logical verifications.
arXiv Detail & Related papers (2025-05-20T06:44:34Z) - Knowledge Graph-based Retrieval-Augmented Generation for Schema Matching [3.7548609506798485]
We propose a Knowledge Graph-based Retrieval-Augmented Generation model for large language models (LLMs) matching.<n>In particular, KG-RAG4SM introduces novel vector-based, graph-based, and query-based graph retrievals.<n>We show that KG-RAG4SM outperforms the state-of-the-art (SOTA) methods by 35.89% and 30.50% in terms of precision and F1 score on the MIMIC dataset.
arXiv Detail & Related papers (2025-01-15T09:32:37Z) - Matchmaker: Self-Improving Large Language Model Programs for Schema Matching [60.23571456538149]
We propose a compositional language model program for schema matching, comprised of candidate generation, refinement and confidence scoring.
Matchmaker self-improves in a zero-shot manner without the need for labeled demonstrations.
Empirically, we demonstrate on real-world medical schema matching benchmarks that Matchmaker outperforms previous ML-based approaches.
arXiv Detail & Related papers (2024-10-31T16:34:03Z) - Simple Is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation [9.844598565914055]
Large Language Models (LLMs) demonstrate strong reasoning abilities but face limitations such as hallucinations and outdated knowledge.<n>We introduce SubgraphRAG, extending the Knowledge Graph (KG)-based Retrieval-Augmented Generation (RAG) framework that retrieves subgraphs.<n>Our approach innovatively integrates a lightweight multilayer perceptron with a parallel triple-scoring mechanism for efficient and flexible subgraph retrieval.
arXiv Detail & Related papers (2024-10-28T04:39:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.