$R^3$-NL2GQL: A Model Coordination and Knowledge Graph Alignment Approach for NL2GQL
- URL: http://arxiv.org/abs/2311.01862v2
- Date: Mon, 1 Jul 2024 14:59:58 GMT
- Title: $R^3$-NL2GQL: A Model Coordination and Knowledge Graph Alignment Approach for NL2GQL
- Authors: Yuhang Zhou, Yu He, Siyu Tian, Yuchen Ni, Zhangyue Yin, Xiang Liu, Chuanjun Ji, Sen Liu, Xipeng Qiu, Guangnan Ye, Hongfeng Chai
- Abstract summary: We introduce a novel approach, $R^3$-NL2GQL, integrating both small and large Foundation Models for ranking, rewriting, and refining tasks.
We have developed a bilingual dataset, sourced from graph database manuals and selected open-source Knowledge Graphs (KGs).
- Score: 45.13624736815995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While current tasks of converting natural language to SQL (NL2SQL) using Foundation Models have shown impressive achievements, adapting these approaches for converting natural language to Graph Query Language (NL2GQL) encounters hurdles due to the distinct nature of GQL compared to SQL, alongside the diverse forms of GQL. Moving away from traditional rule-based and slot-filling methodologies, we introduce a novel approach, $R^3$-NL2GQL, integrating both small and large Foundation Models for ranking, rewriting, and refining tasks. This method leverages the interpretative strengths of smaller models for initial ranking and rewriting stages, while capitalizing on the superior generalization and query generation prowess of larger models for the final transformation of natural language queries into GQL formats. Addressing the scarcity of datasets in this emerging field, we have developed a bilingual dataset, sourced from graph database manuals and selected open-source Knowledge Graphs (KGs). Our evaluation of this methodology on this dataset demonstrates its promising efficacy and robustness.
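The rank / rewrite / refine coordination described in the abstract can be sketched as a simple control flow: a small model ranks schema elements and grounds the question, and a large model generates and repairs the GQL. The sketch below is purely illustrative; every name (the callables, the toy top-3 cutoff, the single repair pass) is an assumption for exposition, not the paper's actual implementation.

```python
from typing import Callable, List

def r3_nl2gql(question: str,
              schema_elements: List[str],
              rank: Callable[[str, str], float],         # small model: relevance of a schema element
              rewrite: Callable[[str, List[str]], str],  # small model: ground question in schema
              generate: Callable[[str, List[str]], str], # large model: produce a GQL query
              refine: Callable[[str, str], str],         # large model: repair GQL given an error
              execute: Callable[[str], str]) -> str:     # returns "" on success, else an error string
    # 1. Ranking: keep the schema elements most relevant to the question.
    top = sorted(schema_elements, key=lambda e: -rank(question, e))[:3]
    # 2. Rewriting: normalize the question against the selected elements.
    grounded = rewrite(question, top)
    # 3. Refining: generate GQL, then repair it once using execution feedback.
    gql = generate(grounded, top)
    error = execute(gql)
    return refine(gql, error) if error else gql
```

In practice the `rank` and `rewrite` callables would wrap a smaller fine-tuned model and `generate`/`refine` a larger LLM plus a graph database connection; they are plain functions here only so the coordination pattern stays visible.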
Related papers
- Multi-turn Natural Language to Graph Query Language Translation [15.249580032219336]
In practical applications, user interactions with graph databases are typically multi-turn, dynamic, and context-dependent. Research focused on single-turn conversion fails to effectively address multi-turn dialogues and complex context dependencies. We propose an automated method for constructing multi-turn NL2GQL datasets based on Large Language Models (LLMs).
arXiv Detail & Related papers (2025-08-03T17:56:52Z) - Text-to-SPARQL Goes Beyond English: Multilingual Question Answering Over Knowledge Graphs through Human-Inspired Reasoning [51.203811759364925]
mKGQAgent breaks down the task of converting natural language questions into SPARQL queries into modular, interpretable subtasks. Evaluated on the DBpedia- and Corporate-based KGQA benchmarks within the Text2SPARQL challenge 2025, our approach took first place among the participants.
arXiv Detail & Related papers (2025-07-22T19:23:03Z) - GRASP: Generic Reasoning And SPARQL Generation across Knowledge Graphs [4.005483185111992]
We propose a new approach for generating SPARQL queries on RDF knowledge graphs from natural language questions or keyword queries. Our approach does not require fine-tuning; instead, it uses the language model to explore the knowledge graph by strategically executing SPARQL queries and searching for relevant IRIs and literals.
arXiv Detail & Related papers (2025-07-10T18:50:05Z) - NAT-NL2GQL: A Novel Multi-Agent Framework for Translating Natural Language to Graph Query Language [13.661054027428868]
We propose NAT-NL2GQL, a novel framework for translating natural language to graph query language.
Our framework consists of three synergistic agents: the Preprocessor agent, the Generator agent, and the Refiner agent.
Given the scarcity of high-quality open-source NL2GQL datasets based on nGQL syntax, we developed StockGQL, a dataset constructed from a financial market graph database.
arXiv Detail & Related papers (2024-12-11T04:14:09Z) - Towards Evaluating Large Language Models for Graph Query Generation [49.49881799107061]
Large Language Models (LLMs) are revolutionizing the landscape of Generative Artificial Intelligence (GenAI).
This paper presents a comparative study addressing the challenge of generating queries for interacting with graph databases using open-access LLMs.
Our empirical analysis of query generation accuracy reveals that Claude Sonnet 3.5 outperforms its counterparts in this specific domain.
arXiv Detail & Related papers (2024-11-13T09:11:56Z) - Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models.
Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models.
Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z) - UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z) - NL2KQL: From Natural Language to Kusto Query [1.7931930942711818]
NL2KQL is an innovative framework that uses large language models (LLMs) to convert natural language queries (NLQs) to Kusto Query Language (KQL) queries.
To validate NL2KQL's performance, we utilize an array of online (based on query execution) and offline (based on query parsing) metrics.
arXiv Detail & Related papers (2024-04-03T01:09:41Z) - Aligning Large Language Models to a Domain-specific Graph Database for NL2GQL [16.637504932927616]
We present a well-defined pipeline for NL2GQL tasks tailored to a particular domain.
We employ ChatGPT to generate NL-GQL data pairs, leveraging the provided graph DB with self-instruction.
We then employ the generated data to fine-tune LLMs, ensuring alignment between LLMs and the graph DB.
arXiv Detail & Related papers (2024-02-26T13:46:51Z) - Generative Language Models for Paragraph-Level Question Generation [79.31199020420827]
Powerful generative models have led to recent progress in question generation (QG).
It is difficult to measure advances in QG research since there are no standardized resources that allow a uniform comparison among approaches.
We introduce QG-Bench, a benchmark for QG that unifies existing question answering datasets by converting them to a standard QG setting.
arXiv Detail & Related papers (2022-10-08T10:24:39Z) - An Inference Approach To Question Answering Over Knowledge Graphs [7.989723691844202]
We convert the problem of natural language querying over knowledge graphs to an inference problem over premise-hypothesis pairs.
Our method achieves over 90% accuracy on MetaQA dataset, beating the existing state-of-the-art.
Our approach does not require large domain-specific training data for querying on new knowledge graphs from different domains.
arXiv Detail & Related papers (2021-12-21T10:07:55Z) - Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training [86.91380874390778]
We present Generation-Augmented Pre-training (GAP), that jointly learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-train data.
Based on experimental results, neural semantic parsers that leverage GAP obtain new state-of-the-art results on both the SPIDER and CRITERIA-TO-SQL benchmarks.
arXiv Detail & Related papers (2020-12-18T15:53:50Z) - ColloQL: Robust Cross-Domain Text-to-SQL Over Search Queries [10.273545005890496]
We introduce data augmentation techniques and a sampling-based content-aware BERT model (ColloQL).
ColloQL achieves 84.9% (logical) and 90.7% (execution) accuracy on the WikiSQL dataset.
arXiv Detail & Related papers (2020-10-19T23:53:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.