PGDA-KGQA: A Prompt-Guided Generative Framework with Multiple Data Augmentation Strategies for Knowledge Graph Question Answering
- URL: http://arxiv.org/abs/2506.09414v1
- Date: Wed, 11 Jun 2025 05:56:03 GMT
- Title: PGDA-KGQA: A Prompt-Guided Generative Framework with Multiple Data Augmentation Strategies for Knowledge Graph Question Answering
- Authors: Xiujun Zhou, Pingjian Zhang, Deyou Tang,
- Abstract summary: Knowledge Graph Question Answering (KGQA) is a crucial task in natural language processing.<n>We propose PGDA-KGQA, a prompt-guided generative framework with multiple data augmentation strategies for KGQA.<n> Experiments demonstrate that PGDA-KGQA outperforms state-of-the-art methods on standard KGQA.
- Score: 3.8623708225544755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge Graph Question Answering (KGQA) is a crucial task in natural language processing that requires reasoning over knowledge graphs (KGs) to answer natural language questions. Recent methods utilizing large language models (LLMs) have shown remarkable semantic parsing capabilities but are limited by the scarcity of diverse annotated data and multi-hop reasoning samples. Traditional data augmentation approaches are focus mainly on single-hop questions and prone to semantic distortion, while LLM-based methods primarily address semantic distortion but usually neglect multi-hop reasoning, thus limiting data diversity. The scarcity of multi-hop samples further weakens models' generalization. To address these issues, we propose PGDA-KGQA, a prompt-guided generative framework with multiple data augmentation strategies for KGQA. At its core, PGDA-KGQA employs a unified prompt-design paradigm: by crafting meticulously engineered prompts that integrate the provided textual content, it leverages LLMs to generate large-scale (question, logical form) pairs for model training. Specifically, PGDA-KGQA enriches its training set by: (1) generating single-hop pseudo questions to improve the alignment of question semantics with KG relations; (2) applying semantic-preserving question rewriting to improve robustness against linguistic variations; (3) employing answer-guided reverse path exploration to create realistic multi-hop questions. By adopting an augment-generate-retrieve semantic parsing pipeline, PGDA-KGQA utilizes the augmented data to enhance the accuracy of logical form generation and thus improve answer retrieval performance. Experiments demonstrate that outperforms state-of-the-art methods on standard KGQA datasets, achieving improvements on WebQSP by 2.8%, 1.2%, and 3.1% and on ComplexWebQuestions by 1.8%, 1.1%, and 2.4% in F1, Hits@1, and Accuracy, respectively.
Related papers
- iQUEST: An Iterative Question-Guided Framework for Knowledge Base Question Answering [6.4524748618007415]
iQUEST is a question-guided KBQA framework that iteratively decomposes complex queries into simpler sub-questions.<n>We integrate a Graph Neural Network (GNN) to look ahead and incorporate 2-hop neighbor information at each reasoning step.
arXiv Detail & Related papers (2025-06-02T15:30:02Z) - Knowledge Graph-extended Retrieval Augmented Generation for Question Answering [10.49712834719005]
This paper proposes a system that integrates Large Language Models (LLMs) and Knowledge Graphs (KGs) without requiring training.<n>The resulting approach can be classified as a specific form of a Retrieval Augmented Generation (RAG) with a KG.<n>It includes a question decomposition module to enhance multi-hop information retrieval and answerability.
arXiv Detail & Related papers (2025-04-11T18:03:02Z) - Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models [51.47994645529258]
We propose Question-Aware Knowledge Graph Prompting (QAP), which incorporates question embeddings into GNN aggregation to dynamically assess KG relevance.<n> Experimental results demonstrate that QAP outperforms state-of-the-art methods across multiple datasets, highlighting its effectiveness.
arXiv Detail & Related papers (2025-03-30T17:09:11Z) - Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation [81.18701211912779]
We introduce an Adaptive Multi-Aspect Retrieval-augmented over KGs (Amar) framework.<n>This method retrieves knowledge including entities, relations, and subgraphs, and converts each piece of retrieved text into prompt embeddings.<n>Our method has achieved state-of-the-art performance on two common datasets.
arXiv Detail & Related papers (2024-12-24T16:38:04Z) - SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation [6.568733377722896]
We propose a novel Similar Graph Enhanced Retrieval-Augmented Generation (SimGRAG) method.<n>It effectively addresses the challenge of aligning query texts and knowledge graphs.<n>SimGRAG outperforms state-of-the-art KG-driven RAG methods in question answering and fact verification.
arXiv Detail & Related papers (2024-12-17T15:40:08Z) - Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent [92.5712549836791]
Multimodal Retrieval Augmented Generation (mRAG) plays an important role in mitigating the "hallucination" issue inherent in multimodal large language models (MLLMs)<n>We propose the first self-adaptive planning agent for multimodal retrieval, OmniSearch.
arXiv Detail & Related papers (2024-11-05T09:27:21Z) - Retrieval-Augmented Language Model for Extreme Multi-Label Knowledge Graph Link Prediction [2.6749568255705656]
Extrapolation in large language models (LLMs) for open-ended inquiry encounters two pivotal issues.
Existing works attempt to tackle the problem by augmenting the input of a smaller language model with information from a knowledge graph.
We propose a new task, the extreme multi-label KG link prediction task, to enable a model to perform extrapolation with multiple responses.
arXiv Detail & Related papers (2024-05-21T10:10:56Z) - ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained
Language Models for Question Answering over Knowledge Graph [142.42275983201978]
We propose a subgraph-aware self-attention mechanism to imitate the GNN for performing structured reasoning.
We also adopt an adaptation tuning strategy to adapt the model parameters with 20,000 subgraphs with synthesized questions.
Experiments show that ReasoningLM surpasses state-of-the-art models by a large margin, even with fewer updated parameters and less training data.
arXiv Detail & Related papers (2023-12-30T07:18:54Z) - UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question
Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph(KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z) - Multi-hop Question Generation with Graph Convolutional Network [58.31752179830959]
Multi-hop Question Generation (QG) aims to generate answer-related questions by aggregating and reasoning over multiple scattered evidence from different paragraphs.
We propose Multi-Hop volution Fusion Network for Question Generation (MulQG), which does context encoding in multiple hops.
Our proposed model is able to generate fluent questions with high completeness and outperforms the strongest baseline by 20.8% in the multi-hop evaluation.
arXiv Detail & Related papers (2020-10-19T06:15:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.