SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation
- URL: http://arxiv.org/abs/2404.01923v1
- Date: Tue, 2 Apr 2024 13:17:36 GMT
- Title: SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation
- Authors: Shasha Guo, Lizi Liao, Jing Zhang, Yanling Wang, Cuiping Li, Hong Chen
- Abstract summary: Knowledge base question generation (KBQG) aims to generate natural language questions from a set of triplet facts extracted from a knowledge base (KB).
With the advance of pre-training techniques, large language models (LLMs) such as GPT-3.5 undoubtedly possess far more semantic knowledge than earlier pre-trained language models.
We propose SGSH, a simple and effective framework that stimulates GPT-3.5 with Skeleton Heuristics to enhance KBQG.
- Score: 23.426821153086358
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge base question generation (KBQG) aims to generate natural language questions from a set of triplet facts extracted from a knowledge base (KB). Existing methods have significantly boosted KBQG performance via pre-trained language models (PLMs), thanks to their richly endowed semantic knowledge. With the advance of pre-training techniques, large language models (LLMs) such as GPT-3.5 undoubtedly possess even more semantic knowledge. Therefore, how to effectively organize and exploit this abundant knowledge for KBQG becomes the focus of our study. In this work, we propose SGSH, a simple and effective framework to Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG. The framework incorporates "skeleton heuristics", which provide fine-grained guidance associated with each input, encompassing essential elements such as the question phrase and the auxiliary verb, to stimulate LLMs to generate optimal questions. More specifically, we devise an automatic data construction strategy that leverages ChatGPT to build a skeleton training dataset, on which we employ a soft prompting approach to train a BART model dedicated to generating the skeleton associated with each input. Subsequently, the skeleton heuristics are encoded into the prompt to incentivize GPT-3.5 to generate the desired questions. Extensive experiments demonstrate that SGSH achieves new state-of-the-art performance on KBQG tasks.
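To make the two-stage pipeline concrete, here is a minimal Python sketch of the SGSH flow described above: a skeleton generator (a stub standing in for the soft-prompt-tuned BART model) followed by prompt assembly that encodes the skeleton heuristic for GPT-3.5. All function names, the demonstration format, and the prompt wording are illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch of the two-stage SGSH pipeline (hypothetical names; the
# trained skeleton model and the GPT-3.5 call are stubbed for illustration).

def generate_skeleton(triples: list[tuple[str, str, str]]) -> str:
    """Stand-in for the soft-prompt-tuned BART skeleton generator.
    A skeleton fixes essentials like the question phrase and auxiliary verb."""
    # A real system would run the fine-tuned BART model here.
    return "what is"

def build_prompt(triples, skeleton, demos):
    """Encode the skeleton heuristic into the prompt sent to GPT-3.5."""
    facts = "; ".join(f"({s}, {p}, {o})" for s, p, o in triples)
    demo_text = "\n".join(
        f"Facts: {d['facts']}\nSkeleton: {d['skeleton']}\nQuestion: {d['question']}"
        for d in demos
    )
    return f"{demo_text}\nFacts: {facts}\nSkeleton: {skeleton}\nQuestion:"

triples = [("Barack Obama", "birthplace", "Honolulu")]
demos = [{"facts": "(Paris, capital_of, France)",
          "skeleton": "what is",
          "question": "What is the capital of France?"}]
prompt = build_prompt(triples, generate_skeleton(triples), demos)
print(prompt)  # would be sent to GPT-3.5 via the chat completions API
```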
Related papers
- Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains [66.55612528039894]
Knowledge Graphs (KGs) can serve as reliable knowledge sources for question answering (QA).
We present DoG (Decoding on Graphs), a novel framework that facilitates a deep synergy between LLMs and KGs.
Experiments across various KGQA tasks with different background KGs demonstrate that DoG achieves superior and robust performance.
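As a toy illustration of decoding constrained by the graph, the sketch below enumerates only chains whose every hop is an actual KG edge, so each chain is well-formed by construction; a real system would score candidate hops with an LLM rather than enumerate exhaustively. The graph and helper names are hypothetical.

```python
# Toy graph-constrained chain generation in the spirit of DoG: candidate
# continuations are restricted to edges that exist in the KG, so every
# generated chain is faithful to the graph.

kg = {  # subject -> list of (relation, object)
    "Honolulu": [("located_in", "Hawaii")],
    "Hawaii": [("part_of", "United States")],
}

def enumerate_chains(entity, depth):
    """Enumerate well-formed chains up to the given depth from an entity."""
    if depth == 0 or entity not in kg:
        return [[]]
    chains = []
    for rel, obj in kg[entity]:
        for tail in enumerate_chains(obj, depth - 1):
            chains.append([(entity, rel, obj)] + tail)
    return chains

for chain in enumerate_chains("Honolulu", 2):
    print(" -> ".join(f"{s} -{r}-> {o}" for s, r, o in chain))
```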
arXiv Detail & Related papers (2024-10-24T04:01:40Z)
- A Knowledge-Injected Curriculum Pretraining Framework for Question Answering [70.13026036388794]
We propose a general Knowledge-Injected Curriculum Pretraining framework (KICP) to achieve comprehensive KG learning and exploitation for knowledge-based question answering tasks.
The KI module first injects knowledge into the LM by generating a KG-centered pretraining corpus, and generalizes the process into three key steps.
The KA module learns knowledge from the generated corpus with an LM equipped with an adapter, while keeping the LM's original natural language understanding ability.
The CR module follows human reasoning patterns to construct three corpora with increasing difficulties of reasoning, and further trains the LM from easy to hard in a curriculum manner.
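A rough sketch of the corpus-construction idea, assuming simple verbalization templates and two difficulty levels; the paper's KI and CR modules are far richer, and everything here is illustrative.

```python
# Building a KG-centered pretraining corpus with a curriculum of increasing
# reasoning difficulty (templates and difficulty levels are assumptions).

triples = [("aspirin", "treats", "headache"),
           ("headache", "symptom_of", "migraine")]

def verbalize(s, p, o):
    return f"{s} {p.replace('_', ' ')} {o}."

# Level 1: single-fact sentences (easy).
level1 = [verbalize(*t) for t in triples]

# Level 2: two-hop compositions (harder) -- join triples sharing an entity.
level2 = [verbalize(s1, p1, o1) + " " + verbalize(s2, p2, o2)
          for (s1, p1, o1) in triples
          for (s2, p2, o2) in triples
          if o1 == s2]

curriculum = [level1, level2]  # train the LM stage by stage, easy to hard
for stage, corpus in enumerate(curriculum, 1):
    print(f"stage {stage}: {corpus}")
```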
arXiv Detail & Related papers (2024-03-11T03:42:03Z)
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
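A minimal sketch of the contextualization step, assuming a simple instruction template; the actual prompt wording in the paper may differ.

```python
# Prompt a large model to expand a compact triplet into a descriptive
# passage that a smaller KGC model is then trained on (wording assumed).

def contextualization_prompt(head, relation, tail):
    return (
        f"Given the knowledge graph triplet ({head}, {relation}, {tail}), "
        f"write a short, factual paragraph that describes this relationship "
        f"in natural language."
    )

prompt = contextualization_prompt("Marie Curie", "award_received",
                                  "Nobel Prize in Physics")
print(prompt)  # send to an instruction-following LLM; distill its output
```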
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
- ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering over Knowledge Graph [142.42275983201978]
We propose a subgraph-aware self-attention mechanism to imitate the GNN for performing structured reasoning.
We also adopt an adaptation tuning strategy to adapt the model parameters using 20,000 subgraphs paired with synthesized questions.
Experiments show that ReasoningLM surpasses state-of-the-art models by a large margin, even with fewer updated parameters and less training data.
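The subgraph-aware attention idea can be approximated with a boolean mask over token positions, as in this toy sketch; the node-to-position mapping and the globally attending question token are assumptions.

```python
# A toy subgraph-aware attention mask: tokens of connected KG nodes may
# attend to each other, imitating GNN message passing in self-attention.

import numpy as np

nodes = ["Q", "e1", "e2", "e3"]          # question token + entity nodes
edges = {(1, 2), (2, 3)}                  # subgraph adjacency (by index)

n = len(nodes)
mask = np.zeros((n, n), dtype=bool)
mask[0, :] = mask[:, 0] = True            # the question attends globally
np.fill_diagonal(mask, True)              # self-attention always allowed
for i, j in edges:                        # neighbors attend to each other
    mask[i, j] = mask[j, i] = True

print(mask.astype(int))
```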
arXiv Detail & Related papers (2023-12-30T07:18:54Z)
- ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models [19.85526116658481]
We introduce ChatKBQA, a novel and simple generate-then-retrieve KBQA framework.
Experimental results show that ChatKBQA achieves new state-of-the-art performance on standard KBQA datasets.
This work can also be regarded as a new paradigm for combining LLMs with knowledge graphs for interpretable and knowledge-required question answering.
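A simplified generate-then-retrieve flow might look like the sketch below, where a drafted logical form is grounded against the KB vocabulary; the naive string matcher stands in for the paper's retrieval component, and the draft is assumed rather than produced by a fine-tuned LLM.

```python
# Generate-then-retrieve in miniature: draft a logical form first, then
# ground its surface names against the KB vocabulary.

import difflib

kb_relations = ["people.person.place_of_birth", "location.location.contains"]

draft_logical_form = "(JOIN place_of_birth Barack_Obama)"  # LLM draft (assumed)

def ground(surface, vocabulary):
    """Retrieve the closest KB item for a drafted surface form."""
    return max(vocabulary,
               key=lambda v: difflib.SequenceMatcher(None, surface, v).ratio())

grounded = draft_logical_form.replace(
    "place_of_birth", ground("place_of_birth", kb_relations))
print(grounded)
```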
arXiv Detail & Related papers (2023-10-13T09:45:14Z)
- Prompting Large Language Models with Chain-of-Thought for Few-Shot Knowledge Base Question Generation [19.327008532572645]
Question Generation over Knowledge Bases (KBQG) aims to convert a logical form into a natural language question.
We propose Chain-of-Thought prompting for KBQG, an in-context learning strategy for reasoning.
We conduct extensive experiments over three public KBQG datasets.
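A hedged sketch of what a chain-of-thought KBQG prompt could look like, with an invented exemplar; the paper's actual prompts and logical forms differ.

```python
# Each in-context example pairs a logical form with intermediate reasoning
# before the final question (exemplar text is illustrative).

exemplar = (
    "Logical form: (ARGMAX (population city))\n"
    "Reasoning: the form selects the city with the largest population, "
    "so the question should ask for that city.\n"
    "Question: Which city has the largest population?\n"
)

target = "Logical form: (COUNT (rivers country=Brazil))\nReasoning:"
prompt = exemplar + "\n" + target
print(prompt)  # the LLM completes the reasoning, then emits the question
```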
arXiv Detail & Related papers (2023-10-12T15:08:14Z)
- Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution [48.86322922826514]
This paper defines a new task of Knowledge-aware Language Model Attribution (KaLMA).
First, we extend the attribution source from unstructured texts to the Knowledge Graph (KG), whose rich structures benefit both attribution performance and working scenarios.
Second, we propose a new "Conscious Incompetence" setting considering the incomplete knowledge repository.
Third, we propose a comprehensive automatic evaluation metric encompassing text quality, citation quality, and text-citation alignment.
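Purely as an illustration, a composite score over the three dimensions might be aggregated as below; the weights and component scorers are assumptions, not the paper's metric.

```python
# An illustrative composite quality score over KaLMA's three dimensions.

def composite_score(text_q: float, citation_q: float, alignment: float,
                    weights=(1/3, 1/3, 1/3)) -> float:
    """Weighted aggregate of the three component scores, each in [0, 1]."""
    w1, w2, w3 = weights
    return w1 * text_q + w2 * citation_q + w3 * alignment

print(composite_score(0.85, 0.70, 0.90))  # -> 0.8166...
```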
arXiv Detail & Related papers (2023-10-09T11:45:59Z)
- Bridging the KB-Text Gap: Leveraging Structured Knowledge-aware Pre-training for KBQA [28.642711264323786]
We propose a Structured Knowledge-aware Pre-training method (SKP) to bridge the gap between texts and structured KBs.
In the pre-training stage, we introduce two novel structured knowledge-aware tasks, guiding the model to effectively learn the implicit relationship and better representations of complex subgraphs.
In the downstream KBQA task, we further design an efficient linearization strategy and an interval attention mechanism, which assist the model to better encode complex subgraphs.
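A minimal sketch of one plausible subgraph linearization, with invented separator tokens; the paper's exact strategy may differ.

```python
# Flatten a KB subgraph into a token sequence the encoder can consume
# (separator conventions are illustrative assumptions).

subgraph = [("Inception", "directed_by", "Christopher Nolan"),
            ("Inception", "release_year", "2010")]

def linearize(triples):
    """Serialize KB triples into a single input string."""
    return " [SEP] ".join(f"{h} [REL] {r} [TAIL] {t}" for h, r, t in triples)

print(linearize(subgraph))
```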
arXiv Detail & Related papers (2023-08-28T09:22:02Z)
- UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models [170.88745906220174]
We propose the UnifiedSKG framework, which unifies 21 SKG tasks into a text-to-text format.
We show that UnifiedSKG achieves state-of-the-art performance on almost all of the 21 tasks.
We also use UnifiedSKG to conduct a series of experiments on structured knowledge encoding variants across SKG tasks.
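A sketch of casting a single table-QA example into text-to-text form, with illustrative field names and separators:

```python
# Cast one structured-knowledge-grounding example into text-to-text:
# concatenate the request with a linearized structured context, and target
# plain text (field names and separators are assumptions).

def to_text_to_text(question: str, table: list[list[str]]) -> dict:
    header, *rows = table
    flat = " | ".join(header) + " || " + " || ".join(" | ".join(r) for r in rows)
    return {"input": f"question: {question} context: {flat}",
            "target": ""}  # filled with the gold answer during training

example = to_text_to_text(
    "Which country hosted the 2016 Olympics?",
    [["year", "host"], ["2016", "Brazil"], ["2020", "Japan"]],
)
print(example["input"])
```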
arXiv Detail & Related papers (2022-01-16T04:36:18Z)
- Calculating Question Similarity is Enough: A New Method for KBQA Tasks [8.056701645706404]
This paper proposes a Corpus Generation - Retrieve Method (CGRM) that combines a Pre-trained Language Model (PLM) with a Knowledge Graph (KG).
First, based on the mT5 model, we design two new pre-training tasks: knowledge masked language modeling and paragraph-based question generation.
Second, after preprocessing the knowledge graph's triples with a series of rules, the kT5 model generates natural language QA pairs from the processed triples.
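A toy stand-in for the QA-pair generation step, using a template where the paper uses the pretrained kT5 model:

```python
# Turn a preprocessed triple into a natural-language QA pair via a
# template (the real system generates these pairs with kT5).

def qa_from_triple(head, relation, tail):
    question = f"What is the {relation.replace('_', ' ')} of {head}?"
    return question, tail

q, a = qa_from_triple("Mount Everest", "height", "8849 m")
print(q, "->", a)  # such QA pairs back the question-similarity matching
```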
arXiv Detail & Related papers (2021-11-15T10:31:46Z)