FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base
Question Answering
- URL: http://arxiv.org/abs/2308.12060v3
- Date: Fri, 26 Jan 2024 12:49:04 GMT
- Title: FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base
Question Answering
- Authors: Zhenyu Li, Sunqi Fan, Yu Gu, Xiuxing Li, Zhichao Duan, Bowen Dong,
Ning Liu, Jianyong Wang
- Abstract summary: We introduce FlexKBQA to mitigate the burden associated with manual annotation.
We leverage Large Language Models (LLMs) as program translators for addressing the challenges inherent in the few-shot KBQA task.
Specifically, FlexKBQA leverages automated algorithms to sample diverse programs, such as SPARQL queries, from the knowledge base.
We observe that under the few-shot and even the more challenging zero-shot scenarios, FlexKBQA achieves impressive results with a few annotations.
- Score: 16.88132219032486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge base question answering (KBQA) is a critical yet challenging task
due to the vast number of entities within knowledge bases and the diversity of
natural language questions posed by users. Unfortunately, the performance of
most KBQA models tends to decline significantly in real-world scenarios where
high-quality annotated data is insufficient. To mitigate the burden associated
with manual annotation, we introduce FlexKBQA by utilizing Large Language
Models (LLMs) as program translators for addressing the challenges inherent in
the few-shot KBQA task. Specifically, FlexKBQA leverages automated algorithms
to sample diverse programs, such as SPARQL queries, from the knowledge base,
which are subsequently converted into natural language questions via LLMs. This
synthetic dataset facilitates training a specialized lightweight model for the
KB. Additionally, to reduce the barriers of distribution shift between
synthetic data and real user questions, FlexKBQA introduces an execution-guided
self-training method to iteratively leverage unlabeled user questions.
Furthermore, we explore harnessing the inherent reasoning capability of LLMs to
enhance the entire framework. Consequently, FlexKBQA delivers substantial
flexibility, encompassing data annotation, deployment, and being domain
agnostic. Through extensive experiments on GrailQA, WebQSP, and KQA Pro, we
observe that under the few-shot and even the more challenging zero-shot scenarios,
FlexKBQA achieves impressive results with a few annotations, surpassing all
previous baselines and even approaching the performance of supervised models,
achieving a remarkable 93% performance relative to the fully-supervised models.
We posit that FlexKBQA represents a significant advancement towards exploring
better integration of large and lightweight models. The code is open-sourced.
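The execution-guided filtering idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy triple-set knowledge base, the `(subject, relation)` program format standing in for SPARQL, and the names `execute`, `execution_guided_filter`, and `guesses` are all assumptions made for illustration. The sketch reads "execution-guided" in one plausible way: a predicted program is kept as a pseudo-label for an unlabeled question only if it runs against the KB and returns a non-empty answer set.

```python
from typing import Callable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, relation, object)
Program = Tuple[str, str]       # hypothetical stand-in for a SPARQL query

# Toy knowledge base (assumption: a real system would query a SPARQL endpoint).
KB: Set[Triple] = {
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
}


def execute(program: Program, kb: Set[Triple]) -> Set[str]:
    """Run a (subject, relation) lookup and return the matching objects."""
    subject, relation = program
    return {o for (s, r, o) in kb if s == subject and r == relation}


def execution_guided_filter(
    questions: List[str],
    predict: Callable[[str], Optional[Program]],
    kb: Set[Triple],
) -> List[Tuple[str, Program]]:
    """Keep (question, program) pairs whose program yields a non-empty answer."""
    kept = []
    for question in questions:
        program = predict(question)
        if program is None:
            continue  # the model produced no parseable program
        if execute(program, kb):
            kept.append((question, program))
    return kept


# Hypothetical model predictions for three unlabeled user questions.
guesses = {
    "What is Paris the capital of?": ("Paris", "capital_of"),
    "Who founded Berlin?": ("Berlin", "founded_by"),  # executes to an empty set
    "asdf": None,                                      # no program predicted
}

kept = execution_guided_filter(list(guesses), guesses.get, KB)
print(kept)
```

In a self-training loop, the surviving pairs would be added to the training set and the lightweight model retrained, repeating for several rounds. The abstract does not spell out the filtering criteria, so the non-empty-answer check here is only one possible choice.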
Related papers
- IDEAL: Leveraging Infinite and Dynamic Characterizations of Large Language Models for Query-focused Summarization [59.06663981902496]
Query-focused summarization (QFS) aims to produce summaries that answer particular questions of interest, enabling greater user control and personalization.
We investigate two indispensable characteristics that LLM-based QFS models should harness: Lengthy Document Summarization and Efficiently Fine-grained Query-LLM Alignment.
These innovations pave the way for broader application and accessibility in the field of QFS technology.
arXiv Detail & Related papers (2024-07-15T07:14:56Z) - Robust Few-shot Transfer Learning for Knowledge Base Question Answering with Unanswerable Questions [22.411601767105807]
We present FUn-FuSIC that extends the state-of-the-art (SoTA) few-shot transfer model for answerable-only KBQA to handle unanswerability.
Experiments over newly constructed datasets show that FUn-FuSIC outperforms suitable adaptations of the SoTA model for KBQA with unanswerability.
arXiv Detail & Related papers (2024-06-20T13:43:38Z) - Crafting Interpretable Embeddings by Asking LLMs Questions [89.49960984640363]
Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks.
We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM.
We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli.
arXiv Detail & Related papers (2024-05-26T22:30:29Z) - Optimizing Language Model's Reasoning Abilities with Weak Supervision [48.60598455782159]
We present PuzzleBen, a weakly supervised benchmark that comprises 25,147 complex questions, answers, and human-generated rationales.
A unique aspect of our dataset is the inclusion of 10,000 unannotated questions, enabling us to explore utilizing less supervised data to boost LLMs' inference capabilities.
arXiv Detail & Related papers (2024-05-07T07:39:15Z) - Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [59.57065228857247]
Retrieval-augmented Large Language Models (LLMs) have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA).
We propose a novel adaptive QA framework that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs based on the query complexity.
We validate our model on a set of open-domain QA datasets, covering multiple query complexities, and show that ours enhances the overall efficiency and accuracy of QA systems.
arXiv Detail & Related papers (2024-03-21T13:52:30Z) - Automatic Question-Answer Generation for Long-Tail Knowledge [65.11554185687258]
We propose an automatic approach to generate specialized QA datasets for tail entities.
We conduct extensive experiments by employing pretrained LLMs on our newly generated long-tail QA datasets.
arXiv Detail & Related papers (2024-03-03T03:06:31Z) - Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models [7.399563588835834]
Interactive-KBQA is a framework designed to generate logical forms through direct interaction with knowledge bases (KBs).
Our method achieves competitive results on the WebQuestionsSP, ComplexWebQuestions, KQA Pro, and MetaQA datasets.
arXiv Detail & Related papers (2024-02-23T06:32:18Z) - Few-shot Transfer Learning for Knowledge Base Question Answering: Fusing Supervised Models with In-Context Learning [20.80841972133938]
Existing Knowledge Base Question Answering (KBQA) architectures are hungry for annotated data.
We introduce the problem of few-shot transfer learning for KBQA, where the target domain offers only a few labeled examples.
We propose a novel KBQA architecture called FuSIC-KBQA that performs KB-retrieval using multiple source-trained retrievers.
arXiv Detail & Related papers (2023-11-15T11:56:56Z) - ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models [19.85526116658481]
We introduce ChatKBQA, a novel and simple generate-then-retrieve KBQA framework.
Experimental results show that ChatKBQA achieves new state-of-the-art performance on standard KBQA datasets.
This work can also be regarded as a new paradigm for combining LLMs with knowledge graphs for interpretable and knowledge-required question answering.
arXiv Detail & Related papers (2023-10-13T09:45:14Z) - Make a Choice! Knowledge Base Question Answering with In-Context
Learning [1.7827767384590838]
Question answering over knowledge bases (KBQA) aims to answer factoid questions with a given knowledge base (KB).
Due to the large scale of the KB, it is impossible for annotated data to cover all fact schemas in the KB.
We present McL-KBQA, a framework that incorporates the few-shot ability of LLM into the KBQA method via ICL-based multiple choice.
arXiv Detail & Related papers (2023-05-23T11:56:03Z) - Template-Based Question Generation from Retrieved Sentences for Improved
Unsupervised Question Answering [98.48363619128108]
We propose an unsupervised approach to training QA models with generated pseudo-training data.
We show that generating questions for QA training by applying a simple template on a related, retrieved sentence rather than the original context sentence improves downstream QA performance.
arXiv Detail & Related papers (2020-04-24T17:57:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.