Beyond I.I.D.: Three Levels of Generalization for Question Answering on
Knowledge Bases
- URL: http://arxiv.org/abs/2011.07743v6
- Date: Mon, 22 Feb 2021 19:04:45 GMT
- Title: Beyond I.I.D.: Three Levels of Generalization for Question Answering on
Knowledge Bases
- Authors: Yu Gu, Sue Kase, Michelle Vanni, Brian Sadler, Percy Liang, Xifeng
Yan, Yu Su
- Abstract summary: We release a new large-scale, high-quality dataset with 64,331 questions, GrailQA.
We propose a novel BERT-based KBQA model.
The combination of our dataset and model enables us to thoroughly examine and demonstrate, for the first time, the key role of pre-trained contextual embeddings like BERT in the generalization of KBQA.
- Score: 63.43418760818188
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing studies on question answering on knowledge bases (KBQA) mainly
operate under the standard i.i.d. assumption, i.e., the training distribution over
questions is the same as the test distribution. However, i.i.d. may be neither
reasonably achievable nor desirable on large-scale KBs because 1) the true user
distribution is hard to capture and 2) randomly sampling training examples from
the enormous space would be highly data-inefficient. Instead, we suggest that
KBQA models should have three levels of built-in generalization: i.i.d.,
compositional, and zero-shot. To facilitate the development of KBQA models with
stronger generalization, we construct and release a new large-scale,
high-quality dataset with 64,331 questions, GrailQA, and provide evaluation
settings for all three levels of generalization. In addition, we propose a
novel BERT-based KBQA model. The combination of our dataset and model enables
us to thoroughly examine and demonstrate, for the first time, the key role of
pre-trained contextual embeddings like BERT in the generalization of KBQA.
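To make the three levels concrete, below is a minimal sketch of how a test question could be assigned to a generalization level, assuming each question is annotated with the KB schema items (classes and relations) of its target query and a canonical composition template; the function name and exact criteria are illustrative simplifications, not the paper's official evaluation code.

```python
# Illustrative sketch (not the paper's evaluation code): classify a test question
# into one of the three generalization levels from its schema annotations.
from typing import Set


def generalization_level(
    test_schema_items: Set[str],      # KB classes/relations used by the test query
    test_composition: str,            # canonical template of how they are composed
    train_schema_items: Set[str],     # schema items observed in training queries
    train_compositions: Set[str],     # composition templates observed in training
) -> str:
    if not test_schema_items <= train_schema_items:
        # At least one class or relation never appears in training: zero-shot.
        return "zero-shot"
    if test_composition not in train_compositions:
        # All schema items are known, but they are combined in a novel way.
        return "compositional"
    # Both the schema items and their composition occur in the training data.
    return "i.i.d."


# Example: "music.genre" is unseen during training, so this question is zero-shot.
print(generalization_level(
    {"music.album", "music.genre"},
    "(AND music.album (JOIN music.album.genre ?g))",
    {"music.album", "music.artist.albums"},
    {"(AND music.album (JOIN music.artist.albums ?a))"},
))
```

Under this reading, the zero-shot setting stresses schema items never seen in training, while the compositional setting stresses novel combinations of items that were all seen in training.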
Related papers
- A Learn-Then-Reason Model Towards Generalization in Knowledge Base Question Answering [17.281005999581865]
Large-scale knowledge bases (KBs) like Freebase and Wikidata house vast amounts of structured knowledge.
Knowledge Base Question Answering (KBQA) provides a user-friendly way to access these valuable KBs via asking natural language questions.
This paper develops KBLLaMA, which follows a learn-then-reason framework to inject new KB knowledge into a large language model for flexible end-to-end KBQA.
arXiv Detail & Related papers (2024-06-20T22:22:41Z) - Few-shot Transfer Learning for Knowledge Base Question Answering: Fusing Supervised Models with In-Context Learning [20.80841972133938]
Existing Knowledge Base Question Answering (KBQA) architectures are hungry for annotated data.
We introduce the problem of few-shot transfer learning for KBQA, where the target domain offers only a few labeled examples.
We propose a novel KBQA architecture called FuSIC-KBQA that performs KB-retrieval using multiple source-trained retrievers.
arXiv Detail & Related papers (2023-11-15T11:56:56Z) - QASnowball: An Iterative Bootstrapping Framework for High-Quality
Question-Answering Data Generation [67.27999343730224]
We introduce an iterative bootstrapping framework for QA data augmentation (named QASnowball).
QASnowball can iteratively generate large-scale high-quality QA data based on a seed set of supervised examples.
We conduct experiments in the high-resource English scenario and the medium-resource Chinese scenario, and the experimental results show that the data generated by QASnowball can facilitate QA models.
arXiv Detail & Related papers (2023-09-19T05:20:36Z) - Few-shot In-context Learning for Knowledge Base Question Answering [31.73274700847965]
We propose KB-BINDER, which for the first time enables few-shot in-context learning over KBQA tasks (a prompting sketch appears after this list).
The experimental results on four public heterogeneous KBQA datasets show that KB-BINDER can achieve a strong performance with only a few in-context demonstrations.
arXiv Detail & Related papers (2023-05-02T19:31:55Z) - Knowledge Transfer from Answer Ranking to Answer Generation [97.38378660163414]
We propose to train a GenQA model by transferring knowledge from a trained answer sentence selection (AS2) model.
We also propose to use the AS2 model prediction scores for loss weighting and score-conditioned input/output shaping.
arXiv Detail & Related papers (2022-10-23T21:51:27Z) - SYGMA: System for Generalizable Modular Question Answering OverKnowledge
Bases [57.89642289610301]
We present SYGMA, a modular approach facilitating general-izability across multiple knowledge bases and multiple rea-soning types.
We demonstrate effectiveness of our system by evaluating on datasets belonging to two distinct knowledge bases,DBpedia and Wikidata.
arXiv Detail & Related papers (2021-09-28T01:57:56Z) - RnG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base
Question Answering [57.94658176442027]
We present RnG-KBQA, a Rank-and-Generate approach for KBQA.
We achieve new state-of-the-art results on GrailQA and WebQSP datasets.
arXiv Detail & Related papers (2021-09-17T17:58:28Z) - A Survey on Complex Question Answering over Knowledge Base: Recent
Advances and Challenges [71.4531144086568]
Question Answering (QA) over Knowledge Base (KB) aims to automatically answer natural language questions.
Researchers have shifted their attention from simple questions to complex questions, which require more KB triples and constraint inference.
arXiv Detail & Related papers (2020-07-26T07:13:32Z)