PIE-QG: Paraphrased Information Extraction for Unsupervised Question
Generation from Small Corpora
- URL: http://arxiv.org/abs/2301.01064v1
- Date: Tue, 3 Jan 2023 12:20:51 GMT
- Title: PIE-QG: Paraphrased Information Extraction for Unsupervised Question
Generation from Small Corpora
- Authors: Dinesh Nagumothu, Bahadorreza Ofoghi, Guangyan Huang, Peter W. Eklund
- Abstract summary: PIE-QG uses Open Information Extraction (OpenIE) to generate synthetic training questions from paraphrased passages.
Triples in the form of subject, predicate, object> are extracted from each passage, and questions are formed with subjects (or objects) and predicates while objects (or subjects) are considered as answers.
- Score: 4.721845865189576
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supervised Question Answering systems (QA systems) rely on domain-specific
human-labeled data for training. Unsupervised QA systems generate their own
question-answer training pairs, typically using secondary knowledge sources to
achieve this outcome. Our approach (called PIE-QG) uses Open Information
Extraction (OpenIE) to generate synthetic training questions from paraphrased
passages and uses the question-answer pairs as training data for a language
model for a state-of-the-art QA system based on BERT. Triples in the form of
<subject, predicate, object> are extracted from each passage, and questions are
formed with subjects (or objects) and predicates while objects (or subjects)
are considered as answers. Experimenting on five extractive QA datasets
demonstrates that our technique achieves on-par performance with existing
state-of-the-art QA systems with the benefit of being trained on an order of
magnitude fewer documents and without any recourse to external reference data
sources.
Related papers
- Automatic Question-Answer Generation for Long-Tail Knowledge [65.11554185687258]
We propose an automatic approach to generate specialized QA datasets for tail entities.
We conduct extensive experiments by employing pretrained LLMs on our newly generated long-tail QA datasets.
arXiv Detail & Related papers (2024-03-03T03:06:31Z) - Open-domain Question Answering via Chain of Reasoning over Heterogeneous
Knowledge [82.5582220249183]
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
Unlike previous methods that solely rely on the retriever for gathering all evidence in isolation, our intermediary performs a chain of reasoning over the retrieved set.
Our system achieves competitive performance on two ODQA datasets, OTT-QA and NQ, against tables and passages from Wikipedia.
arXiv Detail & Related papers (2022-10-22T03:21:32Z) - Discourse Analysis via Questions and Answers: Parsing Dependency
Structures of Questions Under Discussion [57.43781399856913]
This work adopts the linguistic framework of Questions Under Discussion (QUD) for discourse analysis.
We characterize relationships between sentences as free-form questions, in contrast to exhaustive fine-grained questions.
We develop the first-of-its-kind QUD that derives a dependency structure of questions over full documents.
arXiv Detail & Related papers (2022-10-12T03:53:12Z) - Questions Are All You Need to Train a Dense Passage Retriever [123.13872383489172]
ART is a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data.
It uses a new document-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence documents, and (2) the documents are then used to compute the probability of reconstructing the original question.
arXiv Detail & Related papers (2022-06-21T18:16:31Z) - QASem Parsing: Text-to-text Modeling of QA-based Semantics [19.42681342441062]
We consider three QA-based semantic tasks, namely, QA-SRL, QANom and QADiscourse.
We release the first unified QASem parsing tool, practical for downstream applications.
arXiv Detail & Related papers (2022-05-23T15:56:07Z) - Improving Unsupervised Question Answering via Summarization-Informed
Question Generation [47.96911338198302]
Question Generation (QG) is the task of generating a plausible question for a passage, answer> pair.
We make use of freely available news summary data, transforming declarative sentences into appropriate questions using dependency parsing, named entity recognition and semantic role labeling.
The resulting questions are then combined with the original news articles to train an end-to-end neural QG model.
arXiv Detail & Related papers (2021-09-16T13:08:43Z) - EQG-RACE: Examination-Type Question Generation [21.17100754955864]
We propose an innovative Examination-type Question Generation approach (EQG-RACE) to generate exam-like questions based on a dataset extracted from RACE.
Two main strategies are employed in EQG-RACE for dealing with discrete answer information and reasoning among long contexts.
Experimental results show a state-of-the-art performance of EQG-RACE, which is apparently superior to the baselines.
arXiv Detail & Related papers (2020-12-11T03:52:17Z) - Summary-Oriented Question Generation for Informational Queries [23.72999724312676]
We aim to produce self-explanatory questions that focus on main document topics and are answerable with variable length passages as appropriate.
Our model shows SOTA performance of SQ generation on the NQ dataset (20.1 BLEU-4).
We further apply our model on out-of-domain news articles, evaluating with a QA system due to the lack of gold questions and demonstrate that our model produces better SQs for news articles -- with further confirmation via a human evaluation.
arXiv Detail & Related papers (2020-10-19T17:30:08Z) - Template-Based Question Generation from Retrieved Sentences for Improved
Unsupervised Question Answering [98.48363619128108]
We propose an unsupervised approach to training QA models with generated pseudo-training data.
We show that generating questions for QA training by applying a simple template on a related, retrieved sentence rather than the original context sentence improves downstream QA performance.
arXiv Detail & Related papers (2020-04-24T17:57:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.