Learning to Rank Question Answer Pairs with Bilateral Contrastive Data Augmentation
- URL: http://arxiv.org/abs/2106.11096v1
- Date: Mon, 21 Jun 2021 13:29:43 GMT
- Title: Learning to Rank Question Answer Pairs with Bilateral Contrastive Data Augmentation
- Authors: Yang Deng, Wenxuan Zhang, Wai Lam
- Abstract summary: We propose a novel and easy-to-apply data augmentation strategy, namely Bilateral Generation (BiG).
With the augmented dataset, we design a contrastive training objective for learning to rank question answer pairs.
Experimental results on three benchmark datasets, namely TREC-QA, WikiQA, and ANTIQUE, show that our method significantly improves the performance of ranking models.
- Score: 39.22166065525888
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we propose a novel and easy-to-apply data augmentation
strategy, namely Bilateral Generation (BiG), with a contrastive training
objective for improving the performance of ranking question answer pairs with
existing labeled data. Specifically, we synthesize pseudo-positive QA pairs in
contrast to the original negative QA pairs with two pre-trained generation
models, one for question generation, the other for answer generation, which are
fine-tuned on the limited positive QA pairs from the original dataset. With the
augmented dataset, we design a contrastive training objective for learning to
rank question answer pairs. Experimental results on three benchmark datasets,
namely TREC-QA, WikiQA, and ANTIQUE, show that our method significantly
improves the performance of ranking models by making full use of existing
labeled data and can be easily applied to different ranking models.
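As a concrete illustration of the two-step recipe, here is a minimal sketch of what a BiG-style contrastive ranking objective might look like in PyTorch. The scoring model, exact loss form, and temperature are assumptions for illustration, not the authors' implementation; the pseudo-positive scores would come from QA pairs synthesized by the two fine-tuned generators.
```python
# Minimal sketch of a BiG-style contrastive ranking objective (assumed form,
# not the authors' code). A ranking model scores each QA pair; gold positives
# and generator-synthesized pseudo-positives are contrasted against the
# original negatives with an InfoNCE-style loss.
import torch
import torch.nn.functional as F

def contrastive_ranking_loss(pos_scores, neg_scores, temperature=0.1):
    """pos_scores: (P,) scores of the positive and pseudo-positive pairs;
    neg_scores: (N,) scores of the original negative pairs of one question."""
    loss = 0.0
    for p in pos_scores:
        # Each (pseudo-)positive should out-score every original negative.
        logits = torch.cat([p.view(1), neg_scores]) / temperature
        loss = loss + F.cross_entropy(logits.unsqueeze(0),
                                      torch.zeros(1, dtype=torch.long))
    return loss / pos_scores.numel()

# Toy usage: one gold positive, one generated pseudo-positive, three negatives.
pos = torch.tensor([2.1, 1.7])
neg = torch.tensor([0.3, -0.5, 0.9])
print(contrastive_ranking_loss(pos, neg))
```
The key design point in this reading is that pseudo-positives enter the loss on the positive side, so the ranker learns to separate both them and the gold positives from the original negatives.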
Related papers
- An Empirical Comparison of LM-based Question and Answer Generation Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
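For a picture of what the end-to-end QAG baseline above looks like in practice, the sketch below runs a single seq2seq LM over a context; the checkpoint, prompt prefix, and output format are illustrative assumptions, not the paper's exact setup.
```python
# Sketch of end-to-end QAG with one fine-tuned seq2seq LM (checkpoint and
# prompt/output format are assumptions for illustration).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

context = "The Amazon is the largest rainforest on Earth."
inputs = tok("generate question and answer: " + context, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
# A model fine-tuned for QAG would emit something like
# "question: ... answer: ..." for downstream parsing.
print(tok.decode(out[0], skip_special_tokens=True))
```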
- Learning Answer Generation using Supervision from Automatic Question Answering Evaluators [98.9267570170737]
We propose GAVA, a novel training paradigm for GenQA that uses supervision from automatic QA evaluation models.
We evaluate our proposed methods on two academic and one industrial dataset, obtaining a significant improvement in answering accuracy over the previous state of the art.
arXiv Detail & Related papers (2023-05-24T16:57:04Z)
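One plausible reading of evaluator-based supervision for GenQA is a generation loss weighted by the evaluator's score for each candidate answer; the weighting scheme below is an assumption, not GAVA's actual formulation.
```python
# Sketch of evaluator-supervised answer generation (assumed weighting scheme).
import torch

def evaluator_weighted_loss(token_nll, evaluator_score):
    """token_nll: (T,) per-token negative log-likelihood of a candidate answer;
    evaluator_score: scalar in [0, 1] from an automatic QA evaluation model."""
    # Candidates the evaluator rates highly contribute more gradient.
    return evaluator_score * token_nll.mean()

nll = torch.tensor([1.2, 0.8, 0.5])
print(evaluator_weighted_loss(nll, torch.tensor(0.9)))
```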
- Generating Self-Contained and Summary-Centric Question Answer Pairs via Differentiable Reward Imitation Learning [7.2745835227138045]
We propose a model for generating question-answer pairs (QA pairs) with self-contained, summary-centric questions and length-constrained, article-summarizing answers.
This dataset is used to learn a QA pair generation model that produces summaries as answers, balancing brevity with sufficiency, jointly with their corresponding questions.
arXiv Detail & Related papers (2021-09-10T06:34:55Z)
- Contrast and Classify: Training Robust VQA Models [60.80627814762071]
We propose a novel training paradigm (ConClaT) that optimizes both cross-entropy and contrastive losses.
We find that optimizing both losses -- either alternately or jointly -- is key to effective training.
arXiv Detail & Related papers (2020-10-13T00:23:59Z)
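A joint version of the ConClaT-style objective above can be sketched as cross-entropy plus an NT-Xent-style contrastive term over augmented representations; the mixing weight, temperature, and pairing scheme are assumptions.
```python
# Sketch of joint cross-entropy + contrastive training (assumed formulation).
import torch
import torch.nn.functional as F

def joint_loss(logits, labels, z, z_aug, lam=0.5, temperature=0.1):
    """Cross-entropy on answer logits plus an NT-Xent-style term that
    aligns each representation z[i] with its augmented view z_aug[i]."""
    ce = F.cross_entropy(logits, labels)
    sim = F.normalize(z, dim=-1) @ F.normalize(z_aug, dim=-1).T / temperature
    targets = torch.arange(z.size(0))  # matching views sit on the diagonal
    return ce + lam * F.cross_entropy(sim, targets)

logits, labels = torch.randn(4, 10), torch.randint(0, 10, (4,))
z, z_aug = torch.randn(4, 32), torch.randn(4, 32)
print(joint_loss(logits, labels, z, z_aug))
```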
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
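Dropping the hierarchical question/answer latents, the core conditional-VAE objective behind HCVAE reduces to reconstruction plus a KL term; the skeleton below assumes a diagonal-Gaussian posterior and is not the paper's model.
```python
# Minimal conditional-VAE loss skeleton (hierarchical latents omitted;
# diagonal-Gaussian posterior is an assumption for illustration).
import torch

def cvae_loss(recon_nll, mu, logvar):
    """recon_nll: reconstruction NLL of the decoded QA pair;
    mu, logvar: parameters of the Gaussian posterior q(z | context, QA)."""
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_nll + kl

print(cvae_loss(torch.tensor(3.2), torch.zeros(8), torch.zeros(8)))
```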
- Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA).
First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA).
Second, we take advantage of the QA model to extract more appropriate answers, which iteratively refines data over RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)