Momentum Contrastive Pre-training for Question Answering
- URL: http://arxiv.org/abs/2212.05762v3
- Date: Sat, 14 Oct 2023 08:44:35 GMT
- Title: Momentum Contrastive Pre-training for Question Answering
- Authors: Minda Hu, Muzhi Li, Yasheng Wang and Irwin King
- Abstract summary: MCROSS introduces a momentum contrastive learning framework to align the answer probability between cloze-like and natural query-passage sample pairs.
Our method achieves noticeable improvement compared with all baselines in both supervised and zero-shot scenarios.
- Score: 54.57078061878619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing pre-training methods for extractive Question Answering (QA) generate
cloze-like queries different from natural questions in syntax structure, which
could overfit pre-trained models to simple keyword matching. In order to
address this problem, we propose a novel Momentum Contrastive pRe-training fOr
queStion anSwering (MCROSS) method for extractive QA. Specifically, MCROSS
introduces a momentum contrastive learning framework to align the answer
probability between cloze-like and natural query-passage sample pairs. Hence,
the pre-trained models can better transfer the knowledge learned in cloze-like
samples to answering natural questions. Experimental results on three
benchmarking QA datasets show that our method achieves noticeable improvement
compared with all baselines in both supervised and zero-shot scenarios.
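The two generic building blocks of a momentum contrastive framework, as named in the abstract, can be sketched as follows. This is a minimal illustration under stated assumptions, not the MCROSS objective itself: the abstract does not give the exact loss, so the sketch uses the standard MoCo-style exponential moving average update and an InfoNCE-style contrastive loss. The names `momentum_update`, `info_nce`, the momentum `m`, and the `temperature` are hypothetical, and the vectors stand in for whatever representation (for example, answer probabilities) is being aligned between a cloze-like and a natural query-passage pair.

```python
import math


def momentum_update(query_params, key_params, m=0.999):
    """MoCo-style EMA update: key <- m * key + (1 - m) * query.

    Assumption: parameters are flat lists of floats for illustration.
    """
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query_vec, positive_vec, negative_vecs, temperature=0.07):
    """InfoNCE-style loss: negative log-softmax of the positive pair's score.

    query_vec / positive_vec could represent a natural query and its
    cloze-like counterpart (an assumption, not the paper's exact setup).
    """
    logits = [_dot(query_vec, positive_vec) / temperature]
    logits += [_dot(query_vec, n) / temperature for n in negative_vecs]
    # Numerically stable log-sum-exp over all logits.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    # The key encoder drifts slowly toward the query encoder.
    print(momentum_update([1.0, 2.0], [0.0, 0.0], m=0.9))
    # The loss is lower when the positive pair is better aligned than negatives.
    print(info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]], temperature=1.0))
    print(info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]], temperature=1.0))
```

In this sketch the slowly updated key side provides stable targets, while the contrastive loss pulls matched pairs together and pushes mismatched pairs apart.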
Related papers
- An Empirical Comparison of LM-based Question and Answer Generation
Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- Federated Prompting and Chain-of-Thought Reasoning for Improving LLMs
Answering [13.735277588793997]
We investigate how to enhance answer precision in frequently asked questions posed by distributed users using cloud-based Large Language Models (LLMs).
Our study focuses on typical situations where users ask similar queries that involve identical mathematical reasoning steps and problem-solving procedures.
We propose to improve the distributed synonymous questions using Self-Consistency (SC) and Chain-of-Thought (CoT) techniques.
arXiv Detail & Related papers (2023-04-27T01:48:03Z)
- Learning to Ask Conversational Questions by Optimizing Levenshtein
Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z)
- Learning with Instance Bundles for Reading Comprehension [61.823444215188296]
We introduce new supervision techniques that compare question-answer scores across multiple related instances.
Specifically, we normalize these scores across various neighborhoods of closely contrasting questions and/or answers.
We empirically demonstrate the effectiveness of training with instance bundles on two datasets.
arXiv Detail & Related papers (2021-04-18T06:17:54Z)
- Cooperative Learning of Zero-Shot Machine Reading Comprehension [9.868221447090855]
We propose a cooperative, self-play learning model for question generation and answering.
We can train question generation and answering models on any textual corpora without annotation.
Our model outperforms the state-of-the-art pretrained language models on standard question answering benchmarks.
arXiv Detail & Related papers (2021-03-12T18:22:28Z)
- Counterfactual Variable Control for Robust and Interpretable Question
Answering [57.25261576239862]
Deep neural network based question answering (QA) models are neither robust nor explainable in many cases.
In this paper, we inspect such spurious "capability" of QA models using causal inference.
We propose a novel approach called Counterfactual Variable Control (CVC) that explicitly mitigates any shortcut correlation.
arXiv Detail & Related papers (2020-10-12T10:09:05Z)
- MUTANT: A Training Paradigm for Out-of-Distribution Generalization in
Visual Question Answering [58.30291671877342]
We present MUTANT, a training paradigm that exposes the model to perceptually similar, yet semantically distinct mutations of the input.
MUTANT establishes a new state-of-the-art accuracy on VQA-CP with a 10.57% improvement.
arXiv Detail & Related papers (2020-09-18T00:22:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.