Question Answering Infused Pre-training of General-Purpose
Contextualized Representations
- URL: http://arxiv.org/abs/2106.08190v1
- Date: Tue, 15 Jun 2021 14:45:15 GMT
- Title: Question Answering Infused Pre-training of General-Purpose
Contextualized Representations
- Authors: Robin Jia, Mike Lewis, Luke Zettlemoyer
- Abstract summary: We propose a pre-training objective based on question answering (QA) for learning general-purpose contextual representations.
We accomplish this goal by training a bi-encoder QA model, which independently encodes passages and questions, to match the predictions of a more accurate cross-encoder model.
We show large improvements over both RoBERTa-large and previous state-of-the-art results on zero-shot and few-shot paraphrase detection.
- Score: 70.62967781515127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a pre-training objective based on question answering (QA)
for learning general-purpose contextual representations, motivated by the
intuition that the representation of a phrase in a passage should encode all
questions that the phrase can answer in context. We accomplish this goal by
training a bi-encoder QA model, which independently encodes passages and
questions, to match the predictions of a more accurate cross-encoder model on
80 million synthesized QA pairs. By encoding QA-relevant information, the
bi-encoder's token-level representations are useful for non-QA downstream tasks
without extensive (or in some cases, any) fine-tuning. We show large
improvements over both RoBERTa-large and previous state-of-the-art results on
zero-shot and few-shot paraphrase detection on four datasets, few-shot named
entity recognition on two datasets, and zero-shot sentiment analysis on three
datasets.
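As a rough illustration of the training setup described above, the sketch below distills a bi-encoder from a cross-encoder teacher by matching start/end answer distributions with a KL loss. The encoder sizes, the pooled-question scoring head, and all names are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch: bi-encoder distilled from a cross-encoder QA teacher (assumptions noted above).
import torch
import torch.nn.functional as F
from torch import nn

class BiEncoder(nn.Module):
    """Encodes passages and questions independently; dimensions are hypothetical."""
    def __init__(self, hidden=768, vocab=50000):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        layer = lambda: nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.passage_encoder = nn.TransformerEncoder(layer(), num_layers=2)
        self.question_encoder = nn.TransformerEncoder(layer(), num_layers=2)
        self.start_proj = nn.Linear(hidden, hidden)  # question -> start query vector
        self.end_proj = nn.Linear(hidden, hidden)    # question -> end query vector

    def forward(self, passage_ids, question_ids):
        p = self.passage_encoder(self.embed(passage_ids))             # (B, Lp, H) token-level reps
        q = self.question_encoder(self.embed(question_ids)).mean(1)   # (B, H) pooled question rep
        start_logits = torch.einsum("blh,bh->bl", p, self.start_proj(q))
        end_logits = torch.einsum("blh,bh->bl", p, self.end_proj(q))
        return start_logits, end_logits

def distill_step(bi_encoder, optimizer, passage_ids, question_ids,
                 teacher_start_logits, teacher_end_logits):
    """One update that matches the cross-encoder teacher's start/end distributions."""
    start_logits, end_logits = bi_encoder(passage_ids, question_ids)
    loss = (
        F.kl_div(F.log_softmax(start_logits, -1), F.softmax(teacher_start_logits, -1),
                 reduction="batchmean")
        + F.kl_div(F.log_softmax(end_logits, -1), F.softmax(teacher_end_logits, -1),
                   reduction="batchmean")
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the passage side never sees the question, the resulting token representations can be reused directly for downstream tasks, which is the property the abstract emphasizes.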
Related papers
- A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks [81.2624272756733]
In dense retrieval, deep encoders provide embeddings for both inputs and targets.
We train a small parametric corrector network that adjusts stale cached target embeddings.
Our approach matches state-of-the-art results even when no target embedding updates are made during training.
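A minimal sketch of the corrector idea, assuming a small residual MLP that maps a stale cached target embedding toward the embedding the current encoder would produce; the layer sizes, residual form, and regression loss are illustrative, not taken from the paper.

```python
import torch
from torch import nn

# Hypothetical corrector: nudges a stale cached target embedding toward the fresh one,
# so the full target cache need not be re-encoded during training.
corrector = nn.Sequential(
    nn.Linear(768, 768),
    nn.ReLU(),
    nn.Linear(768, 768),
)

def corrector_loss(stale_cached, fresh_target):
    """Regression loss between corrected stale embeddings and freshly re-encoded ones
    (fresh_target would only be computed for a small subsample of targets)."""
    corrected = stale_cached + corrector(stale_cached)  # residual correction (an assumption)
    return nn.functional.mse_loss(corrected, fresh_target)
```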
arXiv Detail & Related papers (2024-09-03T13:29:13Z)
- FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection [61.9638234358049]
FastFiD is a novel approach that performs sentence selection over encoded passages.
This aids in retaining valuable sentences while reducing the context length required for generating answers.
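A minimal sketch of such a selection step, assuming a learned per-sentence scorer over the encoder's token representations; the scoring head, span handling, and top-k cutoff are assumptions for illustration.

```python
import torch
from torch import nn

# Hypothetical sentence selector: score each sentence span from encoded passage tokens
# and keep only the top-k sentences as the (much shorter) decoder context.
class SentenceSelector(nn.Module):
    def __init__(self, hidden=768):
        super().__init__()
        self.scorer = nn.Linear(hidden, 1)

    def forward(self, token_reps, sentence_spans, k=5):
        # token_reps: (num_tokens, hidden); sentence_spans: list of (start, end) token indices
        sent_reps = torch.stack([token_reps[s:e].mean(0) for s, e in sentence_spans])
        scores = self.scorer(sent_reps).squeeze(-1)              # one score per sentence
        keep = scores.topk(min(k, len(sentence_spans))).indices  # indices of retained sentences
        kept_tokens = torch.cat(
            [token_reps[sentence_spans[i][0]:sentence_spans[i][1]] for i in keep])
        return kept_tokens, keep
```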
arXiv Detail & Related papers (2024-08-12T17:50:02Z)
- Single Sequence Prediction over Reasoning Graphs for Multi-hop QA [8.442412179333205]
We propose a single-sequence prediction method over a local reasoning graph.
We use a graph neural network to encode this graph structure and fuse the resulting representations into the entity representations of the model.
Our experiments show significant improvements in answer exact-match/F1 scores and faithfulness of grounding in the reasoning path.
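A minimal sketch of one such graph-encode-and-fuse step, assuming a single mean-aggregation message-passing layer and a linear fusion of graph outputs with the original entity representations; none of this is the paper's exact GNN.

```python
import torch
from torch import nn

# Hypothetical message-passing layer over a local reasoning graph, with node outputs
# fused back into the model's entity representations.
class GraphFusion(nn.Module):
    def __init__(self, hidden=768):
        super().__init__()
        self.message = nn.Linear(hidden, hidden)
        self.fuse = nn.Linear(2 * hidden, hidden)

    def forward(self, entity_reps, adjacency):
        # entity_reps: (num_entities, hidden); adjacency: (num_entities, num_entities) 0/1 matrix
        deg = adjacency.sum(-1, keepdim=True).clamp(min=1)
        neighbor_msg = adjacency @ self.message(entity_reps) / deg  # mean over neighbours
        graph_reps = torch.relu(neighbor_msg)
        # Fuse graph-encoded structure with the original entity representations.
        return self.fuse(torch.cat([entity_reps, graph_reps], dim=-1))
```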
arXiv Detail & Related papers (2023-07-01T13:15:09Z)
- A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering [16.52970318866536]
Knowledge-Intensive Visual Question Answering (KI-VQA) refers to answering a question about an image whose answer does not lie in the image.
This paper presents a new pipeline for KI-VQA tasks, consisting of a retriever and a reader.
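A minimal sketch of the retriever half of such a pipeline, assuming one tower embeds the (question, image) pair, the other embeds knowledge passages, and retrieval is a dot product; the feature fusion and dimensions are illustrative assumptions.

```python
import torch
from torch import nn

# Hypothetical dual encoder for KI-VQA retrieval: score knowledge passages against a
# fused question+image query; the top-scoring passages go to a separate reader.
class DualEncoder(nn.Module):
    def __init__(self, hidden=768):
        super().__init__()
        self.query_proj = nn.Linear(2 * hidden, hidden)  # fuses question and image features
        self.passage_proj = nn.Linear(hidden, hidden)

    def forward(self, question_feat, image_feat, passage_feats):
        query = self.query_proj(torch.cat([question_feat, image_feat], dim=-1))  # (B, H)
        passages = self.passage_proj(passage_feats)                              # (N, H)
        return query @ passages.T                                                # (B, N) scores
```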
arXiv Detail & Related papers (2023-04-26T16:14:39Z)
- QASem Parsing: Text-to-text Modeling of QA-based Semantics [19.42681342441062]
We consider three QA-based semantic tasks, namely, QA-SRL, QANom and QADiscourse.
We release the first unified QASem parsing tool, practical for downstream applications.
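To show what a text-to-text framing of QA-based semantics can look like, here is a minimal sketch; the prompt prefix, predicate marking, and output linearization are invented for illustration and are not the tool's actual format.

```python
# Hypothetical text-to-text framing: the parser is a seq2seq model whose input marks a
# target predicate and whose output is a linearized list of question-answer pairs.
def make_example(sentence: str, predicate: str):
    source = f"parse QASem: {sentence} | predicate: {predicate}"
    # The model would be trained to emit something like:
    target = "Q: who returned something? A: the CEO ;; Q: what was returned? A: the funds"
    return source, target

src, tgt = make_example("The CEO returned the funds on Tuesday.", "returned")
print(src)
print(tgt)
```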
arXiv Detail & Related papers (2022-05-23T15:56:07Z)
- A Hierarchical Reasoning Graph Neural Network for The Automatic Scoring of Answer Transcriptions in Video Job Interviews [14.091472037847499]
We propose a Hierarchical Reasoning Graph Neural Network (HRGNN) for the automatic assessment of question-answer pairs.
We employ a semantic-level reasoning graph attention network to model the interaction states of the current QA session.
Finally, we propose a gated recurrent unit encoder to represent the temporal question-answer pairs for the final prediction.
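A minimal sketch of that final temporal stage, assuming each QA pair has already been reduced to a vector (e.g. by the graph attention stage) and a GRU plus linear head produces the score; sizes and the regression head are assumptions.

```python
import torch
from torch import nn

# Hypothetical temporal scorer: a GRU summarizes the sequence of QA-pair vectors and a
# linear head predicts the interview score.
class InterviewScorer(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, qa_pair_reps):            # (B, num_pairs, hidden)
        _, last_hidden = self.gru(qa_pair_reps)
        return self.head(last_hidden[-1])       # (B, 1) predicted score
```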
arXiv Detail & Related papers (2020-12-22T12:27:45Z)
- Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations [49.55361944105796]
We present a novel approach to any-to-one (A2O) voice conversion (VC) in a sequence-to-sequence framework.
A2O VC aims to convert any speaker, including those unseen during training, to a fixed target speaker.
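A minimal sketch of the conversion model only, assuming the source speech has already been mapped to discrete units by an external self-supervised quantizer (not shown); the unit vocabulary size, GRU encoder, and mel output head are illustrative assumptions.

```python
import torch
from torch import nn

# Hypothetical A2O converter: discrete units from any speaker are mapped to acoustic
# frames in the fixed target speaker's voice.
class UnitToTargetSpeech(nn.Module):
    def __init__(self, num_units=100, hidden=256, n_mels=80):
        super().__init__()
        self.unit_embed = nn.Embedding(num_units, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.to_mel = nn.Linear(hidden, n_mels)

    def forward(self, unit_ids):                  # (B, T) discrete unit IDs
        h, _ = self.encoder(self.unit_embed(unit_ids))
        return self.to_mel(h)                     # (B, T, n_mels) target-voice frames
```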
arXiv Detail & Related papers (2020-10-23T08:34:52Z)
- Structured Multimodal Attentions for TextVQA [57.71060302874151]
We propose an end-to-end structured multimodal attention (SMA) neural network for the TextVQA task.
SMA first uses a structural graph representation to encode the object-object, object-text and text-text relationships appearing in the image, and then designs a multimodal graph attention network to reason over it.
Our proposed model outperforms the SoTA models on the TextVQA dataset and two tasks of the ST-VQA dataset, among all models except the pre-training-based TAP.
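A minimal sketch of a single graph attention step over such a multimodal graph, assuming object and OCR-token nodes share one feature space and the typed relations are collapsed into a single adjacency mask; this simplification is ours, not the paper's.

```python
import torch
from torch import nn

# Hypothetical graph attention: each node attends only to its graph neighbours
# (object-object, object-text, text-text edges merged into one boolean mask here).
class GraphAttention(nn.Module):
    def __init__(self, hidden=512):
        super().__init__()
        self.q = nn.Linear(hidden, hidden)
        self.k = nn.Linear(hidden, hidden)
        self.v = nn.Linear(hidden, hidden)

    def forward(self, node_feats, adjacency):
        # node_feats: (num_nodes, hidden); adjacency: (num_nodes, num_nodes) boolean mask
        adjacency = adjacency | torch.eye(node_feats.shape[0], dtype=torch.bool)  # self-loops
        scores = self.q(node_feats) @ self.k(node_feats).T / node_feats.shape[-1] ** 0.5
        scores = scores.masked_fill(~adjacency, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        return attn @ self.v(node_feats)
```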
arXiv Detail & Related papers (2020-06-01T07:07:36Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of the data for training.
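A minimal sketch of the conditional-VAE skeleton behind such a generator, assuming the context and QA pair are already encoded as vectors and using an MSE reconstruction term as a stand-in for a token-level decoder loss; dimensions and the decoder are illustrative.

```python
import torch
import torch.nn.functional as F
from torch import nn

# Hypothetical CVAE for QA-pair generation: infer a latent z from (context, QA pair),
# reconstruct the QA pair from (context, z), and train with an ELBO-style loss.
class QAPairCVAE(nn.Module):
    def __init__(self, hidden=512, latent=64):
        super().__init__()
        self.to_mu = nn.Linear(2 * hidden, latent)
        self.to_logvar = nn.Linear(2 * hidden, latent)
        self.decoder = nn.Linear(hidden + latent, hidden)  # stands in for a seq2seq decoder

    def forward(self, context_rep, qa_rep):
        h = torch.cat([context_rep, qa_rep], dim=-1)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        recon = self.decoder(torch.cat([context_rep, z], dim=-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        recon_loss = F.mse_loss(recon, qa_rep)  # placeholder for the token-level NLL
        return recon_loss + kl
```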
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.