Question Answering Infused Pre-training of General-Purpose
Contextualized Representations
- URL: http://arxiv.org/abs/2106.08190v1
- Date: Tue, 15 Jun 2021 14:45:15 GMT
- Title: Question Answering Infused Pre-training of General-Purpose
Contextualized Representations
- Authors: Robin Jia, Mike Lewis, Luke Zettlemoyer
- Abstract summary: We propose a pre-training objective based on question answering (QA) for learning general-purpose contextual representations.
We accomplish this goal by training a bi-encoder QA model, which independently encodes passages and questions, to match the predictions of a more accurate cross-encoder model.
We show large improvements over both RoBERTa-large and previous state-of-the-art results on zero-shot and few-shot paraphrase detection.
- Score: 70.62967781515127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a pre-training objective based on question answering (QA)
for learning general-purpose contextual representations, motivated by the
intuition that the representation of a phrase in a passage should encode all
questions that the phrase can answer in context. We accomplish this goal by
training a bi-encoder QA model, which independently encodes passages and
questions, to match the predictions of a more accurate cross-encoder model on
80 million synthesized QA pairs. By encoding QA-relevant information, the
bi-encoder's token-level representations are useful for non-QA downstream tasks
without extensive (or in some cases, any) fine-tuning. We show large
improvements over both RoBERTa-large and previous state-of-the-art results on
zero-shot and few-shot paraphrase detection on four datasets, few-shot named
entity recognition on two datasets, and zero-shot sentiment analysis on three
datasets.
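As a rough illustration of the training setup described above, the sketch below distills a bi-encoder from a cross-encoder teacher by matching start/end answer distributions with a KL loss. The encoder sizes, the pooled-question scoring head, and all names are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch: bi-encoder distilled from a cross-encoder QA teacher (assumptions noted above).
import torch
import torch.nn.functional as F
from torch import nn

class BiEncoder(nn.Module):
    """Encodes passages and questions independently; dimensions are hypothetical."""
    def __init__(self, hidden=768, vocab=50000):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        layer = lambda: nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.passage_encoder = nn.TransformerEncoder(layer(), num_layers=2)
        self.question_encoder = nn.TransformerEncoder(layer(), num_layers=2)
        self.start_proj = nn.Linear(hidden, hidden)  # question -> start query vector
        self.end_proj = nn.Linear(hidden, hidden)    # question -> end query vector

    def forward(self, passage_ids, question_ids):
        p = self.passage_encoder(self.embed(passage_ids))             # (B, Lp, H) token-level reps
        q = self.question_encoder(self.embed(question_ids)).mean(1)   # (B, H) pooled question rep
        start_logits = torch.einsum("blh,bh->bl", p, self.start_proj(q))
        end_logits = torch.einsum("blh,bh->bl", p, self.end_proj(q))
        return start_logits, end_logits

def distill_step(bi_encoder, optimizer, passage_ids, question_ids,
                 teacher_start_logits, teacher_end_logits):
    """One update that matches the cross-encoder teacher's start/end distributions."""
    start_logits, end_logits = bi_encoder(passage_ids, question_ids)
    loss = (
        F.kl_div(F.log_softmax(start_logits, -1), F.softmax(teacher_start_logits, -1),
                 reduction="batchmean")
        + F.kl_div(F.log_softmax(end_logits, -1), F.softmax(teacher_end_logits, -1),
                   reduction="batchmean")
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the passage side never sees the question, the resulting token representations can be reused directly for downstream tasks, which is the property the abstract emphasizes.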
Related papers
- A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks [81.2624272756733]
In dense retrieval, deep encoders provide embeddings for both inputs and targets.
We train a small parametric corrector network that adjusts stale cached target embeddings.
Our approach matches state-of-the-art results even when no target embedding updates are made during training.
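A minimal sketch of the corrector idea, assuming a small residual MLP that maps a stale cached target embedding toward the embedding the current encoder would produce; the layer sizes, residual form, and regression loss are illustrative, not taken from the paper.

```python
import torch
from torch import nn

# Hypothetical corrector: nudges a stale cached target embedding toward the fresh one,
# so the full target cache need not be re-encoded during training.
corrector = nn.Sequential(
    nn.Linear(768, 768),
    nn.ReLU(),
    nn.Linear(768, 768),
)

def corrector_loss(stale_cached, fresh_target):
    """Regression loss between corrected stale embeddings and freshly re-encoded ones
    (fresh_target would only be computed for a small subsample of targets)."""
    corrected = stale_cached + corrector(stale_cached)  # residual correction (an assumption)
    return nn.functional.mse_loss(corrected, fresh_target)
```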
arXiv Detail & Related papers (2024-09-03T13:29:13Z)
- FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection [61.9638234358049]
FastFiD is a novel approach that performs sentence selection over encoded passages.
This aids in retaining valuable sentences while reducing the context length required for generating answers.
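A minimal sketch of such a selection step, assuming a learned per-sentence scorer over the encoder's token representations; the scoring head, span handling, and top-k cutoff are assumptions for illustration.

```python
import torch
from torch import nn

# Hypothetical sentence selector: score each sentence span from encoded passage tokens
# and keep only the top-k sentences as the (much shorter) decoder context.
class SentenceSelector(nn.Module):
    def __init__(self, hidden=768):
        super().__init__()
        self.scorer = nn.Linear(hidden, 1)

    def forward(self, token_reps, sentence_spans, k=5):
        # token_reps: (num_tokens, hidden); sentence_spans: list of (start, end) token indices
        sent_reps = torch.stack([token_reps[s:e].mean(0) for s, e in sentence_spans])
        scores = self.scorer(sent_reps).squeeze(-1)              # one score per sentence
        keep = scores.topk(min(k, len(sentence_spans))).indices  # indices of retained sentences
        kept_tokens = torch.cat(
            [token_reps[sentence_spans[i][0]:sentence_spans[i][1]] for i in keep])
        return kept_tokens, keep
```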
arXiv Detail & Related papers (2024-08-12T17:50:02Z)
- Single Sequence Prediction over Reasoning Graphs for Multi-hop QA [8.442412179333205]
We propose a single-sequence prediction method over a local reasoning graph.
We use a graph neural network to encode this graph structure and fuse the resulting representations into the entity representations of the model.
Our experiments show significant improvements in answer exact-match/F1 scores and faithfulness of grounding in the reasoning path.
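A minimal sketch of one such graph-encode-and-fuse step, assuming a single mean-aggregation message-passing layer and a linear fusion of graph outputs with the original entity representations; none of this is the paper's exact GNN.

```python
import torch
from torch import nn

# Hypothetical message-passing layer over a local reasoning graph, with node outputs
# fused back into the model's entity representations.
class GraphFusion(nn.Module):
    def __init__(self, hidden=768):
        super().__init__()
        self.message = nn.Linear(hidden, hidden)
        self.fuse = nn.Linear(2 * hidden, hidden)

    def forward(self, entity_reps, adjacency):
        # entity_reps: (num_entities, hidden); adjacency: (num_entities, num_entities) 0/1 matrix
        deg = adjacency.sum(-1, keepdim=True).clamp(min=1)
        neighbor_msg = adjacency @ self.message(entity_reps) / deg  # mean over neighbours
        graph_reps = torch.relu(neighbor_msg)
        # Fuse graph-encoded structure with the original entity representations.
        return self.fuse(torch.cat([entity_reps, graph_reps], dim=-1))
```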
arXiv Detail & Related papers (2023-07-01T13:15:09Z)
- A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering [16.52970318866536]
Knowledge-Intensive Visual Question Answering (KI-VQA) refers to answering a question about an image whose answer does not lie in the image.
This paper presents a new pipeline for KI-VQA tasks, consisting of a retriever and a reader.
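A minimal sketch of the retriever half of such a pipeline, assuming one tower embeds the (question, image) pair, the other embeds knowledge passages, and retrieval is a dot product; the feature fusion and dimensions are illustrative assumptions.

```python
import torch
from torch import nn

# Hypothetical dual encoder for KI-VQA retrieval: score knowledge passages against a
# fused question+image query; the top-scoring passages go to a separate reader.
class DualEncoder(nn.Module):
    def __init__(self, hidden=768):
        super().__init__()
        self.query_proj = nn.Linear(2 * hidden, hidden)  # fuses question and image features
        self.passage_proj = nn.Linear(hidden, hidden)

    def forward(self, question_feat, image_feat, passage_feats):
        query = self.query_proj(torch.cat([question_feat, image_feat], dim=-1))  # (B, H)
        passages = self.passage_proj(passage_feats)                              # (N, H)
        return query @ passages.T                                                # (B, N) scores
```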
arXiv Detail & Related papers (2023-04-26T16:14:39Z)
- QASem Parsing: Text-to-text Modeling of QA-based Semantics [19.42681342441062]
We consider three QA-based semantic tasks, namely, QA-SRL, QANom and QADiscourse.
We release the first unified QASem parsing tool, practical for downstream applications.
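To show what a text-to-text framing of QA-based semantics can look like, here is a minimal sketch; the prompt prefix, predicate marking, and output linearization are invented for illustration and are not the tool's actual format.

```python
# Hypothetical text-to-text framing: the parser is a seq2seq model whose input marks a
# target predicate and whose output is a linearized list of question-answer pairs.
def make_example(sentence: str, predicate: str):
    source = f"parse QASem: {sentence} | predicate: {predicate}"
    # The model would be trained to emit something like:
    target = "Q: who returned something? A: the CEO ;; Q: what was returned? A: the funds"
    return source, target

src, tgt = make_example("The CEO returned the funds on Tuesday.", "returned")
print(src)
print(tgt)
```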
arXiv Detail & Related papers (2022-05-23T15:56:07Z)
- A Hierarchical Reasoning Graph Neural Network for The Automatic Scoring of Answer Transcriptions in Video Job Interviews [14.091472037847499]
We propose a Hierarchical Reasoning Graph Neural Network (HRGNN) for the automatic assessment of question-answer pairs.
We employ a semantic-level reasoning graph attention network to model the interaction states of the current QA session.
Finally, we propose a gated recurrent unit encoder to represent the temporal question-answer pairs for the final prediction.
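A minimal sketch of that final temporal stage, assuming each QA pair has already been reduced to a vector (e.g. by the graph attention stage) and a GRU plus linear head produces the score; sizes and the regression head are assumptions.

```python
import torch
from torch import nn

# Hypothetical temporal scorer: a GRU summarizes the sequence of QA-pair vectors and a
# linear head predicts the interview score.
class InterviewScorer(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, qa_pair_reps):            # (B, num_pairs, hidden)
        _, last_hidden = self.gru(qa_pair_reps)
        return self.head(last_hidden[-1])       # (B, 1) predicted score
```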
arXiv Detail & Related papers (2020-12-22T12:27:45Z)
- Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations [49.55361944105796]
We present a novel approach to any-to-one (A2O) voice conversion (VC) in a sequence-to-sequence framework.
A2O VC aims to convert any speaker, including those unseen during training, to a fixed target speaker.
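A minimal sketch of the conversion model only, assuming the source speech has already been mapped to discrete units by an external self-supervised quantizer (not shown); the unit vocabulary size, GRU encoder, and mel output head are illustrative assumptions.

```python
import torch
from torch import nn

# Hypothetical A2O converter: discrete units from any speaker are mapped to acoustic
# frames in the fixed target speaker's voice.
class UnitToTargetSpeech(nn.Module):
    def __init__(self, num_units=100, hidden=256, n_mels=80):
        super().__init__()
        self.unit_embed = nn.Embedding(num_units, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.to_mel = nn.Linear(hidden, n_mels)

    def forward(self, unit_ids):                  # (B, T) discrete unit IDs
        h, _ = self.encoder(self.unit_embed(unit_ids))
        return self.to_mel(h)                     # (B, T, n_mels) target-voice frames
```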
arXiv Detail & Related papers (2020-10-23T08:34:52Z)
- Structured Multimodal Attentions for TextVQA [57.71060302874151]
We propose an end-to-end structured multimodal attention (SMA) neural network for the TextVQA task.
SMA first uses a structural graph representation to encode the object-object, object-text and text-text relationships appearing in the image, and then designs a multimodal graph attention network to reason over it.
Our proposed model outperforms the SoTA models on the TextVQA dataset and two tasks of the ST-VQA dataset, among all models except the pre-training-based TAP.
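A minimal sketch of a single graph attention step over such a multimodal graph, assuming object and OCR-token nodes share one feature space and the typed relations are collapsed into a single adjacency mask; this simplification is ours, not the paper's.

```python
import torch
from torch import nn

# Hypothetical graph attention: each node attends only to its graph neighbours
# (object-object, object-text, text-text edges merged into one boolean mask here).
class GraphAttention(nn.Module):
    def __init__(self, hidden=512):
        super().__init__()
        self.q = nn.Linear(hidden, hidden)
        self.k = nn.Linear(hidden, hidden)
        self.v = nn.Linear(hidden, hidden)

    def forward(self, node_feats, adjacency):
        # node_feats: (num_nodes, hidden); adjacency: (num_nodes, num_nodes) boolean mask
        adjacency = adjacency | torch.eye(node_feats.shape[0], dtype=torch.bool)  # self-loops
        scores = self.q(node_feats) @ self.k(node_feats).T / node_feats.shape[-1] ** 0.5
        scores = scores.masked_fill(~adjacency, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        return attn @ self.v(node_feats)
```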
arXiv Detail & Related papers (2020-06-01T07:07:36Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of the data for training.
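A minimal sketch of the conditional-VAE skeleton behind such a generator, assuming the context and QA pair are already encoded as vectors and using an MSE reconstruction term as a stand-in for a token-level decoder loss; dimensions and the decoder are illustrative.

```python
import torch
import torch.nn.functional as F
from torch import nn

# Hypothetical CVAE for QA-pair generation: infer a latent z from (context, QA pair),
# reconstruct the QA pair from (context, z), and train with an ELBO-style loss.
class QAPairCVAE(nn.Module):
    def __init__(self, hidden=512, latent=64):
        super().__init__()
        self.to_mu = nn.Linear(2 * hidden, latent)
        self.to_logvar = nn.Linear(2 * hidden, latent)
        self.decoder = nn.Linear(hidden + latent, hidden)  # stands in for a seq2seq decoder

    def forward(self, context_rep, qa_rep):
        h = torch.cat([context_rep, qa_rep], dim=-1)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        recon = self.decoder(torch.cat([context_rep, z], dim=-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        recon_loss = F.mse_loss(recon, qa_rep)  # placeholder for the token-level NLL
        return recon_loss + kl
```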
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.