Exploring Dual Encoder Architectures for Question Answering
- URL: http://arxiv.org/abs/2204.07120v1
- Date: Thu, 14 Apr 2022 17:21:14 GMT
- Title: Exploring Dual Encoder Architectures for Question Answering
- Authors: Zhe Dong, Jianmo Ni, Dan Bikel, Enrique Alfonseca, Yuan Wang, Chen Qu,
Imed Zitouni
- Abstract summary: Dual encoders have been used for question-answering (QA) and information retrieval (IR) tasks with good results.
There are two major types of dual encoders, Siamese Dual Encoders (SDE) and Asymmetric Dual Encoders (ADE).
- Score: 17.59582094233306
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dual encoders have been used for question-answering (QA) and information
retrieval (IR) tasks with good results. There are two major types of dual
encoders, Siamese Dual Encoders (SDE), with parameters shared across two
encoders, and Asymmetric Dual Encoders (ADE), with two distinctly parameterized
encoders. In this work, we explore dual encoder architectures for QA
retrieval tasks. By evaluating on MS MARCO and the MultiReQA benchmark, we show
that SDE performs significantly better than ADE. We further propose three
different improved versions of ADEs. Based on the evaluation of QA retrieval
tasks and direct analysis of the embeddings, we demonstrate that sharing
parameters in the projection layers enables ADEs to perform competitively with
SDEs.
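To make the distinction concrete, here is a minimal PyTorch sketch of the three setups the abstract discusses: an SDE (one shared encoder), an ADE (two independent encoders), and an ADE with a shared projection layer. The tiny feed-forward encoders and all sizes are illustrative placeholders, not the paper's actual models.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualEncoder(nn.Module):
    """share_encoders=True gives an SDE; False gives an ADE.
    share_projection=True gives the ADE variant with a shared
    projection layer, which the paper finds competitive with SDE."""
    def __init__(self, dim=128, share_encoders=True, share_projection=False):
        super().__init__()
        def make_encoder():
            return nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.q_enc = make_encoder()
        self.d_enc = self.q_enc if share_encoders else make_encoder()
        self.q_proj = nn.Linear(dim, dim)
        self.d_proj = self.q_proj if (share_encoders or share_projection) else nn.Linear(dim, dim)

    def forward(self, q, d):
        q_emb = F.normalize(self.q_proj(self.q_enc(q)), dim=-1)
        d_emb = F.normalize(self.d_proj(self.d_enc(d)), dim=-1)
        return q_emb @ d_emb.T  # cosine-similarity score matrix

sde = DualEncoder(share_encoders=True)
ade = DualEncoder(share_encoders=False)
ade_spl = DualEncoder(share_encoders=False, share_projection=True)
scores = sde(torch.randn(4, 128), torch.randn(8, 128))  # (4, 8)
```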
Related papers
- SamToNe: Improving Contrastive Loss for Dual Encoder Retrieval Models with Same Tower Negatives [4.864332428224798]
A standard way to train dual encoders is using a contrastive loss with in-batch negatives.
In this work, we propose an improved contrastive learning objective that adds queries or documents from the same encoder tower to the negatives.
We demonstrate that SamToNe can effectively improve the retrieval quality for both symmetric and asymmetric dual encoders.
arXiv Detail & Related papers (2023-06-05T00:43:37Z)
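A minimal sketch of SamToNe's same-tower-negatives idea, assuming L2-normalized query and document embeddings; the temperature and exact weighting here are assumptions, and the paper's objective may differ in detail.

```python
import torch
import torch.nn.functional as F

def samtone_style_loss(q_emb, d_emb, tau=0.05):
    """In-batch contrastive loss where the other queries in the batch
    (i.e. the same encoder tower) also act as negatives for each query."""
    b = q_emb.size(0)
    sim_qd = q_emb @ d_emb.T / tau                   # (B, B), positives on diagonal
    sim_qq = q_emb @ q_emb.T / tau                   # (B, B), same-tower similarities
    eye = torch.eye(b, dtype=torch.bool, device=q_emb.device)
    sim_qq = sim_qq.masked_fill(eye, float("-inf"))  # a query is not its own negative
    logits = torch.cat([sim_qd, sim_qq], dim=1)      # (B, 2B)
    labels = torch.arange(b, device=q_emb.device)    # positive is column i of sim_qd
    return F.cross_entropy(logits, labels)
```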
- A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering [16.52970318866536]
Knowledge-Intensive Visual Question Answering (KI-VQA) refers to answering a question about an image whose answer does not lie in the image.
This paper presents a new pipeline for KI-VQA tasks, consisting of a retriever and a reader.
arXiv Detail & Related papers (2023-04-26T16:14:39Z)
- Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical dual encoders [89.29256833403169]
We introduce Kullback Leibler Alignment of Embeddings (KALE), an efficient and accurate method for increasing the inference efficiency of dense retrieval methods.
KALE extends traditional Knowledge Distillation after bi-encoder training, allowing for effective query encoder compression without full retraining or index generation.
Using KALE and asymmetric training, we can generate models which exceed the performance of DistilBERT while having 3x faster inference.
arXiv Detail & Related papers (2023-03-31T15:44:13Z)
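The KALE summary above suggests a post-hoc distillation setup: a small student query encoder is trained against a frozen teacher and a fixed document index, so no re-indexing is needed. Below is a hedged sketch of one such step; the function names and the score-distribution KL formulation are assumptions, not KALE's exact objective.

```python
import torch
import torch.nn.functional as F

def kale_style_step(student_q_enc, teacher_q_enc, doc_emb, queries, opt, tau=1.0):
    # The teacher and the document index stay frozen, so the existing
    # index can be reused; only the small student query encoder trains.
    with torch.no_grad():
        t_scores = teacher_q_enc(queries) @ doc_emb.T / tau
    s_scores = student_q_enc(queries) @ doc_emb.T / tau
    loss = F.kl_div(F.log_softmax(s_scores, dim=-1),
                    F.log_softmax(t_scores, dim=-1),
                    log_target=True, reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```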
- Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering [85.08146789409354]
We propose a new architecture, Task-Aware Specialization for dense Retrieval (TASER).
TASER enables parameter sharing by interleaving shared and specialized blocks in a single encoder.
Our experiments show that TASER can achieve superior accuracy, surpassing BM25, while using only about 60% of the parameters of bi-encoder dense retrievers.
arXiv Detail & Related papers (2022-10-11T05:33:25Z)
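A sketch of the interleaving idea TASER's summary describes: one encoder alternates shared blocks with input-type-specialized blocks. The block internals and the query/passage routing are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class TaserStyleEncoder(nn.Module):
    """A single encoder that alternates shared blocks with blocks
    specialized for the input type (query vs. passage)."""
    def __init__(self, dim=128, n_layers=3):
        super().__init__()
        def block():
            return nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.shared = nn.ModuleList(block() for _ in range(n_layers))
        self.specialized = nn.ModuleList(
            nn.ModuleDict({"query": block(), "passage": block()})
            for _ in range(n_layers))

    def forward(self, x, kind):  # kind is "query" or "passage"
        for shared, spec in zip(self.shared, self.specialized):
            x = spec[kind](shared(x))  # shared block, then a specialized one
        return x

enc = TaserStyleEncoder()
q = enc(torch.randn(2, 128), "query")
p = enc(torch.randn(2, 128), "passage")
```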
- Revisiting Code Search in a Two-Stage Paradigm [67.02322603435628]
TOSS is a two-stage fusion code search framework.
It first uses IR-based and bi-encoder models to efficiently recall a small number of top-k code candidates.
It then uses fine-grained cross-encoders for finer ranking.
arXiv Detail & Related papers (2022-08-24T02:34:27Z)
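The two-stage recipe above is easy to sketch: a cheap bi-encoder recall stage followed by cross-encoder reranking. The encoder callables below are placeholders, not TOSS's actual models.

```python
import torch

def two_stage_search(query, corpus, bi_encoder, cross_encoder, k=50):
    # Stage 1: cheap dot-product recall over independently encoded candidates.
    q_emb = bi_encoder(query)                             # (dim,)
    c_emb = torch.stack([bi_encoder(c) for c in corpus])  # (N, dim)
    topk = (c_emb @ q_emb).topk(min(k, len(corpus))).indices.tolist()
    # Stage 2: expensive fine-grained reranking of the recalled candidates only.
    reranked = sorted(topk, key=lambda i: float(cross_encoder(query, corpus[i])),
                      reverse=True)
    return [corpus[i] for i in reranked]
```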
- Enhancing Dual-Encoders with Question and Answer Cross-Embeddings for Answer Retrieval [29.16807969384253]
Dual-Encoders are a promising mechanism for answer retrieval in question answering (QA) systems.
We propose a framework that enhances the Dual-Encoders model with question-answer cross-embeddings and a novel Geometry Alignment Mechanism (GAM).
Our framework significantly improves the Dual-Encoders model and outperforms the state-of-the-art method on multiple answer retrieval datasets.
arXiv Detail & Related papers (2022-06-07T02:39:24Z)
- LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval [117.15862403330121]
We propose LoopITR, which combines dual encoders and cross encoders in the same network for joint learning.
Specifically, we let the dual encoder provide hard negatives to the cross encoder, and use the more discriminative cross encoder to distill its predictions back to the dual encoder.
arXiv Detail & Related papers (2022-03-10T16:41:12Z)
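A hedged sketch of the LoopITR training loop described above: the dual encoder mines hard in-batch negatives for the cross encoder, and the cross encoder's predictions are distilled back into the dual encoder. All function signatures and the specific losses are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def loopitr_style_losses(dual_sim, cross_score, images, texts, tau=0.05):
    b = len(texts)
    sim = dual_sim(images, texts)  # (B, B) dual-encoder similarity matrix
    # The dual encoder mines the hardest in-batch negative text per image.
    eye = torch.eye(b, dtype=torch.bool, device=sim.device)
    hard = sim.masked_fill(eye, float("-inf")).argmax(dim=1)
    # The cross encoder scores positives and the mined hard negatives.
    pos = cross_score(images, texts)                          # (B,)
    neg = cross_score(images, [texts[int(j)] for j in hard])  # (B,)
    cross_loss = F.binary_cross_entropy_with_logits(
        torch.cat([pos, neg]),
        torch.cat([torch.ones_like(pos), torch.zeros_like(neg)]))
    # The cross encoder's predictions are distilled back into the dual encoder.
    teacher = torch.stack([pos, neg], dim=1).softmax(dim=-1).detach()
    student = torch.stack([sim.diagonal(), sim[torch.arange(b), hard]], dim=1) / tau
    distill_loss = F.kl_div(student.log_softmax(dim=-1), teacher,
                            reduction="batchmean")
    return cross_loss, distill_loss
```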
- Question Answering Infused Pre-training of General-Purpose Contextualized Representations [70.62967781515127]
We propose a pre-training objective based on question answering (QA) for learning general-purpose contextual representations.
We accomplish this goal by training a bi-encoder QA model, which independently encodes passages and questions, to match the predictions of a more accurate cross-encoder model.
We show large improvements over both RoBERTa-large and previous state-of-the-art results on zero-shot and few-shot paraphrase detection.
arXiv Detail & Related papers (2021-06-15T14:45:15Z)
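The bi-encoder-matches-cross-encoder idea above can be sketched as token-level distillation: the bi-encoder, which never sees question and passage together, is trained to reproduce the cross encoder's answer-start distribution. Shapes and names below are assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def quip_style_loss(bi_q_enc, bi_p_enc, cross_start_logits, question, passage_tokens):
    q_vec = bi_q_enc(question)            # (dim,) question encoded on its own
    tok_vecs = bi_p_enc(passage_tokens)   # (T, dim) passage encoded on its own
    # The bi-encoder scores each token as an answer start via a dot product,
    # and is trained to match the cross encoder's start distribution.
    student = (tok_vecs @ q_vec).log_softmax(dim=-1)       # (T,)
    teacher = cross_start_logits.softmax(dim=-1).detach()  # (T,)
    return F.kl_div(student, teacher, reduction="sum")
```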
- Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation [71.54816893482457]
We introduce the dual-decoder Transformer, a new model architecture that jointly performs automatic speech recognition (ASR) and multilingual speech translation (ST).
Our models are based on the original Transformer architecture but consist of two decoders, each responsible for one task (ASR or ST).
arXiv Detail & Related papers (2020-11-02T04:59:50Z)
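A minimal sketch of the dual-decoder layout: one shared speech encoder feeds two task-specific Transformer decoders, one for ASR and one for ST. Any interaction between the two decoders is omitted; token embedding layers and all sizes are simplified assumptions.

```python
import torch
import torch.nn as nn

class DualDecoderTransformer(nn.Module):
    def __init__(self, d_model=256, vocab=1000):
        super().__init__()
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2)
        def decoder():
            return nn.TransformerDecoder(
                nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True),
                num_layers=2)
        self.asr_decoder, self.st_decoder = decoder(), decoder()
        self.asr_head = nn.Linear(d_model, vocab)
        self.st_head = nn.Linear(d_model, vocab)

    def forward(self, speech_feats, asr_prev, st_prev):
        memory = self.encoder(speech_feats)  # one shared acoustic encoding
        asr = self.asr_head(self.asr_decoder(asr_prev, memory))
        st = self.st_head(self.st_decoder(st_prev, memory))
        return asr, st

model = DualDecoderTransformer()
speech = torch.randn(2, 50, 256)  # (batch, frames, d_model) acoustic features
asr_in = torch.randn(2, 10, 256)  # embedded previous ASR tokens
st_in = torch.randn(2, 12, 256)   # embedded previous ST tokens
asr_logits, st_logits = model(speech, asr_in, st_in)
```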
This list is automatically generated from the titles and abstracts of the papers on this site.