ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking
Inference
- URL: http://arxiv.org/abs/2204.11458v1
- Date: Mon, 25 Apr 2022 06:26:29 GMT
- Title: ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking
Inference
- Authors: Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji
Ma, Jai Prakash Gupta, Cicero Nogueira dos Santos, Yi Tay, Don Metzler
- Abstract summary: This paper proposes a new training and inference paradigm for re-ranking.
We finetune a pretrained encoder-decoder model in the form of document-to-query generation.
We show that this encoder-decoder architecture can be decomposed into a decoder-only language model during inference.
- Score: 70.36083572306839
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-of-the-art neural models typically encode document-query pairs using
cross-attention for re-ranking. To this end, models generally utilize an
encoder-only (like BERT) paradigm or an encoder-decoder (like T5) approach.
These paradigms, however, are not without flaws, i.e., running the model on all
query-document pairs at inference-time incurs a significant computational cost.
This paper proposes a new training and inference paradigm for re-ranking. We
propose to finetune a pretrained encoder-decoder model in the form of
document-to-query generation. Subsequently, we show that this encoder-decoder
architecture can be decomposed into a decoder-only language model during
inference. This results in significant inference time speedups since the
decoder-only architecture only needs to learn to interpret static encoder
embeddings during inference. Our experiments show that this new paradigm
achieves results that are comparable to the more expensive cross-attention
ranking approaches while being up to 6.8X faster. We believe this work paves
the way for more efficient neural rankers that leverage large pretrained
models.
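The core inference trick lends itself to a short illustration: documents are encoded once offline, and at query time only the decoder runs, scoring each candidate by the likelihood of the query given the cached document representation. The sketch below is a minimal approximation of that idea using an off-the-shelf T5 checkpoint from Hugging Face Transformers; the checkpoint name, helper functions, and the use of cached encoder states in place of the paper's full decoder-only decomposition are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of ED2LM-style scoring: rank documents by the log-likelihood
# of the query under a seq2seq model finetuned for document-to-query generation,
# reusing precomputed (static) document encoder states so that only the decoder
# runs per query-document pair. "t5-base" is a generic stand-in checkpoint.
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # assumed stand-in, not the paper's model
model = T5ForConditionalGeneration.from_pretrained("t5-base").eval()

@torch.no_grad()
def precompute_doc_states(documents):
    """Offline step: encode each document once and cache the encoder states."""
    enc = tokenizer(documents, return_tensors="pt", padding=True, truncation=True)
    states = model.encoder(**enc).last_hidden_state
    return [(states[i:i + 1], enc.attention_mask[i:i + 1]) for i in range(len(documents))]

@torch.no_grad()
def score(query, doc_state, doc_mask):
    """Online step: query log-likelihood given cached document states (decoder only)."""
    labels = tokenizer(query, return_tensors="pt").input_ids
    out = model(
        encoder_outputs=BaseModelOutput(last_hidden_state=doc_state),
        attention_mask=doc_mask,
        labels=labels,
    )
    # out.loss is the mean token cross-entropy; negate and rescale to a summed log-prob.
    return -out.loss.item() * labels.shape[1]

docs = ["ED2LM decomposes an encoder-decoder into a decoder-only LM.",
        "Diffusion models denoise images over many time steps."]
cached = precompute_doc_states(docs)
ranking = sorted(range(len(docs)),
                 key=lambda i: score("faster document re-ranking", *cached[i]),
                 reverse=True)
print(ranking)
```

In the paper's formulation the cached encoder outputs are consumed by what is effectively a decoder-only language model, so the per-query cost is dominated by decoding the short query rather than re-encoding the long document; the sketch above only mimics that behavior by caching encoder states.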
Related papers
- Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference [95.42299246592756]
We study the UNet encoder and empirically analyze the encoder features.
We find that encoder features change minimally, whereas the decoder features exhibit substantial variations across different time-steps.
We validate our approach on other tasks: text-to-video, personalized generation and reference-guided generation.
arXiv Detail & Related papers (2023-12-15T08:46:43Z)
- Hierarchical Attention Encoder Decoder [2.4366811507669115]
Autoregressive modeling can generate complex and novel sequences that have many real-world applications.
These models must generate outputs autoregressively, which becomes time-consuming when dealing with long sequences.
We propose a model based on the Hierarchical Recurrent Decoder architecture.
arXiv Detail & Related papers (2023-06-01T18:17:23Z)
- Improving Code Search with Hard Negative Sampling Based on Fine-tuning [15.341959871682981]
We introduce a cross-encoder architecture for code search that jointly encodes the concatenation of query and code.
We also introduce a Retriever-Ranker (RR) framework that cascades the dual-encoder and cross-encoder to improve the efficiency of evaluation and online serving (a minimal cascade sketch follows this list).
arXiv Detail & Related papers (2023-05-08T07:04:28Z)
- Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder [75.03283861464365]
The seq2seq task aims at generating the target sequence based on the given input source sequence.
Traditionally, most seq2seq tasks are solved with an encoder that encodes the source sequence and a decoder that generates the target text.
Recently, a number of new approaches have emerged that apply decoder-only language models directly to the seq2seq task.
arXiv Detail & Related papers (2023-04-08T15:44:29Z)
- Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data [145.95460945321253]
We introduce two pre-training tasks for the encoder-decoder network using acoustic units, i.e., pseudo codes.
The proposed Speech2C reduces the word error rate (WER) by a relative 19.2% over the method without decoder pre-training.
arXiv Detail & Related papers (2022-03-31T15:33:56Z)
- UniXcoder: Unified Cross-Modal Pre-training for Code Representation [65.6846553962117]
We present UniXcoder, a unified cross-modal pre-trained model for programming language.
We propose a one-to-one mapping method to transform an AST into a sequence structure that retains all structural information from the tree.
We evaluate UniXcoder on five code-related tasks over nine datasets.
arXiv Detail & Related papers (2022-03-08T04:48:07Z)
- Cross-Thought for Sentence Encoder Pre-training [89.32270059777025]
Cross-Thought is a novel approach to pre-training a sequence encoder.
We train a Transformer-based sequence encoder over a large set of short sequences.
Experiments on question answering and textual entailment tasks demonstrate that our pre-trained encoder can outperform state-of-the-art encoders.
arXiv Detail & Related papers (2020-10-07T21:02:41Z)
- Regularized Forward-Backward Decoder for Attention Models [5.257115841810258]
We propose a novel regularization technique incorporating a second decoder during the training phase.
This decoder is optimized on time-reversed target labels beforehand and supports the standard decoder during training by adding knowledge from future context.
We evaluate our approach on the smaller TEDLIUMv2 and the larger LibriSpeech dataset, achieving consistent improvements on both of them.
arXiv Detail & Related papers (2020-06-15T16:04:16Z)
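As a side note on the Retriever-Ranker cascade mentioned in the code-search entry above, the pattern itself is straightforward to sketch: a fast dual-encoder narrows the candidate pool, then a slower cross-encoder re-scores only the survivors. The snippet below uses generic public text-ranking checkpoints from the sentence-transformers library purely as stand-ins; it is not the paper's code-search implementation.

```python
# Illustrative retrieve-then-rerank cascade: dual-encoder retrieval followed by
# cross-encoder re-scoring. Checkpoints are generic public models used as stand-ins.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

retriever = SentenceTransformer("all-MiniLM-L6-v2")                  # dual-encoder (assumed stand-in)
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")      # cross-encoder (assumed stand-in)

corpus = [
    "def binary_search(arr, x): ...",
    "def quicksort(arr): ...",
    "def read_csv(path): ...",
]
query = "sort a list quickly"

# Stage 1: cheap dense retrieval over the whole corpus.
corpus_emb = retriever.encode(corpus, convert_to_tensor=True)
query_emb = retriever.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_emb, corpus_emb, top_k=2)[0]

# Stage 2: expensive cross-encoder scores only the retrieved candidates.
pairs = [(query, corpus[h["corpus_id"]]) for h in hits]
scores = reranker.predict(pairs)
for (q, doc), s in sorted(zip(pairs, scores), key=lambda t: t[1], reverse=True):
    print(f"{s:.3f}  {doc}")
```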
This list is automatically generated from the titles and abstracts of the papers on this site.