The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained
Sequence-to-Sequence Models
- URL: http://arxiv.org/abs/2101.05667v1
- Date: Thu, 14 Jan 2021 15:29:54 GMT
- Title: The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained
Sequence-to-Sequence Models
- Authors: Ronak Pradeep, Rodrigo Nogueira, and Jimmy Lin
- Abstract summary: We propose a design pattern for tackling text ranking problems, dubbed "Expando-Mono-Duo".
At the core, our design relies on pretrained sequence-to-sequence models within a standard multi-stage ranking architecture.
We present experimental results from the MS MARCO passage and document ranking tasks, the TREC 2020 Deep Learning Track, and the TREC-COVID challenge that validate our design.
- Score: 34.94331039746062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a design pattern for tackling text ranking problems, dubbed
"Expando-Mono-Duo", that has been empirically validated for a number of ad hoc
retrieval tasks in different domains. At the core, our design relies on
pretrained sequence-to-sequence models within a standard multi-stage ranking
architecture. "Expando" refers to the use of document expansion techniques to
enrich keyword representations of texts prior to inverted indexing. "Mono" and
"Duo" refer to components in a reranking pipeline based on a pointwise model
and a pairwise model that rerank initial candidates retrieved using keyword
search. We present experimental results from the MS MARCO passage and document
ranking tasks, the TREC 2020 Deep Learning Track, and the TREC-COVID challenge
that validate our design. In all these tasks, we achieve effectiveness that is
at or near the state of the art, in some cases using a zero-shot approach that
does not exploit any training data from the target task. To support
replicability, implementations of our design pattern are open-sourced in the
Pyserini IR toolkit and PyGaggle neural reranking library.
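To make the pattern concrete, here is a minimal sketch of all three stages built on the Pyserini and PyGaggle toolkits named above. The prebuilt index name, the candidate depths (1000 for mono, 50 for duo), and the exact API details are illustrative assumptions rather than the paper's reference configuration; consult the toolkit documentation for the current interfaces.
```python
# Minimal Expando-Mono-Duo sketch with Pyserini + PyGaggle.
# Index name, depths, and API details are assumptions for illustration.
from pyserini.search.lucene import LuceneSearcher
from pygaggle.rerank.base import Query, Text
from pygaggle.rerank.transformer import MonoT5, DuoT5

# "Expando": search an index whose documents were enriched with
# doc2query-T5 predicted queries before inverted indexing.
searcher = LuceneSearcher.from_prebuilt_index('msmarco-passage-expanded')

query = Query('what is the expando-mono-duo design pattern')
hits = searcher.search(query.text, k=1000)  # first-stage BM25 retrieval

# hit.raw holds the stored document; it may be JSON depending on the index.
candidates = [Text(hit.raw, {'docid': hit.docid}, hit.score) for hit in hits]

# "Mono": pointwise reranking of all first-stage candidates with monoT5.
mono = MonoT5()
candidates = mono.rerank(query, candidates)
candidates.sort(key=lambda t: t.score, reverse=True)

# "Duo": pairwise reranking of only the top candidates, since the
# pairwise model must score O(k^2) candidate pairs.
duo = DuoT5()
top = duo.rerank(query, candidates[:50])
top.sort(key=lambda t: t.score, reverse=True)

for t in top[:10]:
    print(t.metadata['docid'], t.score)
```
Cutting the duo stage to the top 50 candidates reflects the pairwise model's quadratic cost: it scores every ordered pair of candidates, so it is applied only to the short list produced by the mono stage.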
Related papers
- Generative Compositor for Few-Shot Visual Information Extraction [60.663887314625164]
We propose a novel generative model, named Generative Compositor, to address the challenge of few-shot visual information extraction (VIE).
The Generative Compositor is a hybrid pointer-generator network that emulates the operations of a compositor by retrieving words from the source text.
The proposed method achieves highly competitive results in full-sample training, while notably outperforming the baseline in the 1-shot, 5-shot, and 10-shot settings.
arXiv Detail & Related papers (2025-03-21T04:56:24Z)
- CorpusLM: Towards a Unified Language Model on Corpus for Knowledge-Intensive Tasks [20.390672895839757]
Retrieval-augmented generation (RAG) has emerged as a popular solution to enhance factual accuracy.
Traditional retrieval modules often rely on a large document index and are disconnected from generative tasks.
We propose CorpusLM, a unified language model that integrates generative retrieval, closed-book generation, and RAG.
arXiv Detail & Related papers (2024-02-02T06:44:22Z)
- KOPPA: Improving Prompt-based Continual Learning with Key-Query Orthogonal Projection and Prototype-based One-Versus-All [24.50129285997307]
We introduce a novel key-query learning strategy to enhance prompt matching efficiency and address the challenge of shifting features.
Our method empowers the model to achieve results surpassing those of current state-of-the-art approaches by a large margin of up to 20%.
arXiv Detail & Related papers (2023-11-26T20:35:19Z)
- Continual Referring Expression Comprehension via Dual Modular Memorization [133.46886428655426]
Referring Expression Comprehension (REC) aims to localize the image region of a given object described by a natural-language expression.
Existing REC algorithms make the strong assumption that the training data fed into the model are given upfront, which degrades their practicality in real-world scenarios.
In this paper, we propose Continual Referring Expression Comprehension (CREC), a new setting for REC in which a model learns on a stream of incoming tasks.
To continuously improve the model on sequential tasks without forgetting previously learned knowledge and without repeatedly retraining from scratch, we propose an effective baseline method named Dual Modular Memorization.
arXiv Detail & Related papers (2023-11-25T02:58:51Z)
- Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery.
Such a cross-modal retrieval task is quite challenging due to the significant modality gap, fine-grained differences, and the insufficiency of annotated data.
In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
- Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations [63.04466647849211]
Existing methods typically encode task information with a simple dataset name as a prefix to the encoder.
We propose compositional task configurations, a set of prompts prepended to the encoder input to improve cross-task generalization.
We show that this not only allows the model to better learn knowledge shared across different tasks during training, but also lets us control the model by composing new configurations.
arXiv Detail & Related papers (2022-12-17T02:20:14Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking (sketched below).
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
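The summary names Dynamic Blocking without explaining it. As we understand the algorithm (an assumption for illustration, not the authors' released code): during decoding, whenever the model reproduces a source token, the source's immediate successor token is probabilistically blocked at the next step, steering generation away from verbatim copies. A toy logit-mask sketch:
```python
# Toy sketch of the Dynamic Blocking idea as we understand it -- an
# assumption, not the authors' code. If the last generated token also
# appears in the source, the source token that immediately follows it is
# (probabilistically) blocked, forcing the decoder away from copying.
from typing import Sequence
import torch

def dynamic_blocking_mask(source_ids: Sequence[int],
                          generated_ids: Sequence[int],
                          vocab_size: int,
                          block_prob: float = 0.5) -> torch.Tensor:
    """Additive logit mask for the next decoding step."""
    mask = torch.zeros(vocab_size)
    if not generated_ids:
        return mask
    last = generated_ids[-1]
    for i, tok in enumerate(source_ids[:-1]):
        # We just copied source token i: forbid its successor i+1 so the
        # model must pick a different continuation (with prob. block_prob).
        if tok == last and torch.rand(()).item() < block_prob:
            mask[source_ids[i + 1]] = float('-inf')
    return mask

# Usage inside a decoding loop (model_logits has shape [vocab_size]):
#   model_logits += dynamic_blocking_mask(src_ids, gen_ids, model_logits.numel())
```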
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used for the KE task and has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt Dynamic Graph Convolutional Networks (DGCN) to address these problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
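This entry describes the monoT5 scheme used as the "Mono" stage above: relevance is cast as generating the word "true" or "false" after a "Query: ... Document: ... Relevant:" prompt, and the score is the probability assigned to "true". Below is a hedged sketch with Hugging Face Transformers; the checkpoint name and the single-piece tokenization of "true"/"false" are assumptions to verify against the released models.
```python
# Hedged sketch of monoT5 pointwise scoring: the relevance score is the
# probability of generating "true" at the first decoding step.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = 'castorini/monot5-base-msmarco'  # assumed released checkpoint
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name).eval()

def relevance_score(query: str, document: str) -> float:
    """Score one query-document pair with a single decoding step."""
    prompt = f'Query: {query} Document: {document} Relevant:'
    inputs = tokenizer(prompt, return_tensors='pt',
                       truncation=True, max_length=512)
    # One decoder step from the start token; read the next-token logits.
    start = torch.full((1, 1), model.config.decoder_start_token_id)
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=start).logits[0, -1]
    true_id = tokenizer.encode('true')[0]    # assumes 'true' is one piece
    false_id = tokenizer.encode('false')[0]  # assumes 'false' is one piece
    probs = torch.softmax(logits[[false_id, true_id]], dim=0)
    return probs[1].item()  # probability mass on "true"

print(relevance_score('what causes rain',
                      'Rain forms when atmospheric water vapor condenses.'))
```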
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.