Curriculum Learning for Dense Retrieval Distillation
- URL: http://arxiv.org/abs/2204.13679v1
- Date: Thu, 28 Apr 2022 17:42:21 GMT
- Title: Curriculum Learning for Dense Retrieval Distillation
- Authors: Hansi Zeng, Hamed Zamani, Vishwa Vinay
- Abstract summary: We propose a generic curriculum learning based optimization framework called CL-DRD.
CL-DRD controls the difficulty level of training data produced by the re-ranking (teacher) model.
Experiments on three public passage retrieval datasets demonstrate the effectiveness of our proposed framework.
- Score: 20.25741148622744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has shown that more effective dense retrieval models can be
obtained by distilling ranking knowledge from an existing base re-ranking
model. In this paper, we propose a generic curriculum learning based
optimization framework called CL-DRD that controls the difficulty level of
training data produced by the re-ranking (teacher) model. CL-DRD iteratively
optimizes the dense retrieval (student) model by increasing the difficulty of
the knowledge distillation data made available to it. In more detail, we
initially provide the student model with coarse-grained preference pairs between
documents in the teacher's ranking, and progressively move towards finer-grained
pairwise document ordering requirements. In our experiments, we apply a simple
implementation of the CL-DRD framework to enhance two state-of-the-art dense
retrieval models. Experiments on three public passage retrieval datasets
demonstrate the effectiveness of our proposed framework.
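The coarse-to-fine schedule described in the abstract can be sketched as follows. This is a minimal illustration of the curriculum idea only, not the paper's actual implementation: the function name `curriculum_pairs` and the halving group-size schedule are assumptions made for this sketch.

```python
def curriculum_pairs(teacher_ranking, stage):
    """Generate (preferred, non-preferred) document pairs from a teacher
    ranking, at increasing granularity as `stage` grows.

    stage 0: coarse -- the ranking is split into large groups, and pairs
             only constrain ordering *between* groups (no ordering is
             imposed among documents inside the same group).
    higher stages: groups shrink, so more fine-grained pairwise ordering
             constraints from the teacher's ranking are exposed.
    """
    n = len(teacher_ranking)
    # group size shrinks as the stage increases -> finer-grained pairs
    group = max(1, n // (2 ** (stage + 1)))
    groups = [teacher_ranking[i:i + group] for i in range(0, n, group)]
    pairs = []
    for gi, g in enumerate(groups):
        for lower in groups[gi + 1:]:
            for d_pos in g:
                for d_neg in lower:
                    pairs.append((d_pos, d_neg))
    return pairs
```

At stage 0 the student only learns that top-group documents beat lower-group documents; at later stages, previously tied documents become ordered pairs, which is the increasing-difficulty signal the framework relies on.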
Related papers
- Retrieval-Oriented Knowledge for Click-Through Rate Prediction [29.55757862617378]
This paper proposes a universal plug-and-play Retrieval-Oriented Knowledge (ROK) framework.
A knowledge base, consisting of a retrieval-oriented embedding layer and a knowledge encoder, is designed to preserve and imitate the retrieved and aggregated representations.
Experiments on three large-scale datasets show that ROK achieves competitive performance with the retrieval-based CTR models.
arXiv Detail & Related papers (2024-04-28T20:21:03Z) - LLM-Augmented Retrieval: Enhancing Retrieval Models Through Language Models and Doc-Level Embedding [2.0257616108612373]
This paper introduces a model-agnostic doc-level embedding framework through large language model augmentation.
With this framework, we significantly improve the effectiveness of widely used retriever models.
arXiv Detail & Related papers (2024-04-08T19:29:07Z) - Distillation Enhanced Generative Retrieval [96.69326099136289]
Generative retrieval is a promising new paradigm in text retrieval that generates identifier strings of relevant passages as the retrieval target.
In this work, we identify a viable direction to further enhance generative retrieval via distillation and propose a feasible framework, named DGR.
We conduct experiments on four public datasets, and the results indicate that DGR achieves state-of-the-art performance among the generative retrieval methods.
arXiv Detail & Related papers (2024-02-16T15:48:24Z) - Let All be Whitened: Multi-teacher Distillation for Efficient Visual Retrieval [57.17075479691486]
We propose a multi-teacher distillation framework Whiten-MTD, which is able to transfer knowledge from off-the-shelf pre-trained retrieval models to a lightweight student model for efficient visual retrieval.
Our source code is released at https://github.com/Maryeon/whiten_mtd.
arXiv Detail & Related papers (2023-12-15T11:43:56Z) - Zero-Shot Listwise Document Reranking with a Large Language Model [58.64141622176841]
We propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data.
Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker.
arXiv Detail & Related papers (2023-05-03T14:45:34Z) - EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval [83.79667141681418]
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
We propose a novel distillation approach that leverages the relative geometry among queries and documents learned by the large teacher model.
We show that our approach successfully distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to 1/10th size asymmetric students that can retain 95-97% of the teacher performance.
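One way to read "distilling relative geometry" is to match the student's query-document similarity structure to the teacher's, rather than matching individual scores pointwise. The following is a toy numpy sketch under that reading; the function name and the plain MSE objective are assumptions for illustration, not the paper's actual loss.

```python
import numpy as np

def geometric_distill_loss(q_student, d_student, q_teacher, d_teacher):
    """Toy embedding-geometry distillation loss (hypothetical sketch):
    align the student's query-document similarity matrix with the
    teacher's, so the student inherits the teacher's relative geometry
    among queries and documents."""
    s_student = q_student @ d_student.T   # (n_queries, n_docs) similarities
    s_teacher = q_teacher @ d_teacher.T   # teacher's similarity matrix
    return float(np.mean((s_student - s_teacher) ** 2))
```

A loss of zero means the student reproduces the teacher's full similarity geometry, which is a stronger condition than agreeing on top-1 rankings alone.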
arXiv Detail & Related papers (2023-01-27T22:04:37Z) - Debias the Black-box: A Fair Ranking Framework via Knowledge Distillation [26.60241524303918]
We propose a fair information retrieval framework based on knowledge distillation.
This framework can improve the exposure-based fairness of models while considerably decreasing model size.
It also improves fairness performance by 15%-46% while maintaining a high level of recommendation effectiveness.
arXiv Detail & Related papers (2022-08-24T15:59:58Z) - CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, as all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z) - SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval [11.38022203865326]
The SPLADE model provides highly sparse representations and competitive results with respect to state-of-the-art dense and sparse approaches.
We modify the pooling mechanism, benchmark a model solely based on document expansion, and introduce models trained with distillation.
Overall, SPLADE is considerably improved, with more than 9% gains on NDCG@10 on TREC DL 2019, leading to state-of-the-art results on the BEIR benchmark.
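For reference, the SPLADE-style term-importance pooling that these modifications build on can be sketched as follows. This is a simplified numpy version; in the real model the logits come from a BERT masked-language-model head, which is omitted here.

```python
import numpy as np

def splade_pool(mlm_logits):
    """SPLADE-style pooling: for each vocabulary term, take the max over
    input token positions of log(1 + ReLU(logit)). The ReLU zeroes out
    negative logits and the log saturates large ones, which keeps the
    resulting vocabulary-sized vector sparse.

    mlm_logits: array of shape (seq_len, vocab_size) -- MLM logits for
    one passage (hypothetical input for this sketch)."""
    activated = np.log1p(np.maximum(mlm_logits, 0.0))
    return activated.max(axis=0)  # shape: (vocab_size,) term weights
```

Because most terms never get a positive logit at any position, most entries of the output are exactly zero, which is what makes the representation compatible with an inverted index.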
arXiv Detail & Related papers (2021-09-21T10:43:42Z) - Integrating Semantics and Neighborhood Information with Graph-Driven
Generative Models for Document Retrieval [51.823187647843945]
In this paper, we encode the neighborhood information with a graph-induced Gaussian distribution, and propose to integrate the two types of information with a graph-driven generative model.
Under the approximation, we prove that the training objective can be decomposed into terms involving only singleton or pairwise documents, enabling the model to be trained as efficiently as uncorrelated ones.
arXiv Detail & Related papers (2021-05-27T11:29:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.