RRADistill: Distilling LLMs' Passage Ranking Ability for Long-Tail Queries Document Re-Ranking on a Search Engine
- URL: http://arxiv.org/abs/2410.18097v3
- Date: Thu, 21 Nov 2024 14:23:49 GMT
- Title: RRADistill: Distilling LLMs' Passage Ranking Ability for Long-Tail Queries Document Re-Ranking on a Search Engine
- Authors: Nayoung Choi, Youngjune Lee, Gyu-Hwung Cho, Haeyu Jeong, Jungmin Kong, Saehun Kim, Keunchan Park, Sarah Cho, Inchang Jeong, Gyohee Nam, Sunghoon Han, Wonil Yang, Jaeho Choi,
- Abstract summary: Large Language Models (LLMs) excel at understanding the semantic relationships between queries and documents.
These queries are challenging for feedback-based rankings due to sparse user engagement and limited feedback.
We propose an efficient label generation pipeline and novel sLLM training methods for both encoder and decoder models.
- Score: 2.0379810233726126
- License:
- Abstract: Large Language Models (LLMs) excel at understanding the semantic relationships between queries and documents, even with lengthy and complex long-tail queries. These queries are challenging for feedback-based rankings due to sparse user engagement and limited feedback, making LLMs' ranking ability highly valuable. However, the large size and slow inference of LLMs necessitate the development of smaller, more efficient models (sLLMs). Recently, integrating ranking label generation into distillation techniques has become crucial, but existing methods underutilize LLMs' capabilities and are cumbersome. Our research, RRADistill: Re-Ranking Ability Distillation, propose an efficient label generation pipeline and novel sLLM training methods for both encoder and decoder models. We introduce an encoder-based method using a Term Control Layer to capture term matching signals and a decoder-based model with a ranking layer for enhanced understanding. A/B testing on a Korean-based search platform, validates the effectiveness of our approach in improving re-ranking for long-tail queries.
Related papers
- Best Practices for Distilling Large Language Models into BERT for Web Search Ranking [14.550458167328497]
Large Language Models (LLMs) can generate a ranked list of potential documents.
We transfer the ranking expertise of LLMs to a more compact model like BERT, using a ranking loss to enable the deployment of less resource-intensive models.
Our model has been successfully integrated into a commercial web search engine as of February 2024.
arXiv Detail & Related papers (2024-11-07T08:54:46Z) - FIRST: Faster Improved Listwise Reranking with Single Token Decoding [56.727761901751194]
First, we introduce FIRST, a novel listwise LLM reranking approach leveraging the output logits of the first generated identifier to directly obtain a ranked ordering of the candidates.
Empirical results demonstrate that FIRST accelerates inference by 50% while maintaining a robust ranking performance with gains across the BEIR benchmark.
Our results show that LLM rerankers can provide a stronger distillation signal compared to cross-encoders, yielding substantial improvements in retriever recall after relevance feedback.
arXiv Detail & Related papers (2024-06-21T21:27:50Z) - A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking [79.35822270532948]
Cross-encoders distilled from large language models (LLMs) are often more effective re-rankers than cross-encoders fine-tuned on manually labeled data.
We construct and release a new distillation dataset: Rank-DistiLLM.
arXiv Detail & Related papers (2024-05-13T16:51:53Z) - Improving Retrieval for RAG based Question Answering Models on Financial Documents [0.046603287532620746]
This paper explores the existing constraints of RAG pipelines and introduces methodologies for enhancing text retrieval.
It delves into strategies such as sophisticated chunking techniques, query expansion, the incorporation of metadata annotations, the application of re-ranking algorithms, and the fine-tuning of embedding algorithms.
arXiv Detail & Related papers (2024-03-23T00:49:40Z) - A Review of Repository Level Prompting for LLMs [0.0]
Large Language Models (LLMs) have led to notable successes, such as achieving a 94.6% solve rate on the HumanEval benchmark.
There is an increasing commercial push for repository-level inline code completion tools, such as GitHub Copilot and Tab Nine.
This paper delves into the transition from individual coding problems to repository-scale solutions.
arXiv Detail & Related papers (2023-12-15T00:34:52Z) - DQ-LoRe: Dual Queries with Low Rank Approximation Re-ranking for
In-Context Learning [66.85379279041128]
In this study, we introduce a framework that leverages Dual Queries and Low-rank approximation Re-ranking to automatically select exemplars for in-context learning.
DQ-LoRe significantly outperforms prior state-of-the-art methods in the automatic selection of exemplars for GPT-4, enhancing performance from 92.5% to 94.2%.
arXiv Detail & Related papers (2023-10-04T16:44:37Z) - From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets.
Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal metric to identify discrepancies between a model's expected responses and its intrinsic generation capability.
arXiv Detail & Related papers (2023-08-23T09:45:29Z) - Synergistic Interplay between Search and Large Language Models for
Information Retrieval [141.18083677333848]
InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z) - Zero-Shot Listwise Document Reranking with a Large Language Model [58.64141622176841]
We propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data.
Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker.
arXiv Detail & Related papers (2023-05-03T14:45:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.