Related papers: FIRST: Faster Improved Listwise Reranking with Single Token Decoding

FIRST: Faster Improved Listwise Reranking with Single Token Decoding

URL: http://arxiv.org/abs/2406.15657v1
Date: Fri, 21 Jun 2024 21:27:50 GMT
Title: FIRST: Faster Improved Listwise Reranking with Single Token Decoding
Authors: Revanth Gangi Reddy, JaeHyeok Doo, Yifei Xu, Md Arafat Sultan, Deevya Swain, Avirup Sil, Heng Ji,
Abstract summary: First, we introduce FIRST, a novel listwise LLM reranking approach leveraging the output logits of the first generated identifier to directly obtain a ranked ordering of the candidates. Empirical results demonstrate that FIRST accelerates inference by 50% while maintaining a robust ranking performance with gains across the BEIR benchmark. Our results show that LLM rerankers can provide a stronger distillation signal compared to cross-encoders, yielding substantial improvements in retriever recall after relevance feedback.
Score: 56.727761901751194
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have significantly advanced the field of information retrieval, particularly for reranking. Listwise LLM rerankers have showcased superior performance and generalizability compared to existing supervised approaches. However, conventional listwise LLM reranking methods lack efficiency as they provide ranking output in the form of a generated ordered sequence of candidate passage identifiers. Further, they are trained with the typical language modeling objective, which treats all ranking errors uniformly--potentially at the cost of misranking highly relevant passages. Addressing these limitations, we introduce FIRST, a novel listwise LLM reranking approach leveraging the output logits of the first generated identifier to directly obtain a ranked ordering of the candidates. Further, we incorporate a learning-to-rank loss during training, prioritizing ranking accuracy for the more relevant passages. Empirical results demonstrate that FIRST accelerates inference by 50% while maintaining a robust ranking performance with gains across the BEIR benchmark. Finally, to illustrate the practical effectiveness of listwise LLM rerankers, we investigate their application in providing relevance feedback for retrievers during inference. Our results show that LLM rerankers can provide a stronger distillation signal compared to cross-encoders, yielding substantial improvements in retriever recall after relevance feedback.

Related papers

REALM: Recursive Relevance Modeling for LLM-based Document Re-Ranking [7.183085245898039]
Large Language Models (LLMs) have shown strong capabilities in document re-ranking.<n>We propose REALM, an uncertainty-aware re-ranking framework.<n>We show it surpasses state-of-the-art re-rankers while significantly reducing token usage and latency.
arXiv Detail & Related papers (2025-08-25T18:13:50Z)
ProRank: Prompt Warmup via Reinforcement Learning for Small Language Models Reranking [8.244386008877441]
We introduce a novel two-stage training approach, ProRank, for SLM-based document reranking.<n>First, we propose a prompt warmup stage using reinforcement learning GRPO to steer SLMs to understand task prompts.<n>Then, we continuously fine-tune the SLMs with a fine-grained score learning stage without introducing additional layers to further improve the reranking quality.
arXiv Detail & Related papers (2025-06-04T02:00:44Z)
CoRanking: Collaborative Ranking with Small and Large Ranking Agents [39.98101653077503]
Large Language Models (LLMs) have demonstrated superior listwise ranking performance. CoRanking combines small and large ranking models for efficient and effective ranking.
arXiv Detail & Related papers (2025-03-30T13:00:52Z)
Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning [76.50690734636477]
We introduce Rank-R1, a novel LLM-based reranker that performs reasoning over both the user query and candidate documents before performing the ranking task. Our experiments on the TREC DL and BRIGHT datasets show that Rank-R1 is highly effective, especially for complex queries.
arXiv Detail & Related papers (2025-03-08T03:14:26Z)
Invar-RAG: Invariant LLM-aligned Retrieval for Better Generation [43.630437906898635]
We propose a novel two-stage fine-tuning architecture called Invar-RAG. In the retrieval stage, an LLM-based retriever is constructed by integrating LoRA-based representation learning. In the generation stage, a refined fine-tuning method is employed to improve LLM accuracy in generating answers based on retrieved information.
arXiv Detail & Related papers (2024-11-11T14:25:37Z)
An Early FIRST Reproduction and Improvements to Single-Token Decoding for Fast Listwise Reranking [50.81324768683995]
FIRST is a novel approach that integrates a learning-to-rank objective and leveraging the logits of only the first generated token. We extend the evaluation of FIRST to the TREC Deep Learning datasets (DL19-22), validating its robustness across diverse domains. Our experiments confirm that fast reranking with single-token logits does not compromise out-of-domain reranking quality.
arXiv Detail & Related papers (2024-11-08T12:08:17Z)
Self-Calibrated Listwise Reranking with Large Language Models [137.6557607279876]
Large language models (LLMs) have been employed in reranking tasks through a sequence-to-sequence approach. This reranking paradigm requires a sliding window strategy to iteratively handle larger candidate sets. We propose a novel self-calibrated listwise reranking method, which aims to leverage LLMs to produce global relevance scores for ranking.
arXiv Detail & Related papers (2024-11-07T10:31:31Z)
Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG [36.754491649652664]
Retrieval-augmented generation (RAG) empowers large language models (LLMs) to utilize external knowledge sources. This paper investigates the detrimental impact of retrieved "hard negatives" as a key contributor. To mitigate this and enhance the robustness of long-context LLM-based RAG, we propose both training-free and training-based approaches.
arXiv Detail & Related papers (2024-10-08T12:30:07Z)
Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models [17.420756201557957]
We propose PE-Rank, leveraging the single passage embedding as a good context compression for efficient listwise passage reranking. We introduce an inference method that dynamically constrains the decoding space to these special tokens, accelerating the decoding process. Results on multiple benchmarks demonstrate that PE-Rank significantly improves efficiency in both prefilling and decoding, while maintaining competitive ranking effectiveness.
arXiv Detail & Related papers (2024-06-21T03:33:51Z)
Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
We propose a new evaluation method, SQC-Score. Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score. Results on three information extraction tasks show that SQC-Score is more preferred by human annotators than the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z)
Make Large Language Model a Better Ranker [20.532118635672763]
This paper introduces the large language model framework with Aligned Listwise Ranking Objectives (ALRO) ALRO is designed to bridge the gap between the capabilities of LLMs and nuanced requirements of ranking tasks. Our evaluative studies reveal that ALRO outperforms both existing embedding-based recommendation methods and LLM-based recommendation baselines.
arXiv Detail & Related papers (2024-03-28T07:22:16Z)
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning [79.32236399694077]
Low-quality data in the training set are usually detrimental to instruction tuning. We propose a novel method, termed "reflection-tuning" This approach utilizes an oracle LLM to recycle the original training data by introspecting and enhancing the quality of instructions and responses in the data.
arXiv Detail & Related papers (2023-10-18T05:13:47Z)
Zero-Shot Listwise Document Reranking with a Large Language Model [58.64141622176841]
We propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data. Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker.
arXiv Detail & Related papers (2023-05-03T14:45:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.