Related papers: A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models

A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models

URL: http://arxiv.org/abs/2310.09497v2
Date: Thu, 30 May 2024 10:03:27 GMT
Title: A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models
Authors: Shengyao Zhuang, Honglei Zhuang, Bevan Koopman, Guido Zuccon,
Abstract summary: We propose a novel zero-shot document ranking approach based on Large Language Models (LLMs): the Setwise prompting approach. Our approach complements existing prompting approaches for LLM-based zero-shot ranking: Pointwise, Pairwise, and Listwise.
Score: 35.17291316942284
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We propose a novel zero-shot document ranking approach based on Large Language Models (LLMs): the Setwise prompting approach. Our approach complements existing prompting approaches for LLM-based zero-shot ranking: Pointwise, Pairwise, and Listwise. Through the first-of-its-kind comparative evaluation within a consistent experimental framework and considering factors like model size, token consumption, latency, among others, we show that existing approaches are inherently characterised by trade-offs between effectiveness and efficiency. We find that while Pointwise approaches score high on efficiency, they suffer from poor effectiveness. Conversely, Pairwise approaches demonstrate superior effectiveness but incur high computational overhead. Our Setwise approach, instead, reduces the number of LLM inferences and the amount of prompt token consumption during the ranking procedure, compared to previous methods. This significantly improves the efficiency of LLM-based zero-shot ranking, while also retaining high zero-shot ranking effectiveness. We make our code and results publicly available at \url{https://github.com/ielab/llm-rankers}.

Related papers

Direct Preference Optimization with Rating Information: Practical Algorithms and Provable Gains [67.71020482405343]
We study how to design algorithms that can leverage additional information in the form of rating gap.<n>We present new algorithms that can achieve faster statistical rates than DPO in presence of accurate rating gap information.
arXiv Detail & Related papers (2026-01-31T08:38:21Z)
ERank: Fusing Supervised Fine-Tuning and Reinforcement Learning for Effective and Efficient Text Reranking [33.25740773392183]
ERank is a highly effective and efficient pointwise reranker built from a reasoning LLM that excels across diverse relevance scenarios.<n>We propose a novel two-stage training pipeline that begins with Supervised Fine-Tuning (SFT)<n>In this stage, we move beyond binary labels and train the model generatively to output fine grained integer scores, which significantly enhances relevance discrimination.<n>We evaluate the ERank reranker on the BRIGHT, FollowIR, TREC DL, and BEIR benchmarks, demonstrating superior effectiveness and robustness compared to existing approaches.
arXiv Detail & Related papers (2025-08-30T14:56:53Z)
TFRank: Think-Free Reasoning Enables Practical Pointwise LLM Ranking [21.930228130429573]
Reasoning-intensive ranking models built on Large Language Models (LLMs) have made notable progress.<n>Existing approaches often rely on large-scale LLMs and explicit Chain-of-Thought (CoT) reasoning.<n>We propose textbfTFRank, an efficient pointwise reasoning ranker based on small-scale LLMs.
arXiv Detail & Related papers (2025-08-13T06:47:58Z)
LCES: Zero-shot Automated Essay Scoring via Pairwise Comparisons Using Large Language Models [0.46040036610482665]
We propose Comparative Essay Scoring (LCES), a method that formulates AES as a pairwise comparison task.<n>Specifically, we instruct LLMs to judge which of two essays is better, collect many such comparisons, and convert them into continuous scores.<n>Experiments using AES benchmark datasets show that LCES outperforms conventional zero-shot methods in accuracy while maintaining computational efficiency.
arXiv Detail & Related papers (2025-05-13T12:26:16Z)
Beyond Reproducibility: Advancing Zero-shot LLM Reranking Efficiency with Setwise Insertion [0.0]
This study presents a comprehensive and extension analysis of the Setwise prompting methodology for zero-shot ranking with Large Language Models (LLMs) We evaluate its effectiveness and efficiency compared to traditional Pointwise, Pairwise, and Listwise approaches in document ranking tasks. We introduce Setwise Insertion, a novel approach that leverages the initial document ranking as prior knowledge.
arXiv Detail & Related papers (2025-04-09T18:44:34Z)
Self-Calibrated Listwise Reranking with Large Language Models [137.6557607279876]
Large language models (LLMs) have been employed in reranking tasks through a sequence-to-sequence approach. This reranking paradigm requires a sliding window strategy to iteratively handle larger candidate sets. We propose a novel self-calibrated listwise reranking method, which aims to leverage LLMs to produce global relevance scores for ranking.
arXiv Detail & Related papers (2024-11-07T10:31:31Z)
Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment [81.84950252537618]
This paper reveals a unified game-theoretic connection between iterative BOND and self-play alignment. We establish a novel framework, WIN rate Dominance (WIND), with a series of efficient algorithms for regularized win rate dominance optimization.
arXiv Detail & Related papers (2024-10-28T04:47:39Z)
Efficient Pointwise-Pairwise Learning-to-Rank for News Recommendation [6.979979613916754]
News recommendation is a challenging task that involves personalization based on the interaction history and preferences of each user. Recent works have leveraged the power of pretrained language models (PLMs) to directly rank news items by using inference approaches that predominately fall into three categories: pointwise, pairwise, and listwise learning-to-rank. We propose a novel framework for PLM-based news recommendation that integrates both pointwise relevance prediction and pairwise comparisons in a scalable manner.
arXiv Detail & Related papers (2024-09-26T10:27:19Z)
On Speeding Up Language Model Evaluation [48.51924035873411]
Development of prompt-based methods with Large Language Models (LLMs) requires making numerous decisions. We propose a novel method to address this challenge. We show that it can identify the top-performing method using only 5-15% of the typically needed resources.
arXiv Detail & Related papers (2024-07-08T17:48:42Z)
Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models [17.420756201557957]
We propose PE-Rank, leveraging the single passage embedding as a good context compression for efficient listwise passage reranking. We introduce an inference method that dynamically constrains the decoding space to these special tokens, accelerating the decoding process. Results on multiple benchmarks demonstrate that PE-Rank significantly improves efficiency in both prefilling and decoding, while maintaining competitive ranking effectiveness.
arXiv Detail & Related papers (2024-06-21T03:33:51Z)
Instruction Distillation Makes Large Language Models Efficient Zero-shot Rankers [56.12593882838412]
We introduce a novel instruction distillation method to rank documents. We first rank documents using the effective pairwise approach with complex instructions, and then distill the teacher predictions to the pointwise approach with simpler instructions. Our approach surpasses the performance of existing supervised methods like monoT5 and is on par with the state-of-the-art zero-shot methods.
arXiv Detail & Related papers (2023-11-02T19:16:21Z)
LlamaRec: Two-Stage Recommendation using Large Language Models for Ranking [10.671747198171136]
We propose a two-stage framework using large language models for ranking-based recommendation (LlamaRec) In particular, we use small-scale sequential recommenders to retrieve candidates based on the user interaction history. LlamaRec consistently achieves datasets superior performance in both recommendation performance and efficiency.
arXiv Detail & Related papers (2023-10-25T06:23:48Z)
Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting [65.00288634420812]
Pairwise Ranking Prompting (PRP) is a technique to significantly reduce the burden on Large Language Models (LLMs) Our results are the first in the literature to achieve state-of-the-art ranking performance on standard benchmarks using moderate-sized open-sourced LLMs.
arXiv Detail & Related papers (2023-06-30T11:32:25Z)
LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond [135.8013388183257]
We propose a new protocol for inconsistency detection benchmark creation and implement it in a 10-domain benchmark called SummEdits. Most LLMs struggle on SummEdits, with performance close to random chance. The best-performing model, GPT-4, is still 8% below estimated human performance.
arXiv Detail & Related papers (2023-05-23T21:50:06Z)
Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem. Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem. We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.