Related papers: Precise Zero-Shot Pointwise Ranking with LLMs through Post-Aggregated Global Context Information

URL: http://arxiv.org/abs/2506.10859v1
Date: Thu, 12 Jun 2025 16:20:40 GMT
Title: Precise Zero-Shot Pointwise Ranking with LLMs through Post-Aggregated Global Context Information
Authors: Kehan Long, Shasha Li, Chen Xu, Jintao Tang, Ting Wang
Abstract summary: We propose a novel Global-Consistent Comparative Pointwise Ranking (GCCP) strategy. This strategy incorporates global reference comparisons between each candidate and an anchor document to generate contrastive relevance scores. Our approach significantly outperforms previous pointwise methods while maintaining comparable efficiency.
Score: 14.302737287907274
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advancements have successfully harnessed the power of Large Language Models (LLMs) for zero-shot document ranking, exploring a variety of prompting strategies. Comparative approaches like pairwise and listwise achieve high effectiveness but are computationally intensive and thus less practical for larger-scale applications. Scoring-based pointwise approaches exhibit superior efficiency by independently and simultaneously generating the relevance scores for each candidate document. However, this independence ignores critical comparative insights between documents, resulting in inconsistent scoring and suboptimal performance. In this paper, we aim to improve the effectiveness of pointwise methods while preserving their efficiency through two key innovations: (1) We propose a novel Global-Consistent Comparative Pointwise Ranking (GCCP) strategy that incorporates global reference comparisons between each candidate and an anchor document to generate contrastive relevance scores. We strategically design the anchor document as a query-focused summary of pseudo-relevant candidates, which serves as an effective reference point by capturing the global context for document comparison. (2) These contrastive relevance scores can be efficiently Post-Aggregated with existing pointwise methods, seamlessly integrating essential Global Context information in a training-free manner (PAGC). Extensive experiments on the TREC DL and BEIR benchmarks demonstrate that our approach significantly outperforms previous pointwise methods while maintaining comparable efficiency. Our method also achieves competitive performance against comparative methods that require substantially more computational resources. Additional analyses further validate the efficacy of our anchor construction strategy.
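
The two steps described in the abstract (scoring each candidate against a shared anchor, then post-aggregating with an ordinary pointwise score) can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not give the aggregation formula or say how pseudo-relevant candidates are selected, so the min-max normalization, the weighted sum with `alpha`, and the choice of the `top_k` candidates by pointwise score are all assumptions. The callables `pointwise_score`, `build_anchor`, and `compare_to_anchor` are hypothetical placeholders for LLM calls.

```python
from typing import Callable, List, Sequence


def minmax(xs: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1]; constant inputs map to 0.5."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0.5 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]


def rank_with_anchor(
    query: str,
    docs: Sequence[str],
    pointwise_score: Callable[[str, str], float],         # hypothetical: LLM relevance score
    build_anchor: Callable[[str, List[str]], str],        # hypothetical: query-focused summary
    compare_to_anchor: Callable[[str, str, str], float],  # hypothetical: contrastive score
    top_k: int = 3,
    alpha: float = 0.5,
) -> List[int]:
    """Return document indices ordered from most to least relevant."""
    # Step 0: ordinary pointwise scores, one independent call per document.
    base = [pointwise_score(query, d) for d in docs]

    # Assumption: the pseudo-relevant set is the top_k documents by pointwise score.
    order = sorted(range(len(docs)), key=lambda i: base[i], reverse=True)
    anchor = build_anchor(query, [docs[i] for i in order[:top_k]])

    # Step 1: score every candidate against the same anchor, so all
    # documents are compared to one shared reference point.
    contrast = [compare_to_anchor(query, d, anchor) for d in docs]

    # Step 2: post-aggregation. Assumption: a weighted sum of
    # min-max-normalized scores; no training is involved.
    b, c = minmax(base), minmax(contrast)
    final = [(1 - alpha) * bi + alpha * ci for bi, ci in zip(b, c)]
    return sorted(range(len(docs)), key=lambda i: final[i], reverse=True)
```

Both scoring passes remain one call per document, which is consistent with the abstract's claim that the method keeps pointwise-level efficiency; the only extra cost in this sketch is the single anchor-construction call.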

Related papers

Leveraging Reference Documents for Zero-Shot Ranking via Large Language Models [16.721450557704767]
RefRank is a simple and effective comparative ranking method based on a fixed reference document. We show that RefRank significantly outperforms Pointwise baselines and could achieve performance at least on par with Pairwise approaches.
arXiv Detail & Related papers (2025-06-13T04:03:09Z)
Self-Calibrated Listwise Reranking with Large Language Models [137.6557607279876]
Large language models (LLMs) have been employed in reranking tasks through a sequence-to-sequence approach. This reranking paradigm requires a sliding window strategy to iteratively handle larger candidate sets. We propose a novel self-calibrated listwise reranking method, which aims to leverage LLMs to produce global relevance scores for ranking.
arXiv Detail & Related papers (2024-11-07T10:31:31Z)
Efficient Pointwise-Pairwise Learning-to-Rank for News Recommendation [6.979979613916754]
News recommendation is a challenging task that involves personalization based on the interaction history and preferences of each user. Recent works have leveraged the power of pretrained language models (PLMs) to directly rank news items by using inference approaches that predominately fall into three categories: pointwise, pairwise, and listwise learning-to-rank. We propose a novel framework for PLM-based news recommendation that integrates both pointwise relevance prediction and pairwise comparisons in a scalable manner.
arXiv Detail & Related papers (2024-09-26T10:27:19Z)
A Weighted K-Center Algorithm for Data Subset Selection [70.49696246526199]
Subset selection is a fundamental problem that can play a key role in identifying smaller portions of the training data. We develop a novel factor 3-approximation algorithm to compute subsets based on the weighted sum of both k-center and uncertainty sampling objective functions.
arXiv Detail & Related papers (2023-12-17T04:41:07Z)
A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models [35.17291316942284]
We propose a novel zero-shot document ranking approach based on Large Language Models (LLMs): the Setwise prompting approach. Our approach complements existing prompting approaches for LLM-based zero-shot ranking: Pointwise, Pairwise, and Listwise.
arXiv Detail & Related papers (2023-10-14T05:20:02Z)
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models [55.60306377044225]
Large language models (LLMs) have enabled impressive zero-shot capabilities across various natural language tasks. This paper explores two options for exploiting the emergent abilities of LLMs for zero-shot NLG assessment. For moderate-sized open-source LLMs, such as FlanT5 and Llama2-chat, comparative assessment is superior to prompt scoring.
arXiv Detail & Related papers (2023-07-15T22:02:12Z)
Evaluating and Improving Factuality in Multimodal Abstractive Summarization [91.46015013816083]
We propose CLIPBERTScore to leverage the robustness and strong factuality detection performance between image-summary and document-summary. We show that this simple combination of two metrics in the zero-shot achieves higher correlations than existing factuality metrics for document summarization. Our analysis demonstrates the robustness and high correlation of CLIPBERTScore and its components on four factuality metric-evaluation benchmarks.
arXiv Detail & Related papers (2022-11-04T16:50:40Z)
Comparing Methods for Extractive Summarization of Call Centre Dialogue [77.34726150561087]
We experimentally compare several such methods by using them to produce summaries of calls, and evaluating these summaries objectively. We found that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum received comparatively lower scores in both subjective and objective evaluations.
arXiv Detail & Related papers (2022-09-06T13:16:02Z)
Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects. Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency. We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
Strategy for Boosting Pair Comparison and Improving Quality Assessment Accuracy [29.849156371902943]
Pair Comparison (PC) is of significant advantage over Absolute Category Rating (ACR) in terms of discriminability. In this study, we employ a generic model to bridge the pair comparison data and ACR data, where the variance term could be recovered and the obtained information is more complete. In such a way, the proposed methodology could achieve the same accuracy of pair comparison but with the complexity as low as ACR.
arXiv Detail & Related papers (2020-10-01T13:05:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.