Unsupervised Contrast-Consistent Ranking with Language Models
- URL: http://arxiv.org/abs/2309.06991v2
- Date: Sat, 3 Feb 2024 05:52:02 GMT
- Title: Unsupervised Contrast-Consistent Ranking with Language Models
- Authors: Niklas Stoehr, Pengxiang Cheng, Jing Wang, Daniel Preotiuc-Pietro,
Rajarshi Bhowmik
- Abstract summary: Language models contain ranking-based knowledge and are powerful solvers of in-context ranking tasks.
We compare pairwise, pointwise and listwise prompting techniques to elicit a language model's ranking knowledge.
We find that even with careful calibration and constrained decoding, prompting-based techniques may not always be self-consistent in the rankings they produce.
- Score: 24.696017700382665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models contain ranking-based knowledge and are powerful solvers of
in-context ranking tasks. For instance, they may have parametric knowledge
about the ordering of countries by size or may be able to rank product reviews
by sentiment. We compare pairwise, pointwise and listwise prompting techniques
to elicit a language model's ranking knowledge. However, we find that even with
careful calibration and constrained decoding, prompting-based techniques may
not always be self-consistent in the rankings they produce. This motivates us
to explore an alternative approach that is inspired by an unsupervised probing
method called Contrast-Consistent Search (CCS). The idea is to train a probe
guided by a logical constraint: a language model's representation of a
statement and its negation must be mapped to contrastive true-false poles
consistently across multiple statements. We hypothesize that similar
constraints apply to ranking tasks where all items are related via consistent,
pairwise or listwise comparisons. To this end, we extend the binary CCS method
to Contrast-Consistent Ranking (CCR) by adapting existing ranking methods such
as the Max-Margin Loss, Triplet Loss and an Ordinal Regression objective.
Across different models and datasets, our results confirm that CCR probing
performs better than, or at least on a par with, prompting.
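
The core idea can be illustrated with a short sketch. The snippet below is a minimal illustration rather than the authors' implementation: it assumes per-statement hidden states have already been extracted from a frozen language model, and the linear probe, the margin value, and the placeholder data loader are all illustrative assumptions loosely adapted from the paper's max-margin variant of CCR.

```python
import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    """Maps a frozen LM hidden state to a scalar ranking score."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)

    def forward(self, reps: torch.Tensor) -> torch.Tensor:
        # reps: (batch, hidden_dim) -> (batch,) scalar scores
        return self.scorer(reps).squeeze(-1)

def ccr_max_margin_loss(score_stmt: torch.Tensor,
                        score_negation: torch.Tensor,
                        margin: float = 1.0) -> torch.Tensor:
    # Contrast-consistency: a statement and its negation should sit on
    # mirrored poles, so their probe scores should sum to roughly zero.
    consistency = (score_stmt + score_negation).pow(2).mean()
    # Max-margin separation: the statement pole should exceed the negated
    # pole by at least `margin`, keeping the induced ordering decisive.
    separation = torch.relu(margin - (score_stmt - score_negation)).mean()
    return consistency + separation

hidden_dim = 4096                       # assumed LM hidden size
probe = LinearProbe(hidden_dim)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)

# `pair_loader` is a hypothetical placeholder for an iterable of
# (statement, negation) hidden-state batches from unlabeled ranking statements.
pair_loader: list[tuple[torch.Tensor, torch.Tensor]] = []

for reps_stmt, reps_neg in pair_loader:
    loss = ccr_max_margin_loss(probe(reps_stmt), probe(reps_neg))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The consistency term mirrors the CCS constraint that a statement and its negation map to opposite truth poles, while the margin term enforces a minimum separation between them so that the scores the probe assigns induce a consistent ordering across items.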
Related papers
- TSPRank: Bridging Pairwise and Listwise Methods with a Bilinear Travelling Salesman Model [19.7255072094322]
Travelling Salesman Problem Rank (TSPRank) is a hybrid pairwise-listwise ranking method.
TSPRank's robustness and superior performance across different domains highlight its potential as a versatile and effective LETOR solution.
arXiv Detail & Related papers (2024-11-18T21:10:14Z)
- Sifting through the Chaff: On Utilizing Execution Feedback for Ranking the Generated Code Candidates [46.74037090843497]
Large Language Models (LLMs) are transforming the way developers approach programming by automatically generating code based on natural language descriptions.
This paper puts forward RankEF, an innovative approach for code ranking that leverages execution feedback.
Experiments on three code generation benchmarks demonstrate that RankEF significantly outperforms the state-of-the-art CodeRanker.
arXiv Detail & Related papers (2024-08-26T01:48:57Z)
- LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking [17.96316956366718]
Ranking passages by prompting a large language model (LLM) can achieve promising performance in modern information retrieval (IR) systems.
We show that sorting-based methods require consistent comparisons to correctly sort the passages, a requirement that LLMs often violate.
We propose LLM-RankFusion, an LLM-based ranking framework that mitigates these inconsistencies and produces a robust ranking list.
arXiv Detail & Related papers (2024-05-31T23:29:42Z)
- Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models [63.714662435555674]
Large language models (LLMs) exhibit positional bias in how they use context.
We propose permutation self-consistency, a form of self-consistency over ranking list outputs of black-box LLMs.
Our approach improves scores from conventional inference by up to 7-18% for GPT-3.5 and 8-16% for LLaMA v2 (70B).
arXiv Detail & Related papers (2023-10-11T17:59:02Z)
- Replace Scoring with Arrangement: A Contextual Set-to-Arrangement Framework for Learning-to-Rank [40.81502990315285]
Learning-to-rank is a core technique in the top-N recommendation task, where an ideal ranker would be a mapping from an item set to an arrangement.
Most existing solutions fall in the paradigm of probabilistic ranking principle (PRP), i.e., first score each item in the candidate set and then perform a sort operation to generate the top ranking list.
We propose Set-To-Arrangement Ranking (STARank), a new framework that directly generates permutations of the candidate items without the need for individual scoring and sorting operations.
arXiv Detail & Related papers (2023-08-05T12:22:26Z)
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
An extensive set of experiments is conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
- Zero-Shot Listwise Document Reranking with a Large Language Model [58.64141622176841]
We propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data.
Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker.
arXiv Detail & Related papers (2023-05-03T14:45:34Z)
- Discovering Non-monotonic Autoregressive Orderings with Variational Inference [67.27561153666211]
We develop an unsupervised parallelizable learner that discovers high-quality generation orders purely from training data.
We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass.
Empirical results in language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.
arXiv Detail & Related papers (2021-10-27T16:08:09Z)
- PiRank: Learning To Rank via Differentiable Sorting [85.28916333414145]
We propose PiRank, a new class of differentiable surrogates for ranking.
We show that PiRank exactly recovers the desired metrics in the limit of zero temperature.
arXiv Detail & Related papers (2020-12-12T05:07:36Z)
- CycAs: Self-supervised Cycle Association for Learning Re-identifiable Descriptions [61.724894233252414]
This paper proposes a self-supervised learning method for the person re-identification (re-ID) problem.
Existing unsupervised methods usually rely on pseudo labels, such as those from video tracklets or clustering.
We introduce a different unsupervised method that allows us to learn pedestrian embeddings from raw videos, without resorting to pseudo labels.
arXiv Detail & Related papers (2020-07-15T09:52:35Z)