RankFormer: Listwise Learning-to-Rank Using Listwide Labels
- URL: http://arxiv.org/abs/2306.05808v1
- Date: Fri, 9 Jun 2023 10:47:06 GMT
- Title: RankFormer: Listwise Learning-to-Rank Using Listwide Labels
- Authors: Maarten Buyl, Paul Missault and Pierre-Antoine Sondag
- Abstract summary: We propose the RankFormer as an architecture that can jointly optimize a novel listwide assessment objective and a traditional listwise objective.
We conduct experiments in e-commerce on Amazon Search data and find the RankFormer to be superior to all baselines offline.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Web applications where users are presented with a limited selection of items
have long employed ranking models to put the most relevant results first. Any
feedback received from users is typically assumed to reflect a relative
judgement on the utility of items, e.g. a user clicking on an item only implies
it is better than items not clicked in the same ranked list. Hence, the
objectives optimized in Learning-to-Rank (LTR) tend to be pairwise or listwise.
Yet, by only viewing feedback as relative, we neglect the user's absolute
feedback on the list's overall quality, e.g. when no items in the selection are
clicked. We thus reconsider the standard LTR paradigm and argue for the benefits of
learning from this listwide signal. To this end, we propose the RankFormer as
an architecture that, with a Transformer at its core, can jointly optimize a
novel listwide assessment objective and a traditional listwise LTR objective.
We simulate implicit feedback on public datasets and observe that the
RankFormer succeeds in benefiting from listwide signals. Additionally, we
conduct experiments in e-commerce on Amazon Search data and find the RankFormer
to be superior to all baselines offline. An online experiment shows that
knowledge distillation can be used to find immediate practical use for the
RankFormer.
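The core idea is easy to prototype: run a Transformer over the items of one list, read a per-item score off each token for the listwise ranking loss, and read a single pooled score off the whole list for the listwide quality loss. The sketch below is a minimal PyTorch illustration written from the abstract alone; the layer sizes, the mean-pooling, and the choice of a ListNet-style softmax loss plus an MSE listwide loss are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RankFormerSketch(nn.Module):
    """Illustrative sketch, not the authors' code: a Transformer trained
    jointly on a listwise ranking loss and a listwide quality loss."""

    def __init__(self, feat_dim: int, d_model: int = 64, nhead: int = 4, nlayers: int = 2):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=nlayers)
        self.item_head = nn.Linear(d_model, 1)  # listwise head: one score per item
        self.list_head = nn.Linear(d_model, 1)  # listwide head: one score per list

    def forward(self, x):                        # x: (batch, list_len, feat_dim)
        h = self.encoder(self.embed(x))          # each item attends to the whole list
        item_scores = self.item_head(h).squeeze(-1)               # (batch, list_len)
        list_quality = self.list_head(h.mean(dim=1)).squeeze(-1)  # (batch,)
        return item_scores, list_quality


def joint_loss(item_scores, list_quality, relevance, list_label, alpha=0.5):
    """Assumed combination: ListNet-style softmax cross-entropy for the
    listwise term, MSE for the listwide term; alpha trades them off."""
    target = F.softmax(relevance.float(), dim=-1)
    listwise = -(target * F.log_softmax(item_scores, dim=-1)).sum(-1).mean()
    listwide = F.mse_loss(list_quality, list_label.float())
    return listwise + alpha * listwide


# Toy usage: 8 lists of 10 items with 16 features each; list_label could be,
# e.g., 0 for lists that received no clicks at all and 1 otherwise.
model = RankFormerSketch(feat_dim=16)
x = torch.randn(8, 10, 16)
relevance = torch.randint(0, 3, (8, 10))
list_label = torch.randint(0, 2, (8,))
scores, quality = model(x)
loss = joint_loss(scores, quality, relevance, list_label)
loss.backward()
```

At ranking time only the per-item scores are needed to order a list; the listwide head mainly shapes training, which fits the abstract's note that knowledge distillation into a simpler serving model is what made the RankFormer immediately practical online.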
Related papers
- Self-Calibrated Listwise Reranking with Large Language Models (arXiv, 2024-11-07)
Large language models (LLMs) have been employed in reranking tasks through a sequence-to-sequence approach.
This reranking paradigm requires a sliding window strategy to iteratively handle larger candidate sets.
We propose a novel self-calibrated listwise reranking method, which aims to leverage LLMs to produce global relevance scores for ranking.
- LiPO: Listwise Preference Optimization through Learning-to-Rank (arXiv, 2024-02-02)
A policy can learn more effectively from a ranked list of plausible responses given the prompt.
We show that LiPO-$\lambda$ can outperform DPO variants and SLiC by a clear margin on several preference alignment tasks.
- Replace Scoring with Arrangement: A Contextual Set-to-Arrangement Framework for Learning-to-Rank (arXiv, 2023-08-05)
Learning-to-rank is a core technique in the top-N recommendation task, where an ideal ranker would be a mapping from an item set to an arrangement.
Most existing solutions fall under the paradigm of the probabilistic ranking principle (PRP): first score each item in the candidate set, then sort to generate the top ranking list.
We propose Set-To-Arrangement Ranking (STARank), a new framework that directly generates permutations of the candidate items without the need for individual scoring and sorting operations.
- PEAR: Personalized Re-ranking with Contextualized Transformer for Recommendation (arXiv, 2022-03-23)
We present a personalized re-ranking model (dubbed PEAR) based on a contextualized transformer.
PEAR makes several major improvements over existing methods.
We also augment the training of PEAR with a list-level classification task to assess users' satisfaction with the whole ranking list.
- Online Learning of Optimally Diverse Rankings (arXiv, 2021-09-13)
We propose an algorithm that efficiently learns the optimal list based on users' feedback only.
We show that after $T$ queries, the regret of LDR scales as $O((N-L)\log(T))$, where $N$ is the number of all items.
- Set2setRank: Collaborative Set to Set Ranking for Implicit Feedback based Recommendation (arXiv, 2021-05-16)
In this paper, we explore the unique characteristics of implicit feedback and propose the Set2setRank framework for recommendation.
Our proposed framework is model-agnostic and can be easily applied to most recommendation prediction approaches.
- Controlling Fairness and Bias in Dynamic Learning-to-Rank (arXiv, 2020-05-29)
We propose a learning algorithm that ensures notions of amortized group fairness, while simultaneously learning the ranking function from implicit feedback data.
The algorithm takes the form of a controller that integrates unbiased estimators for both fairness and utility.
In addition to its rigorous theoretical foundation and convergence guarantees, we find empirically that the algorithm is highly practical and robust.
- SetRank: A Setwise Bayesian Approach for Collaborative Ranking from Implicit Feedback (arXiv, 2020-02-23)
We propose a novel setwise Bayesian approach for collaborative ranking, namely SetRank, to accommodate the characteristics of implicit feedback in recommender systems.
Specifically, SetRank aims at maximizing the posterior probability of novel setwise preference comparisons.
We also present a theoretical analysis of SetRank, showing that the bound on the excess risk can be proportional to $\sqrt{M/N}$.