Learning Neural Ranking Models Online from Implicit User Feedback
- URL: http://arxiv.org/abs/2201.06658v1
- Date: Mon, 17 Jan 2022 23:11:39 GMT
- Title: Learning Neural Ranking Models Online from Implicit User Feedback
- Authors: Yiling Jia, Hongning Wang
- Abstract summary: We propose to learn a neural ranking model from users' implicit feedback (e.g., clicks) collected on the fly.
We focus on RankNet and LambdaRank, due to their great empirical success and wide adoption in offline settings.
- Score: 40.40829575021796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing online learning to rank (OL2R) solutions are limited to linear
models, which cannot capture possibly non-linear relations between
queries and documents. In this work, to unleash the power of representation
learning in OL2R, we propose to directly learn a neural ranking model from
users' implicit feedback (e.g., clicks) collected on the fly. We focus on
RankNet and LambdaRank, due to their great empirical success and wide adoption
in offline settings, and control the notorious explore-exploit trade-off based
on the convergence analysis of neural networks via the neural tangent kernel.
Specifically, in each round of result serving, exploration is only performed on
document pairs where the predicted rank order between the two documents is
uncertain; otherwise, the ranker's predicted order will be followed in result
ranking. We prove that under standard assumptions our OL2R solution achieves a
gap-dependent upper regret bound of $O(\log^2(T))$, in which the regret is
defined on the total number of mis-ordered pairs over $T$ rounds. Comparisons
against an extensive set of state-of-the-art OL2R baselines on two public
learning to rank benchmark datasets demonstrate the effectiveness of the
proposed solution.
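To make the serving strategy concrete, below is a minimal Python sketch of the uncertainty-gated exploration the abstract describes. It is an illustration under simplifying assumptions, not the authors' implementation: `score` stands in for the neural ranker's relevance score, `cb` for the NTK-based confidence width of a pair, and only adjacent pairs are checked for uncertainty.

```python
import random

def serve_ranking(docs, score, cb):
    """One round of result serving: follow the ranker's predicted order
    for pairs whose relative order is certain, and explore (randomly
    flip) only the pairs whose predicted order is still uncertain."""
    ranked = sorted(docs, key=score, reverse=True)  # exploit: model's order
    for i in range(len(ranked) - 1):
        a, b = ranked[i], ranked[i + 1]
        # A pair is 'uncertain' when the score gap falls inside its
        # confidence width; only such pairs are explored.
        if abs(score(a) - score(b)) <= cb(a, b) and random.random() < 0.5:
            ranked[i], ranked[i + 1] = b, a
    return ranked

# Toy usage with hypothetical scores and a constant confidence width.
docs = ["d1", "d2", "d3"]
scores = {"d1": 0.90, "d2": 0.85, "d3": 0.20}
print(serve_ranking(docs, scores.get, lambda a, b: 0.10))
```

Clicks collected on the served list would then update the ranker (e.g., via a RankNet/LambdaRank-style pairwise loss), shrinking the confidence widths so that exploration dies out on pairs the model has learned to order.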
Related papers
- Generative Pre-trained Ranking Model with Over-parameterization at Web-Scale (Extended Abstract) [73.57710917145212]
Learning to rank is widely employed in web searches to prioritize pertinent webpages based on input queries.
We propose a Generative Semi-Supervised Pre-trained (GS2P) model to address these challenges.
We conduct extensive offline experiments on both a publicly available dataset and a real-world dataset collected from a large-scale search engine.
arXiv Detail & Related papers (2024-09-25T03:39:14Z)
- Online Bandit Learning with Offline Preference Data [15.799929216215672]
We propose a posterior sampling algorithm for online learning that can be warm-started with an offline dataset with noisy preference feedback.
We show that by modeling the 'competence' of the expert that generated it, we are able to use such a dataset most effectively.
arXiv Detail & Related papers (2024-06-13T20:25:52Z)
- Learning To Dive In Branch And Bound [95.13209326119153]
We propose L2Dive to learn specific diving heuristics with graph neural networks.
We train generative models to predict variable assignments and leverage the duality of linear programs to make diving decisions.
arXiv Detail & Related papers (2023-01-24T12:01:45Z)
- GNNRank: Learning Global Rankings from Pairwise Comparisons via Directed Graph Neural Networks [68.61934077627085]
We introduce GNNRank, a modeling framework compatible with any GNN capable of learning digraph embeddings.
We show that our methods attain competitive and often superior performance compared with existing approaches.
arXiv Detail & Related papers (2022-02-01T04:19:50Z)
- Calibrating Explore-Exploit Trade-off for Fair Online Learning to Rank [38.28889079095716]
Online learning to rank (OL2R) has attracted great research interest in recent years.
We propose a general framework to achieve fairness defined by group exposure in OL2R.
In particular, when the model is exploring a set of results for relevance feedback, we confine the exploration within a subset of random permutations.
arXiv Detail & Related papers (2021-11-01T07:22:05Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- PairRank: Online Pairwise Learning to Rank by Divide-and-Conquer [35.199462901346706]
We propose to estimate a pairwise learning to rank model online.
In each round, candidate documents are partitioned and ranked according to the model's confidence on the estimated pairwise rank order.
A regret bound defined directly on the number of mis-ordered pairs is proven, connecting the online solution's theoretical convergence with its expected ranking performance.
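A rough sketch of this divide-and-conquer step, under the simplifying assumption that blocks are cut only between adjacent documents (the paper forms blocks from all pairwise certainty relations); `score` and `cb` are hypothetical stand-ins for the learned pairwise model and its confidence bound:

```python
import random

def divide_and_conquer_rank(docs, score, cb):
    """Partition candidates into blocks of mutually uncertain rank order;
    keep the model's order across blocks, shuffle within each block."""
    if not docs:
        return []
    order = sorted(docs, key=score, reverse=True)
    blocks, current = [], [order[0]]
    for prev, nxt in zip(order, order[1:]):
        if score(prev) - score(nxt) > cb(prev, nxt):  # certain pair: cut a block
            blocks.append(current)
            current = [nxt]
        else:                                         # uncertain pair: same block
            current.append(nxt)
    blocks.append(current)
    for block in blocks:
        random.shuffle(block)  # exploration confined to uncertain blocks
    return [doc for block in blocks for doc in block]
```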
arXiv Detail & Related papers (2021-02-28T01:16:55Z)
- L2R2: Leveraging Ranking for Abductive Reasoning [65.40375542988416]
The abductive natural language inference task ($\alpha$NLI) is proposed to evaluate the abductive reasoning ability of a learning system.
A novel $L2R^2$ approach is proposed under the learning-to-rank framework.
Experiments on the ART dataset achieve state-of-the-art performance on the public leaderboard.
arXiv Detail & Related papers (2020-05-22T15:01:23Z)
- Unbiased Learning to Rank: Online or Offline? [28.431648823968278]
How to obtain an unbiased ranking model by learning to rank with biased user feedback is an important research question for IR.
Existing work on unbiased learning to rank can be broadly categorized into two groups -- the studies on unbiased learning algorithms with logged data, and the studies on unbiased parameter estimation with real-time user interactions.
This paper formalizes the task of unbiased learning to rank and shows that existing algorithms for offline unbiased learning and online learning to rank are just two sides of the same coin.
arXiv Detail & Related papers (2020-04-28T15:01:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.