Learning Neural Ranking Models Online from Implicit User Feedback
- URL: http://arxiv.org/abs/2201.06658v1
- Date: Mon, 17 Jan 2022 23:11:39 GMT
- Title: Learning Neural Ranking Models Online from Implicit User Feedback
- Authors: Yiling Jia, Hongning Wang
- Abstract summary: We propose to learn a neural ranking model from users' implicit feedback (e.g., clicks) collected on the fly.
We focus on RankNet and LambdaRank, due to their great empirical success and wide adoption in offline settings.
- Score: 40.40829575021796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing online learning to rank (OL2R) solutions are limited to linear
models, which cannot capture possibly non-linear relations between
queries and documents. In this work, to unleash the power of representation
learning in OL2R, we propose to directly learn a neural ranking model from
users' implicit feedback (e.g., clicks) collected on the fly. We focus on
RankNet and LambdaRank, due to their great empirical success and wide adoption
in offline settings, and control the notorious explore-exploit trade-off based
on the convergence analysis of neural networks via the neural tangent kernel.
Specifically, in each round of result serving, exploration is only performed on
document pairs where the predicted rank order between the two documents is
uncertain; otherwise, the ranker's predicted order will be followed in result
ranking. We prove that under standard assumptions our OL2R solution achieves a
gap-dependent upper regret bound of $O(\log^2(T))$, in which the regret is
defined on the total number of mis-ordered pairs over $T$ rounds. Comparisons
against an extensive set of state-of-the-art OL2R baselines on two public
learning to rank benchmark datasets demonstrate the effectiveness of the
proposed solution.
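To make the serving strategy concrete, below is a minimal Python sketch of the uncertainty-gated exploration the abstract describes. It is an illustration under simplifying assumptions, not the authors' implementation: `score` stands in for the neural ranker's relevance score, `cb` for the NTK-based confidence width of a pair, and only adjacent pairs are checked for uncertainty.

```python
import random

def serve_ranking(docs, score, cb):
    """One round of result serving: follow the ranker's predicted order
    for pairs whose relative order is certain, and explore (randomly
    flip) only the pairs whose predicted order is still uncertain."""
    ranked = sorted(docs, key=score, reverse=True)  # exploit: model's order
    for i in range(len(ranked) - 1):
        a, b = ranked[i], ranked[i + 1]
        # A pair is 'uncertain' when the score gap falls inside its
        # confidence width; only such pairs are explored.
        if abs(score(a) - score(b)) <= cb(a, b) and random.random() < 0.5:
            ranked[i], ranked[i + 1] = b, a
    return ranked

# Toy usage with hypothetical scores and a constant confidence width.
docs = ["d1", "d2", "d3"]
scores = {"d1": 0.90, "d2": 0.85, "d3": 0.20}
print(serve_ranking(docs, scores.get, lambda a, b: 0.10))
```

Clicks collected on the served list would then update the ranker (e.g., via a RankNet/LambdaRank-style pairwise loss), shrinking the confidence widths so that exploration dies out on pairs the model has learned to order.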
Related papers
- Generative Pre-trained Ranking Model with Over-parameterization at Web-Scale (Extended Abstract) [73.57710917145212]
Learning to rank is widely employed in web searches to prioritize pertinent webpages based on input queries.
We propose a Generative Semi-Supervised Pre-trained (GS2P) model to address these challenges.
We conduct extensive offline experiments on both a publicly available dataset and a real-world dataset collected from a large-scale search engine.
arXiv Detail & Related papers (2024-09-25T03:39:14Z)
- Online Bandit Learning with Offline Preference Data [15.799929216215672]
We propose a posterior sampling algorithm for online learning that can be warm-started with an offline dataset with noisy preference feedback.
We show that by modeling the 'competence' of the expert that generated it, we are able to use such a dataset most effectively.
arXiv Detail & Related papers (2024-06-13T20:25:52Z)
- Learning To Dive In Branch And Bound [95.13209326119153]
We propose L2Dive to learn specific diving heuristics with graph neural networks.
We train generative models to predict variable assignments and leverage the duality of linear programs to make diving decisions.
arXiv Detail & Related papers (2023-01-24T12:01:45Z)
- GNNRank: Learning Global Rankings from Pairwise Comparisons via Directed Graph Neural Networks [68.61934077627085]
We introduce GNNRank, a modeling framework compatible with any GNN capable of learning digraph embeddings.
We show that our methods attain competitive and often superior performance compared with existing approaches.
arXiv Detail & Related papers (2022-02-01T04:19:50Z)
- Calibrating Explore-Exploit Trade-off for Fair Online Learning to Rank [38.28889079095716]
Online learning to rank (OL2R) has attracted great research interest in recent years.
We propose a general framework to achieve fairness defined by group exposure in OL2R.
In particular, when the model is exploring a set of results for relevance feedback, we confine the exploration within a subset of random permutations.
arXiv Detail & Related papers (2021-11-01T07:22:05Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- PairRank: Online Pairwise Learning to Rank by Divide-and-Conquer [35.199462901346706]
We propose to estimate a pairwise learning to rank model online.
In each round, candidate documents are partitioned and ranked according to the model's confidence on the estimated pairwise rank order.
A regret bound defined directly on the number of mis-ordered pairs is proven, connecting the online solution's theoretical convergence with its expected ranking performance.
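A rough sketch of this divide-and-conquer step, under the simplifying assumption that blocks are cut only between adjacent documents (the paper forms blocks from all pairwise certainty relations); `score` and `cb` are hypothetical stand-ins for the learned pairwise model and its confidence bound:

```python
import random

def divide_and_conquer_rank(docs, score, cb):
    """Partition candidates into blocks of mutually uncertain rank order;
    keep the model's order across blocks, shuffle within each block."""
    if not docs:
        return []
    order = sorted(docs, key=score, reverse=True)
    blocks, current = [], [order[0]]
    for prev, nxt in zip(order, order[1:]):
        if score(prev) - score(nxt) > cb(prev, nxt):  # certain pair: cut a block
            blocks.append(current)
            current = [nxt]
        else:                                         # uncertain pair: same block
            current.append(nxt)
    blocks.append(current)
    for block in blocks:
        random.shuffle(block)  # exploration confined to uncertain blocks
    return [doc for block in blocks for doc in block]
```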
arXiv Detail & Related papers (2021-02-28T01:16:55Z)
- L2R2: Leveraging Ranking for Abductive Reasoning [65.40375542988416]
The abductive natural language inference task ($\alpha$NLI) is proposed to evaluate the abductive reasoning ability of a learning system.
A novel $L2R^2$ approach is proposed under the learning-to-rank framework.
Experiments on the ART dataset achieve state-of-the-art performance on the public leaderboard.
arXiv Detail & Related papers (2020-05-22T15:01:23Z)
- Unbiased Learning to Rank: Online or Offline? [28.431648823968278]
How to obtain an unbiased ranking model by learning to rank with biased user feedback is an important research question for IR.
Existing work on unbiased learning to rank can be broadly categorized into two groups -- the studies on unbiased learning algorithms with logged data, and the studies on unbiased parameter estimation with real-time user interactions.
This paper formalizes the task of unbiased learning to rank and shows that existing algorithms for offline unbiased learning and online learning to rank are just two sides of the same coin.
arXiv Detail & Related papers (2020-04-28T15:01:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.