Scalar is Not Enough: Vectorization-based Unbiased Learning to Rank
- URL: http://arxiv.org/abs/2206.01702v1
- Date: Fri, 3 Jun 2022 17:23:25 GMT
- Title: Scalar is Not Enough: Vectorization-based Unbiased Learning to Rank
- Authors: Mouxiang Chen, Chenghao Liu, Zemin Liu, Jianling Sun
- Abstract summary: Unbiased learning to rank aims to train an unbiased ranking model from biased user click logs.
Most of the current ULTR methods are based on the examination hypothesis (EH), which assumes that the click probability can be factorized into two scalar functions.
We propose a vector-based EH and formulate the click probability as a dot product of two vector functions.
- Score: 29.934700345584726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unbiased learning to rank (ULTR) aims to train an unbiased ranking model from
biased user click logs. Most of the current ULTR methods are based on the
examination hypothesis (EH), which assumes that the click probability can be
factorized into two scalar functions, one related to ranking features and the
other related to bias factors. Unfortunately, the interactions among features,
bias factors and clicks are complicated in practice, and usually cannot be
factorized in this independent way. Fitting click data with EH can therefore lead to
model misspecification and introduce approximation error.
In this paper, we propose a vector-based EH and formulate the click
probability as a dot product of two vector functions. This solution is complete
due to its universality in fitting arbitrary click functions. Based on it, we
propose a novel model named Vectorization to adaptively learn the relevance
embeddings and sort documents by projecting embeddings onto a base vector.
Extensive experiments show that our method significantly outperforms the
state-of-the-art ULTR methods on complex real clicks as well as simple
simulated clicks.
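The difference between the scalar and vector forms of EH can be illustrated with a small numerical sketch. It is not the authors' Vectorization model: a truncated SVD stands in for training, the toy matrices `rel_true` and `obs_true` are made up, and the base vector is assumed to be the mean observation vector purely for illustration. Scalar EH corresponds to a rank-1 fit of the click matrix, while vector EH with dimension d corresponds to a rank-d fit.

```python
import numpy as np

def low_rank_fit(clicks, d):
    """Best rank-d fit clicks ~= rel @ obs.T via truncated SVD.

    This is a stand-in for model training, not the paper's method.
    """
    u, s, vt = np.linalg.svd(clicks, full_matrices=False)
    rel = u[:, :d] * s[:d]   # one d-dim relevance vector per document
    obs = vt[:d, :].T        # one d-dim observation vector per position
    return rel, obs

# Hypothetical ground truth: 3 documents, 3 positions, 2-dim interaction.
rel_true = np.array([[0.9, 0.1], [0.5, 0.6], [0.2, 0.3]])
obs_true = np.array([[1.0, 0.2], [0.6, 0.5], [0.3, 0.9]])
# clicks[i, k] = P(click | document i shown at position k)
clicks = rel_true @ obs_true.T

# Scalar EH: click = r(x) * o(p), i.e. a rank-1 factorization.
rel1, obs1 = low_rank_fit(clicks, 1)
scalar_err = np.linalg.norm(clicks - rel1 @ obs1.T)

# Vector EH: click = <r(x), o(p)>, here with 2-dim vectors.
rel2, obs2 = low_rank_fit(clicks, 2)
vector_err = np.linalg.norm(clicks - rel2 @ obs2.T)

# Rank documents by projecting relevance embeddings onto a base vector.
# Assumption: the base vector is the mean observation vector.
base = obs2.mean(axis=0)
scores = rel2 @ base
ranking = np.argsort(-scores)

print("scalar EH residual:", scalar_err)
print("vector EH residual:", vector_err)
print("ranking (best first):", ranking.tolist())
```

In this toy setting the scalar fit leaves a nonzero residual because the click matrix has rank 2, whereas the 2-dimensional vector fit reproduces it up to floating-point error. With the mean observation vector as base, each projection score equals the document's click probability averaged over positions.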
Related papers
- Contextual Dual Learning Algorithm with Listwise Distillation for Unbiased Learning to Rank [26.69630281310365]
Unbiased Learning to Rank (ULTR) aims to leverage biased implicit user feedback (e.g., click) to optimize an unbiased ranking model.
We propose a Contextual Dual Learning Algorithm with Listwise Distillation (CDLA-LD) to address both position bias and contextual bias.
arXiv Detail & Related papers (2024-08-19T09:13:52Z)
- Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study [61.64685376882383]
Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models.
This paper investigates the robustness of existing CLTR models in complex and diverse situations.
We find that the DLA models and IPS-DCM show better robustness under various simulation settings than IPS-PBM and PRS with offline propensity estimation.
arXiv Detail & Related papers (2024-04-04T10:54:38Z)
- Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective [61.4025671743675]
Off-policy learning to rank methods often make strong assumptions about how users generate the click data.
We show that offline reinforcement learning can adapt to various click models without complex debiasing techniques and prior knowledge of the model.
Results on various large-scale datasets demonstrate that the proposed method, CUOLR, consistently outperforms the state-of-the-art off-policy learning to rank algorithms.
arXiv Detail & Related papers (2023-06-13T03:46:22Z)
- FE-TCM: Filter-Enhanced Transformer Click Model for Web Search [10.91456636784484]
We use a Transformer as the backbone network for feature extraction, add a filter layer, and propose a new Filter-Enhanced Transformer Click Model (FE-TCM) for web search.
FE-TCM outperforms existing click models on click prediction.
arXiv Detail & Related papers (2023-01-19T02:51:47Z)
- Meta-Wrapper: Differentiable Wrapping Operator for User Interest Selection in CTR Prediction [97.99938802797377]
Click-through rate (CTR) prediction, whose goal is to predict the probability that a user clicks on an item, has become increasingly significant in recommender systems.
Recent deep learning models that automatically extract user interest from user behaviors have achieved great success.
We propose a novel approach under the framework of the wrapper method, which is named Meta-Wrapper.
arXiv Detail & Related papers (2022-06-28T03:28:15Z)
- Low-variance estimation in the Plackett-Luce model via quasi-Monte Carlo sampling [58.14878401145309]
We develop a novel approach to producing more sample-efficient estimators of expectations in the PL model.
We illustrate our findings both theoretically and empirically using real-world recommendation data from Amazon Music and the Yahoo learning-to-rank challenge.
arXiv Detail & Related papers (2022-05-12T11:15:47Z)
- Active Learning++: Incorporating Annotator's Rationale using Local Model Explanation [84.10721065676913]
Annotators can provide their rationale for choosing a label by ranking input features based on their importance for a given query.
Instead of weighing all committee models equally to select the next instance, we assign higher weight to the committee model with higher agreement with the annotator's ranking.
This approach is applicable to any kind of ML model, using model-agnostic techniques such as LIME to generate local explanations.
arXiv Detail & Related papers (2020-09-06T08:07:33Z)
- Analysis of Multivariate Scoring Functions for Automatic Unbiased Learning to Rank [14.827143632277274]
AutoULTR algorithms that jointly learn user bias models (i.e., propensity models) with unbiased rankers have received a lot of attention due to their superior performance and low deployment cost in practice.
Recent advances in context-aware learning-to-rank models have shown that multivariate scoring functions, which read multiple documents together and predict their ranking scores jointly, are more powerful than uni-variate ranking functions in ranking tasks with human-annotated relevance labels.
Our experiments with synthetic clicks on two large-scale benchmark datasets show that AutoULTR models with permutation-invariant multivariate scoring functions significantly outperform
arXiv Detail & Related papers (2020-08-20T16:31:59Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)