Adaptively Learning to Select-Rank in Online Platforms
- URL: http://arxiv.org/abs/2406.05017v1
- Date: Fri, 7 Jun 2024 15:33:48 GMT
- Title: Adaptively Learning to Select-Rank in Online Platforms
- Authors: Jingyuan Wang, Perry Dong, Ying Jin, Ruohan Zhan, Zhengyuan Zhou
- Abstract summary: This research addresses the challenge of adaptively ranking items from a candidate pool for heterogeneous users.
We develop a user response model that considers diverse user preferences and the varying effects of item positions.
Experiments conducted on both simulated and real-world datasets demonstrate that our algorithm outperforms the baseline.
- Score: 34.258659206323664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ranking algorithms are fundamental to various online platforms, ranging from e-commerce sites to content streaming services. Our research addresses the challenge of adaptively ranking items from a candidate pool for heterogeneous users, a key component in personalizing user experience. We develop a user response model that considers diverse user preferences and the varying effects of item positions, aiming to optimize overall user satisfaction with the ranked list. We frame this problem within a contextual bandits framework, with each ranked list as an action. Our approach incorporates an upper confidence bound to adjust predicted user satisfaction scores and selects the ranking action that maximizes these adjusted scores, efficiently solved via maximum weight imperfect matching. We demonstrate that our algorithm achieves a cumulative regret bound of $O(d\sqrt{NKT})$ for ranking $K$ out of $N$ items in a $d$-dimensional context space over $T$ rounds, under the assumption that user responses follow a generalized linear model. This regret bound alleviates dependence on the ambient action space, whose cardinality grows exponentially with $N$ and $K$ (thus rendering direct application of existing adaptive learning algorithms, such as UCB or Thompson sampling, infeasible). Experiments conducted on both simulated and real-world datasets demonstrate that our algorithm outperforms the baseline.
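To make the select-rank step concrete, below is a minimal sketch (not the authors' implementation) of choosing and ordering $K$ of $N$ items by maximizing UCB-adjusted scores through a rectangular assignment. The logistic link, the GLM-UCB-style confidence width, the multiplicative position weights, and all function and variable names (select_rank_ucb, x_user, V_inv, alpha) are assumptions made for illustration; SciPy's linear_sum_assignment on a rectangular score matrix stands in for the maximum weight imperfect matching mentioned in the abstract.

import numpy as np
from scipy.optimize import linear_sum_assignment

def select_rank_ucb(x_user, items, positions, theta_hat, V_inv, alpha=1.0):
    """Rank K of N items for one user by maximum-weight rectangular matching
    on UCB-adjusted satisfaction scores (illustrative sketch, not the paper's code).

    x_user    : (d_u,) user context vector
    items     : (N, d_i) item feature matrix, with N >= K
    positions : (K,) assumed multiplicative position-effect weights
    theta_hat : (d,) current GLM parameter estimate, d = d_u + d_i
    V_inv     : (d, d) inverse design (Gram) matrix used for the confidence width
    alpha     : exploration coefficient
    """
    N, K = len(items), len(positions)
    scores = np.zeros((N, K))
    for i in range(N):
        phi = np.concatenate([x_user, items[i]])       # joint user-item context
        mean = 1.0 / (1.0 + np.exp(-phi @ theta_hat))  # assumed logistic link
        width = alpha * np.sqrt(phi @ V_inv @ phi)     # GLM-UCB-style bonus
        scores[i, :] = positions * (mean + width)      # position-weighted UCB score
    # Rectangular ("imperfect") matching: each of the K positions gets a distinct item.
    rows, cols = linear_sum_assignment(scores, maximize=True)
    ranking = [None] * K
    for item, pos in zip(rows, cols):
        ranking[pos] = int(item)
    return ranking  # ranking[pos] = index of the item placed at position pos

After the user's response is observed, theta_hat and V_inv would be updated as in standard GLM bandits before the next round; that update loop is omitted here.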
Related papers
- Misalignment, Learning, and Ranking: Harnessing Users Limited Attention [16.74322664734553]
We study the design of online algorithms that obtain vanishing regret against optimal benchmarks.
We show how standard algorithms for adversarial online linear optimization can be used to obtain a payoff-time algorithm with $O(\sqrt{T})$ regret.
arXiv Detail & Related papers (2024-02-21T18:52:20Z) - Adaptive Neural Ranking Framework: Toward Maximized Business Goal for
Cascade Ranking Systems [33.46891569350896]
Cascade ranking is widely used for large-scale top-k selection problems in online advertising and recommendation systems.
Previous works on learning-to-rank usually focus on letting the model learn the complete order or top-k order.
We name this method the Adaptive Neural Ranking Framework (ARF).
arXiv Detail & Related papers (2023-10-16T14:43:02Z) - Bipartite Ranking Fairness through a Model Agnostic Ordering Adjustment [54.179859639868646]
We propose a model agnostic post-processing framework xOrder for achieving fairness in bipartite ranking.
xOrder is compatible with various classification models and ranking fairness metrics, including supervised and unsupervised fairness metrics.
We evaluate our proposed algorithm on four benchmark data sets and two real-world patient electronic health record repositories.
arXiv Detail & Related papers (2023-07-27T07:42:44Z) - Fast online ranking with fairness of exposure [29.134493256287072]
We show that our algorithm is computationally fast (it generates rankings on the fly, with computation cost dominated by the sort operation), memory efficient, and backed by strong theoretical guarantees.
Compared to baseline policies that only maximize user-side performance, our algorithm incorporates complex fairness-of-exposure criteria into the recommendations with negligible computational overhead.
arXiv Detail & Related papers (2022-09-13T12:35:36Z) - Efficient and Accurate Top-$K$ Recovery from Choice Data [1.14219428942199]
In some applications such as recommendation systems, the statistician is primarily interested in recovering the set of the top ranked items from a large pool of items.
We propose the choice-based Borda count algorithm as a fast and accurate ranking algorithm for top-$K$ recovery.
We show that the choice-based Borda count algorithm has optimal sample complexity for top-$K$ recovery under a broad class of random utility models.
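As a rough illustration of the entry above (not the paper's exact estimator), a choice-based Borda-style score can be taken as the empirical frequency with which an item is chosen among the choice sets that offer it; the function and variable names below are assumptions for illustration.

import numpy as np

def borda_topk_from_choices(choice_sets, winners, n_items, k):
    """Illustrative choice-based Borda-style top-K recovery:
    score each item by how often it is chosen when offered,
    then return the K highest-scoring items."""
    wins = np.zeros(n_items)
    offered_counts = np.zeros(n_items)
    for offered, chosen in zip(choice_sets, winners):
        for item in offered:
            offered_counts[item] += 1
        wins[chosen] += 1
    # Empirical choice frequency; items never offered keep score 0.
    scores = np.divide(wins, offered_counts,
                       out=np.zeros(n_items), where=offered_counts > 0)
    return np.argsort(-scores)[:k]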
arXiv Detail & Related papers (2022-06-23T22:05:08Z) - Linear Speedup in Personalized Collaborative Learning [69.45124829480106]
Personalization in federated learning can improve the accuracy of a model for a user by trading off the model's bias against its variance.
We formalize the personalized collaborative learning problem as the optimization of a user's objective.
We explore conditions under which one can optimally trade off bias for a reduction in variance.
arXiv Detail & Related papers (2021-11-10T22:12:52Z) - Adaptive Sampling for Heterogeneous Rank Aggregation from Noisy Pairwise
Comparisons [85.5955376526419]
In rank aggregation problems, users exhibit various accuracy levels when comparing pairs of items.
We propose an elimination-based active sampling strategy, which estimates the ranking of items via noisy pairwise comparisons.
We prove that our algorithm can return the true ranking of items with high probability.
arXiv Detail & Related papers (2021-10-08T13:51:55Z) - Learning to Rank under Multinomial Logit Choice [6.929312022493406]
Learning the optimal ordering of content is an important challenge in website design.
We present theoretical analysis leading to an $\Omega(\sqrt{JT})$ lower bound for the problem, and an $\tilde{O}(\sqrt{JT})$ upper bound on the regret of the UCB algorithm.
arXiv Detail & Related papers (2020-09-07T16:15:12Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking
Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns about whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z) - SetRank: A Setwise Bayesian Approach for Collaborative Ranking from
Implicit Feedback [50.13745601531148]
We propose a novel setwise Bayesian approach for collaborative ranking, namely SetRank, to accommodate the characteristics of implicit feedback in recommender systems.
Specifically, SetRank aims at maximizing the posterior probability of novel setwise preference comparisons.
We also present a theoretical analysis of SetRank showing that the excess risk bound can be proportional to $\sqrt{M/N}$.
arXiv Detail & Related papers (2020-02-23T06:40:48Z) - Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.