Related papers: From Generation to Consumption: Personalized List Value Estimation for Re-ranking

URL: http://arxiv.org/abs/2508.02242v2
Date: Thu, 07 Aug 2025 08:51:15 GMT
Title: From Generation to Consumption: Personalized List Value Estimation for Re-ranking
Authors: Kaike Zhang, Xiaobei Wang, Xiaoyu Yang, Shuchang Liu, Hailan Yang, Xiang Li, Fei Sun, Qi Cao,
Abstract summary: We propose CAVE, a personalized Consumption-Aware list Value Estimation framework. CAVE formulates the list value as the expectation over sub-list values, weighted by user-specific exit probabilities at each position. By jointly modeling sub-list values and user exit behavior, CAVE yields a more faithful estimate of actual list consumption value.
Score: 11.827600847399973
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Re-ranking is critical in recommender systems for optimizing the order of recommendation lists, thus improving user satisfaction and platform revenue. Most existing methods follow a generator-evaluator paradigm, where the evaluator estimates the overall value of each candidate list. However, they often ignore the fact that users may exit before consuming the full list, leading to a mismatch between estimated generation value and actual consumption value. To bridge this gap, we propose CAVE, a personalized Consumption-Aware list Value Estimation framework. CAVE formulates the list value as the expectation over sub-list values, weighted by user-specific exit probabilities at each position. The exit probability is decomposed into an interest-driven component and a stochastic component, the latter modeled via a Weibull distribution to capture random external factors such as fatigue. By jointly modeling sub-list values and user exit behavior, CAVE yields a more faithful estimate of actual list consumption value. We further contribute three large-scale real-world list-wise benchmarks from the Kuaishou platform, varying in size and user activity patterns. Extensive experiments on these benchmarks, two Amazon datasets, and online A/B testing on Kuaishou show that CAVE consistently outperforms strong baselines, highlighting the benefit of explicitly modeling user exits in re-ranking.
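
The expectation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the discrete Weibull hazard, and the rule that combines the interest-driven and stochastic components (treated here as independent reasons to exit) are all assumptions made for illustration. The abstract only states that the list value is an expectation over sub-list values weighted by per-position exit probabilities, with the stochastic part modeled by a Weibull distribution.

```python
import math


def weibull_hazard(k, shape, scale):
    """Chance of a random exit at position k, given the user reached k.

    Assumption: discretized Weibull with survival S(k) = exp(-(k / scale) ** shape).
    """
    s_prev = math.exp(-(((k - 1) / scale) ** shape))
    s_curr = math.exp(-((k / scale) ** shape))
    return 1.0 - s_curr / s_prev


def exit_distribution(interest_exit, shape, scale):
    """Probability that the user exits exactly at each position 1..n.

    interest_exit[k - 1] is a hypothetical interest-driven exit probability
    at position k (in practice it would come from a learned model).
    Assumption: the two components act as independent exit causes.
    """
    n = len(interest_exit)
    probs = []
    reach = 1.0  # probability that the user reaches the current position
    for k in range(1, n + 1):
        stay = (1.0 - interest_exit[k - 1]) * (1.0 - weibull_hazard(k, shape, scale))
        hazard = 1.0 - stay
        if k == n:
            hazard = 1.0  # the list ends here, so everyone still present exits
        probs.append(reach * hazard)
        reach *= 1.0 - hazard
    return probs


def list_value(sublist_values, exit_probs):
    """Expected consumption value: sum over k of P(exit at k) * value of items 1..k."""
    return sum(p * v for p, v in zip(exit_probs, sublist_values))
```

Under these assumptions, a list whose value is concentrated in late positions scores lower for a user who tends to exit early, which is the mismatch between generation value and consumption value that the abstract describes.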

Related papers

Where is this coming from? Making groundedness count in the evaluation of Document VQA models [12.951716701565019]
We argue that common evaluation metrics do not account for the semantic and multimodal groundedness of a model's outputs. We propose a new evaluation methodology that accounts for the groundedness of predictions. Our proposed methodology is parameterized in such a way that users can configure the score according to their preferences.
arXiv Detail & Related papers (2025-03-24T20:14:46Z)
OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment [9.99840965933561]
We propose OneRec, which replaces the cascaded learning framework with a unified generative model. OneRec includes: 1) an encoder-decoder structure, which encodes the user's historical behavior sequences and gradually decodes the videos that the user may be interested in.
arXiv Detail & Related papers (2025-02-26T09:25:10Z)
How to Select Datapoints for Efficient Human Evaluation of NLG Models? [57.60407340254572]
We develop and analyze a suite of selectors to get the most informative datapoints for human evaluation. We show that selectors based on variance in automated metric scores, diversity in model outputs, or Item Response Theory outperform random selection. In particular, we introduce source-based estimators, which predict item usefulness for human evaluation just based on the source texts.
arXiv Detail & Related papers (2025-01-30T10:33:26Z)
Can Large Language Models Understand Preferences in Personalized Recommendation? [32.2250928311146]
We introduce PerRecBench, disassociating evaluation from user rating bias and item quality. We find that the LLM-based recommendation techniques that are generally good at rating prediction fail to identify users' favored and disfavored items when the user rating bias and item quality are eliminated. Our findings reveal the superiority of pairwise and listwise ranking approaches over pointwise ranking, PerRecBench's low correlation with traditional regression metrics, the importance of user profiles, and the role of pretraining data distributions.
arXiv Detail & Related papers (2025-01-23T05:24:18Z)
Beyond Positive History: Re-ranking with List-level Hybrid Feedback [49.52149227298746]
We propose Re-ranking with List-level Hybrid Feedback (dubbed RELIFE). It captures users' preferences and behavior patterns with three modules. Experiments show that RELIFE significantly outperforms SOTA re-ranking baselines.
arXiv Detail & Related papers (2024-10-28T06:39:01Z)
Item-based Variational Auto-encoder for Fair Music Recommendation [1.8782288713227568]
The EvalRS DataChallenge aims to build a more realistic recommender system considering accuracy, fairness, and diversity in evaluation. Our proposed system is based on an ensemble of an item-based variational auto-encoder (VAE) and a Bayesian personalized ranking matrix factorization (BPRMF).
arXiv Detail & Related papers (2022-10-24T06:42:16Z)
Recommendation Systems with Distribution-Free Reliability Guarantees [83.80644194980042]
We show how to return a set of items rigorously guaranteed to contain mostly good items. Our procedure endows any ranking model with rigorous finite-sample control of the false discovery rate. We evaluate our methods on the Yahoo! Learning to Rank and MSMarco datasets.
arXiv Detail & Related papers (2022-07-04T17:49:25Z)
PEAR: Personalized Re-ranking with Contextualized Transformer for Recommendation [48.17295872384401]
We present a personalized re-ranking model (dubbed PEAR) based on a contextualized transformer. PEAR makes several major improvements over the existing methods. We also augment the training of PEAR with a list-level classification task to assess users' satisfaction with the whole ranking list.
arXiv Detail & Related papers (2022-03-23T08:29:46Z)
PURS: Personalized Unexpected Recommender System for Improving User Satisfaction [76.98616102965023]
We describe a novel Personalized Unexpected Recommender System (PURS) model that incorporates unexpectedness into the recommendation process. Extensive offline experiments on three real-world datasets illustrate that the proposed PURS model significantly outperforms the state-of-the-art baseline approaches.
arXiv Detail & Related papers (2021-06-05T01:33:21Z)
Set2setRank: Collaborative Set to Set Ranking for Implicit Feedback based Recommendation [59.183016033308014]
In this paper, we explore the unique characteristics of implicit feedback and propose the Set2setRank framework for recommendation. Our proposed framework is model-agnostic and can be easily applied to most recommendation prediction approaches.
arXiv Detail & Related papers (2021-05-16T08:06:22Z)
SetRank: A Setwise Bayesian Approach for Collaborative Ranking from Implicit Feedback [50.13745601531148]
We propose a novel setwise Bayesian approach for collaborative ranking, namely SetRank, to accommodate the characteristics of implicit feedback in recommender systems. Specifically, SetRank aims at maximizing the posterior probability of novel setwise preference comparisons. We also present the theoretical analysis of SetRank to show that the bound of excess risk can be proportional to $\sqrt{M/N}$.
arXiv Detail & Related papers (2020-02-23T06:40:48Z)
