A Model-based Multi-Agent Personalized Short-Video Recommender System
- URL: http://arxiv.org/abs/2405.01847v1
- Date: Fri, 3 May 2024 04:34:36 GMT
- Title: A Model-based Multi-Agent Personalized Short-Video Recommender System
- Authors: Peilun Zhou, Xiaoxiao Xu, Lantao Hu, Han Li, Peng Jiang,
- Abstract summary: We propose an RL-based industrial short-video recommender ranking framework.
Our proposed framework adopts a model-based learning approach to alleviate the sample selection bias.
Our proposed approach has been deployed in our real large-scale short-video sharing platform.
- Score: 19.03089585214444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A recommender selects and presents the top-K items to the user at each online request, and a recommendation session consists of several sequential requests. Formulating a recommendation session as a Markov decision process and solving it with a reinforcement learning (RL) framework has attracted increasing attention from both the academic and industry communities. In this paper, we propose an RL-based industrial short-video recommender ranking framework, which models and maximizes user watch-time in an environment of multi-aspect user preferences through a collaborative multi-agent formulation. Moreover, our proposed framework adopts a model-based learning approach to alleviate sample selection bias, which is a crucial but intractable problem in industrial recommender systems. Extensive offline evaluations and live experiments confirm the effectiveness of our proposed method over alternatives. Our proposed approach has been deployed on our real large-scale short-video sharing platform, successfully serving hundreds of millions of users.
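The session-as-MDP framing in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Request` class, the `session_return` function, and the discount factor `gamma` are hypothetical names and assumptions introduced here. The abstract only states that a session is a sequence of requests and that watch-time is the quantity being maximized. The sketch treats each request as one step whose reward is the observed watch time and sums those rewards with standard RL discounting.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """One online request in a session (hypothetical structure)."""
    shown_items: List[str]   # the top-K items presented at this step
    watch_time_sec: float    # observed watch time, used as the step reward


def session_return(requests: List[Request], gamma: float = 0.9) -> float:
    """Discounted sum of per-request watch time over one session.

    gamma is an assumed discount factor; the abstract does not specify one.
    """
    total = 0.0
    for t, req in enumerate(requests):
        total += (gamma ** t) * req.watch_time_sec
    return total


if __name__ == "__main__":
    session = [
        Request(["v1", "v2"], 10.0),
        Request(["v3", "v4"], 20.0),
        Request(["v5", "v6"], 30.0),
    ]
    print(session_return(session, gamma=0.5))
```

An RL ranking policy in this framing would choose `shown_items` at each request so as to maximize the expected value of this return, rather than the watch time of a single request in isolation.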
Related papers
- Pareto Front Approximation for Multi-Objective Session-Based Recommender Systems [0.0]
MultiTRON is an approach that adapts approximation techniques to multi-objective session-based recommender systems.
Our approach optimizes trade-offs between key metrics such as click-through and conversion rates by training on sampled preference vectors.
We validate the model's performance through extensive offline and online evaluation.
arXiv Detail & Related papers (2024-07-23T20:38:23Z)
- Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
Vanilla reference-model-free methods involve independently scoring and selecting data in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
- A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation [77.42486522565295]
We propose a novel recommendation approach called LSVCR to jointly conduct personalized video and comment recommendation.
Our approach consists of two key components, namely a sequential recommendation (SR) model and a supplemental large language model (LLM) recommender.
In particular, we achieve a significant overall gain of 4.13% in comment watch time.
arXiv Detail & Related papers (2024-03-20T13:14:29Z)
- Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima [54.06000767038741]
We analyze multimodal recommender systems from the novel perspective of flat local minima.
We propose a concise yet effective gradient strategy called Mirror Gradient (MG).
We find that the proposed MG can complement existing robust training methods and be easily extended to diverse advanced recommendation models.
arXiv Detail & Related papers (2024-02-17T12:27:30Z)
- Curriculum-scheduled Knowledge Distillation from Multiple Pre-trained Teachers for Multi-domain Sequential Recommendation [102.91236882045021]
It is essential to explore how to use different pre-trained recommendation models efficiently in real-world systems.
We propose a novel curriculum-scheduled knowledge distillation from multiple pre-trained teachers for multi-domain sequential recommendation.
CKD-MDSR takes full advantage of different PRMs as multiple teacher models to boost a small student recommendation model.
arXiv Detail & Related papers (2024-01-01T15:57:15Z)
- Optimizing Audio Recommendations for the Long-Term: A Reinforcement Learning Perspective [11.31980071390936]
We present a novel podcast recommender system deployed at industrial scale.
In deviating from the pervasive industry practice of optimizing machine learning algorithms for short-term proxy metrics, the system substantially improves long-term performance in A/B tests.
arXiv Detail & Related papers (2023-02-07T16:17:25Z)
- Constrained Reinforcement Learning for Short Video Recommendation [18.492477839791274]
Short videos on social media platforms pose new challenges for optimizing recommender systems.
We propose a two-stage reinforcement learning approach based on the actor-critic framework.
Our approach has been fully launched in the production system to optimize user experiences.
arXiv Detail & Related papers (2022-05-26T09:36:20Z)
- A Review on Pushing the Limits of Baseline Recommendation Systems with the integration of Opinion Mining & Information Retrieval Techniques [0.0]
Recommendation Systems allow users to identify trending items among a community while being timely and relevant to the user's expectations.
Deep Learning methods have been brought forward to achieve better quality recommendations.
Researchers have tried to expand on the capabilities of standard recommendation systems to provide the most effective recommendations.
arXiv Detail & Related papers (2022-05-03T22:13:33Z)
- Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation [27.17948754183511]
Reinforcement learning has shown great promise in optimizing long-term user interest in recommender systems.
Existing RL-based recommendation methods need a large number of interactions for each user to learn a robust recommendation policy.
We propose a meta-level model-based reinforcement learning approach for fast user adaptation.
arXiv Detail & Related papers (2020-12-04T08:58:35Z)
- PinnerSage: Multi-Modal User Embedding Framework for Recommendations at Pinterest [54.56236567783225]
PinnerSage is an end-to-end recommender system that represents each user via multi-modal embeddings.
We conduct several offline and online A/B experiments to show that our method significantly outperforms single embedding methods.
arXiv Detail & Related papers (2020-07-07T17:13:20Z)
- Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on such an approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.