Related papers: Dynamic Slate Recommendation with Gated Recurrent Units and Thompson Sampling

URL: http://arxiv.org/abs/2104.15046v1
Date: Fri, 30 Apr 2021 15:16:35 GMT
Title: Dynamic Slate Recommendation with Gated Recurrent Units and Thompson Sampling
Authors: Simen Eide, David S. Leslie, Arnoldo Frigessi
Abstract summary: We consider the problem of recommending relevant content to users of an internet platform in the form of lists of items, called slates. We introduce a variational Bayesian Recurrent Neural Net recommender system that acts on time series of interactions between the internet platform and the user. We show experimentally that explorative recommender strategies perform on par with or above their greedy counterparts.
Score: 6.312395952874578
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We consider the problem of recommending relevant content to users of an internet platform in the form of lists of items, called slates. We introduce a variational Bayesian Recurrent Neural Net recommender system that acts on time series of interactions between the internet platform and the user, and which scales to real-world industrial situations. The recommender system is tested both online on real users, and on an offline dataset collected from a Norwegian web-based marketplace, FINN.no, that is made public for research. This is one of the first publicly available datasets which includes all the slates that are presented to users as well as which items (if any) in the slates were clicked on. Such a dataset allows us to move beyond the common implicit assumption that users consider all possible items at each interaction. Instead we build our likelihood using the items that are actually in the slate, and evaluate the strengths and weaknesses of both approaches theoretically and in experiments. We also introduce a hierarchical prior for the item parameters based on group memberships. Both item parameters and user preferences are learned probabilistically. Furthermore, we combine our model with bandit strategies to ensure learning, and introduce 'in-slate Thompson Sampling', which makes use of the slates to maximise explorative opportunities. We show experimentally that explorative recommender strategies perform on par with or above their greedy counterparts. Even without making use of exploration to learn more effectively, click rates increase simply because of improved diversity in the recommended slates.
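
The contrast the abstract draws between a likelihood over the items actually in the slate and one over all possible items can be illustrated with a minimal sketch. This is not the authors' model: the softmax form, the item scores, the `no_click` option and its score, and all function names are assumptions made here for illustration only.

```python
import math

NO_CLICK = "no_click"  # hypothetical label for "user clicked nothing"


def softmax_log_prob(scores, chosen):
    """Log-probability of `chosen` under a softmax over `scores` (dict: id -> score)."""
    m = max(scores.values())  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return scores[chosen] - log_norm


def slate_log_likelihood(all_scores, slate, clicked, no_click_score=0.0):
    """Log-likelihood normalised only over the items shown, plus a no-click option."""
    shown = {item: all_scores[item] for item in slate}
    shown[NO_CLICK] = no_click_score  # assumed score for the no-click outcome
    return softmax_log_prob(shown, clicked)


def all_item_log_likelihood(all_scores, clicked):
    """Log-likelihood normalised over every item, as if the user saw the whole catalogue."""
    return softmax_log_prob(all_scores, clicked)


# Made-up relevance scores for a five-item catalogue.
all_scores = {"a": 2.0, "b": 1.0, "c": 0.5, "d": 0.0, "e": -1.0}
slate = ["b", "c", "d"]  # items actually presented to the user

ll_slate = slate_log_likelihood(all_scores, slate, clicked="b")
ll_all = all_item_log_likelihood(all_scores, clicked="b")
print(f"slate likelihood: {ll_slate:.3f}, all-item likelihood: {ll_all:.3f}")
```

In this toy setup the click on "b" is penalised in the all-item version for not being a click on "a", even though "a" was never shown; the slate version compares "b" only against the alternatives the user could actually have chosen.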

Related papers

Epinet for Content Cold Start [14.018820788546535]
Epinets enable efficient approximations of Thompson sampling even when the learning model is a complex neural network. Our experiments demonstrate improvements in both user traffic and engagement efficiency on the Facebook Reels online video platform.
arXiv Detail & Related papers (2024-11-20T19:43:27Z)
Measuring Strategization in Recommendation: Users Adapt Their Behavior to Shape Future Content [66.71102704873185]
We test for user strategization by conducting a lab experiment and survey. We find strong evidence of strategization across outcome metrics, including participants' dwell time and use of "likes". Our findings suggest that platforms cannot ignore the effect of their algorithms on user behavior.
arXiv Detail & Related papers (2024-05-09T07:36:08Z)
AutoSAM: Towards Automatic Sampling of User Behaviors for Sequential Recommender Systems [48.461157194277504]
We propose a general automatic sampling framework, named AutoSAM, to non-uniformly treat historical behaviors. Specifically, AutoSAM augments the standard sequential recommendation architecture with an additional sampler layer to adaptively learn the skew distribution of the raw input. We theoretically design multi-objective sampling rewards including Future Prediction and Sequence Perplexity, and then optimize the whole framework in an end-to-end manner.
arXiv Detail & Related papers (2023-11-01T09:25:21Z)
PARSRec: Explainable Personalized Attention-fused Recurrent Sequential Recommendation Using Session Partial Actions [0.5801044612920815]
We propose an architecture that relies on common patterns as well as individual behaviors to tailor its recommendations for each person. Our empirical results on the Nielsen Consumer Panel dataset indicate that the proposed approach achieves up to 27.9% performance improvement.
arXiv Detail & Related papers (2022-09-16T12:07:43Z)
Incentivizing Combinatorial Bandit Exploration [87.08827496301839]
Consider a bandit algorithm that recommends actions to self-interested users in a recommendation system. Users are free to choose other actions and need to be incentivized to follow the algorithm's recommendations. While the users prefer to exploit, the algorithm can incentivize them to explore by leveraging the information collected from the previous users.
arXiv Detail & Related papers (2022-06-01T13:46:25Z)
Learning Personalized Item-to-Item Recommendation Metric via Implicit Feedback [24.37151414523712]
This paper studies the item-to-item recommendation problem in recommender systems from a new perspective of metric learning via implicit feedback. We develop and investigate a personalizable deep metric model that captures both the internal contents of items and how they were interacted with by users.
arXiv Detail & Related papers (2022-03-18T18:08:57Z)
FINN.no Slates Dataset: A new Sequential Dataset Logging Interactions, allViewed Items and Click Responses/No-Click for Recommender Systems Research [4.792216056979392]
We present a novel recommender systems dataset that records the sequential interactions between users and an online marketplace. The dataset includes the presented slates at each round, whether the user clicked on any of these items and which item the user clicked on.
arXiv Detail & Related papers (2021-11-05T09:21:58Z)
Set2setRank: Collaborative Set to Set Ranking for Implicit Feedback based Recommendation [59.183016033308014]
In this paper, we explore the unique characteristics of the implicit feedback and propose Set2setRank framework for recommendation. Our proposed framework is model-agnostic and can be easily applied to most recommendation prediction approaches.
arXiv Detail & Related papers (2021-05-16T08:06:22Z)
Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback [62.997667081978825]
We present a novel approach for considering user feedback and evaluate it using three distinct strategies. Despite the limited amount of feedback returned by users (as low as 20% of the total), our approach obtains results similar to those of state-of-the-art approaches.
arXiv Detail & Related papers (2020-09-16T07:32:51Z)
Deep Bayesian Bandits: Exploring in Online Personalized Recommendations [4.845576821204241]
We formulate a display advertising recommender as a contextual bandit. We implement exploration techniques that require sampling from the posterior distribution of click-through-rates. We test our proposed deep Bayesian bandits algorithm in the offline simulation and online AB setting.
arXiv Detail & Related papers (2020-08-03T08:58:18Z)
Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users [111.28351584726092]
We consider the conversational recommendation for cold-start users, where a system can both ask the attributes from and recommend items to a user interactively. Our Conversational Thompson Sampling (ConTS) model holistically solves all questions in conversational recommendation by choosing the arm with the maximal reward to play.
arXiv Detail & Related papers (2020-05-23T08:56:37Z)
Controllable Multi-Interest Framework for Recommendation [64.30030600415654]
We formalize the recommender system as a sequential recommendation problem. We propose a novel controllable multi-interest framework for the sequential recommendation, called ComiRec. Our framework has been successfully deployed on the offline Alibaba distributed cloud platform.
arXiv Detail & Related papers (2020-05-19T10:18:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.