HyperBandit: Contextual Bandit with Hypernetwork for Time-Varying User
Preferences in Streaming Recommendation
- URL: http://arxiv.org/abs/2308.08497v1
- Date: Mon, 14 Aug 2023 14:04:57 GMT
- Title: HyperBandit: Contextual Bandit with Hypernetwork for Time-Varying User
Preferences in Streaming Recommendation
- Authors: Chenglei Shen, Xiao Zhang, Wei Wei, Jun Xu
- Abstract summary: Existing streaming recommender models only consider time as a timestamp.
We propose a contextual bandit approach using a hypernetwork, called HyperBandit.
We show that the proposed HyperBandit consistently outperforms the state-of-the-art baselines in terms of accumulated rewards.
- Score: 11.908362247624131
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In real-world streaming recommender systems, user preferences often
dynamically change over time (e.g., a user may have different preferences
during weekdays and weekends). Existing bandit-based streaming recommendation
models only consider time as a timestamp, without explicitly modeling the
relationship between time variables and time-varying user preferences. This
leads to recommendation models that cannot quickly adapt to dynamic scenarios.
To address this issue, we propose a contextual bandit approach using a
hypernetwork, called HyperBandit, which takes time features as input and
dynamically adjusts the recommendation model for time-varying user preferences.
Specifically, HyperBandit maintains a neural network capable of generating the
parameters for estimating time-varying rewards, taking into account the
correlation between time features and user preferences. Using the estimated
time-varying rewards, a bandit policy is employed to make online
recommendations by learning the latent item contexts. To meet the real-time
requirements in streaming recommendation scenarios, we verify the
existence of a low-rank structure in the parameter matrix and use low-rank
factorization for efficient training. Theoretically, we demonstrate a sublinear
regret upper bound against the best policy. Extensive experiments on real-world
datasets show that the proposed HyperBandit consistently outperforms the
state-of-the-art baselines in terms of accumulated rewards.
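The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class `LowRankHypernetwork`, the helpers `time_features` and `select_item`, the periodic time encoding, the bilinear reward form, the UCB-style bonus, and all dimensions are assumptions chosen for illustration. Training of the hypernetwork and learning of the latent item contexts are omitted. The sketch only shows how time features could be mapped to a low-rank parameter matrix that a bandit policy then uses to score items.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
D_TIME, D_USER, D_ITEM, RANK, HIDDEN = 4, 6, 5, 2, 16


def time_features(hour, is_weekend):
    """Encode time as periodic features plus a weekend flag (hypothetical encoding)."""
    angle = 2.0 * np.pi * hour / 24.0
    return np.array([np.sin(angle), np.cos(angle), float(is_weekend), 1.0])


class LowRankHypernetwork:
    """Maps time features to a parameter matrix Theta = A @ B.T of rank at most RANK."""

    def __init__(self):
        # Randomly initialised weights; a real system would train these online.
        self.W1 = rng.normal(scale=0.3, size=(HIDDEN, D_TIME))
        self.WA = rng.normal(scale=0.3, size=(D_USER * RANK, HIDDEN))
        self.WB = rng.normal(scale=0.3, size=(D_ITEM * RANK, HIDDEN))

    def generate(self, s):
        h = np.tanh(self.W1 @ s)                  # hidden representation of time
        A = (self.WA @ h).reshape(D_USER, RANK)   # left low-rank factor
        B = (self.WB @ h).reshape(D_ITEM, RANK)   # right low-rank factor
        return A @ B.T                            # shape (D_USER, D_ITEM)


def select_item(theta, user_vec, item_contexts, counts, t, alpha=0.5):
    """Pick the item with the highest estimated reward plus a UCB-style bonus."""
    est = item_contexts @ (theta.T @ user_vec)    # bilinear estimate, one per item
    bonus = alpha * np.sqrt(np.log(t + 1.0) / (counts + 1.0))
    return int(np.argmax(est + bonus)), est


# Usage: the same user is scored with different parameters at different times.
hyper = LowRankHypernetwork()
items = rng.normal(size=(7, D_ITEM))   # stand-in for latent item contexts
user = rng.normal(size=D_USER)
theta_weekday = hyper.generate(time_features(hour=9, is_weekend=False))
theta_weekend = hyper.generate(time_features(hour=21, is_weekend=True))
choice, estimates = select_item(theta_weekday, user, items, np.zeros(7), t=1)
```

Generating the two factors instead of the full matrix keeps the hypernetwork's output size at `(D_USER + D_ITEM) * RANK` rather than `D_USER * D_ITEM`, which is the kind of saving low-rank factorization is meant to provide.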
Related papers
- Modeling the Heterogeneous Duration of User Interest in Time-Dependent Recommendation: A Hidden Semi-Markov Approach [11.392605386729699]
We propose a hidden semi-Markov model to track the change of users' interests.
This model allows for capturing the different durations of user stays in a (latent) interest state.
We derive an algorithm to estimate the parameters and predict users' actions.
arXiv Detail & Related papers (2024-12-15T09:17:45Z)
- Sequential Recommendation on Temporal Proximities with Contrastive Learning and Self-Attention [3.7182810519704095]
Sequential recommender systems identify user preferences from their past interactions to predict subsequent items optimally.
Recent models often neglect similarities in users' actions that occur implicitly among users during analogous timeframes.
We propose a sequential recommendation model called TemProxRec, which includes contrastive learning and self-attention methods to consider temporal proximities.
arXiv Detail & Related papers (2024-02-15T08:33:16Z)
- Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits [55.03293214439741]
In contextual bandits, an agent sequentially makes actions from a time-dependent action set based on past experience.
We propose the first online continuous hyperparameter tuning framework for contextual bandits.
We show that it achieves sublinear regret in theory and performs consistently better than all existing methods on both synthetic and real datasets.
arXiv Detail & Related papers (2023-02-18T23:31:20Z)
- Time-aware Hyperbolic Graph Attention Network for Session-based Recommendation [58.748215444851226]
Session-based Recommendation (SBR) aims to predict the next items a user will be interested in based on their previous browsing sessions.
We propose Time-aware Hyperbolic Graph Attention Network (TA-HGAT) to build a session-based recommendation model considering temporal information.
arXiv Detail & Related papers (2023-01-10T04:16:09Z)
- Latent User Intent Modeling for Sequential Recommenders [92.66888409973495]
Sequential recommender models learn to predict the next items a user is likely to interact with based on his/her interaction history on the platform.
Most sequential recommenders however lack a higher-level understanding of user intents, which often drive user behaviors online.
Intent modeling is thus critical for understanding users and optimizing long-term user experience.
arXiv Detail & Related papers (2022-11-17T19:00:24Z)
- Modeling Dynamic User Preference via Dictionary Learning for Sequential Recommendation [133.8758914874593]
Capturing the dynamics in user preference is crucial to better predict users' future behaviors because user preferences often drift over time.
Many existing recommendation algorithms -- including both shallow and deep ones -- often model such dynamics independently.
This paper considers the problem of embedding a user's sequential behavior into the latent space of user preferences.
arXiv Detail & Related papers (2022-04-02T03:23:46Z)
- Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms [74.55200180156906]
The contextual bandit problem models the trade-off between exploration and exploitation.
We show our Syndicated Bandits framework can achieve the optimal regret upper bounds.
arXiv Detail & Related papers (2021-06-05T22:30:21Z)
- Learning Heterogeneous Temporal Patterns of User Preference for Timely Recommendation [15.930016839929047]
We propose a novel recommender system for timely recommendations, called TimelyRec.
In TimelyRec, a cascade of two encoders captures the temporal patterns of user preference using a proposed attention module for each encoder.
Our experiments on a scenario for item recommendation and the proposed scenario for item-timing recommendation on real-world datasets demonstrate the superiority of TimelyRec.
arXiv Detail & Related papers (2021-04-29T08:37:30Z)
- Non-Stationary Latent Bandits [68.21614490603758]
We propose a practical approach for fast personalization to non-stationary users.
The key idea is to frame this problem as a latent bandit, where prototypical models of user behavior are learned offline and the latent state of the user is inferred online.
We propose Thompson sampling algorithms for regret minimization in non-stationary latent bandits, analyze them, and evaluate them on a real-world dataset.
arXiv Detail & Related papers (2020-12-01T10:31:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.