Non-Stationary Bandits with Auto-Regressive Temporal Dependency
- URL: http://arxiv.org/abs/2210.16386v3
- Date: Tue, 12 Dec 2023 18:42:37 GMT
- Title: Non-Stationary Bandits with Auto-Regressive Temporal Dependency
- Authors: Qinyi Chen, Negin Golrezaei, Djallel Bouneffouf
- Abstract summary: This paper introduces a novel non-stationary MAB framework that captures the temporal structure of real-world dynamics through an auto-regressive (AR) reward structure.
We propose an algorithm that integrates two key mechanisms: (i) an alternation mechanism adept at leveraging temporal dependencies to dynamically balance exploration and exploitation, and (ii) a restarting mechanism designed to discard out-of-date information.
- Score: 14.093856726745662
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional multi-armed bandit (MAB) frameworks, predominantly examined under
stochastic or adversarial settings, often overlook the temporal dynamics
inherent in many real-world applications such as recommendation systems and
online advertising. This paper introduces a novel non-stationary MAB framework
that captures the temporal structure of these real-world dynamics through an
auto-regressive (AR) reward structure. We propose an algorithm that integrates
two key mechanisms: (i) an alternation mechanism adept at leveraging temporal
dependencies to dynamically balance exploration and exploitation, and (ii) a
restarting mechanism designed to discard out-of-date information. Our algorithm
achieves a regret upper bound that nearly matches the lower bound, with regret
measured against a robust dynamic benchmark. Finally, via a real-world case
study on tourism demand prediction, we demonstrate both the efficacy of our
algorithm and the broader applicability of our techniques to more complex,
rapidly evolving time series.
Related papers
- Autoregressive Moving-average Attention Mechanism for Time Series Forecasting [9.114664059026767]
We propose an Autoregressive (AR) Moving-average (MA) attention structure that can adapt to various linear attention mechanisms.
In this paper, we first demonstrate that, for the time series forecasting (TSF) task, the previously overlooked decoder-only autoregressive Transformer model can achieve results comparable to the best baselines.
arXiv Detail & Related papers (2024-10-04T05:45:50Z) - SFANet: Spatial-Frequency Attention Network for Weather Forecasting [54.470205739015434]
Weather forecasting plays a critical role in various sectors, driving decision-making and risk management.
Traditional methods often struggle to capture the complex dynamics of meteorological systems.
We propose a novel framework designed to address these challenges and enhance the accuracy of weather prediction.
arXiv Detail & Related papers (2024-05-29T08:00:15Z) - GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks [24.323017830938394]
This work aims to address challenges by introducing a pre-training framework that seamlessly integrates with baselines and enhances their performance.
The framework is built upon two key designs: (i) We propose a.
apple-to-apple mask autoencoder as a pre-training model for learning-temporal dependencies.
These modules are specifically designed to capture intra-temporal customized representations and semantic- and inter-cluster relationships.
arXiv Detail & Related papers (2023-11-07T02:36:24Z) - Revisiting the Temporal Modeling in Spatio-Temporal Predictive Learning
under A Unified View [73.73667848619343]
We introduce USTEP (Unified S-TEmporal Predictive learning), an innovative framework that reconciles the recurrent-based and recurrent-free methods by integrating both micro-temporal and macro-temporal scales.
arXiv Detail & Related papers (2023-10-09T16:17:42Z) - Easy attention: A simple attention mechanism for temporal predictions with transformers [2.172584429650463]
We show that the keys, queries and softmax are not necessary for obtaining the attention score required to capture long-term dependencies in temporal sequences.
Our proposed easy-attention method directly treats the attention scores as learnable parameters.
This approach produces excellent results when reconstructing and predicting the temporal dynamics of chaotic systems.
arXiv Detail & Related papers (2023-08-24T15:54:32Z) - Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations [98.5802673062712]
We introduce temporally-coupled perturbations, presenting a novel challenge for existing robust reinforcement learning methods.
We propose GRAD, a novel game-theoretic approach that treats the temporally-coupled robust RL problem as a partially observable two-player zero-sum game.
arXiv Detail & Related papers (2023-07-22T12:10:04Z) - Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline
Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z) - Accelerated Continuous-Time Approximate Dynamic Programming via
Data-Assisted Hybrid Control [0.0]
We introduce an algorithm that incorporates dynamic momentum in actor-critic structures to control continuous-time dynamic plants with an affine structure in the input.
By incorporating dynamic momentum in our algorithm, we are able to accelerate the convergence properties of the closed-loop system.
arXiv Detail & Related papers (2022-04-27T05:36:51Z) - Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement
Learning Approach [130.9259586568977]
We propose novel learning algorithms to recover the dynamic Vickrey-Clarke-Grove (VCG) mechanism over multiple rounds of interaction.
A key contribution of our approach is incorporating reward-free online Reinforcement Learning (RL) to aid exploration over a rich policy space.
arXiv Detail & Related papers (2022-02-25T16:17:23Z) - Stochastically forced ensemble dynamic mode decomposition for
forecasting and analysis of near-periodic systems [65.44033635330604]
We introduce a novel load forecasting method in which observed dynamics are modeled as a forced linear system.
We show that its use of intrinsic linear dynamics offers a number of desirable properties in terms of interpretability and parsimony.
Results are presented for a test case using load data from an electrical grid.
arXiv Detail & Related papers (2020-10-08T20:25:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.