Algorithms for Non-Stationary Generalized Linear Bandits
- URL: http://arxiv.org/abs/2003.10113v1
- Date: Mon, 23 Mar 2020 07:44:59 GMT
- Title: Algorithms for Non-Stationary Generalized Linear Bandits
- Authors: Yoan Russac (DI-ENS), Olivier Capp\'e (DI-ENS), Aur\'elien Garivier
(UMPA-ENSL)
- Abstract summary: logistic regression is well-known to be preferable to the use of standard linear modeling.
We propose two upper confidence bound based algorithms that make use of either a sliding window or a maximum-likelihood estimator.
We provide theoretical guarantees on the behavior of these algorithms for general context sequences and in the presence of abrupt changes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The statistical framework of Generalized Linear Models (GLM) can be applied
to sequential problems involving categorical or ordinal rewards associated, for
instance, with clicks, likes or ratings. In the example of binary rewards,
logistic regression is well-known to be preferable to the use of standard
linear modeling. Previous works have shown how to deal with GLMs in contextual
online learning with bandit feedback when the environment is assumed to be
stationary. In this paper, we relax this latter assumption and propose two
upper confidence bound based algorithms that make use of either a sliding
window or a discounted maximum-likelihood estimator. We provide theoretical
guarantees on the behavior of these algorithms for general context sequences
and in the presence of abrupt changes. These results take the form of high
probability upper bounds for the dynamic regret that are of order d^2/3 G^1/3
T^2/3 , where d, T and G are respectively the dimension of the unknown
parameter, the number of rounds and the number of breakpoints up to time T. The
empirical performance of the algorithms is illustrated in simulated
environments.
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Online and Offline Robust Multivariate Linear Regression [0.3277163122167433]
We introduce two methods each considered contrast: (i) online gradient descent algorithms and their averaged versions and (ii) offline fix-point algorithms.
Because the variance matrix of the noise is usually unknown, we propose to plug a robust estimate of it in the Mahalanobis-based gradient descent algorithms.
arXiv Detail & Related papers (2024-04-30T12:30:48Z) - Learning a Gaussian Mixture for Sparsity Regularization in Inverse
Problems [2.375943263571389]
In inverse problems, the incorporation of a sparsity prior yields a regularization effect on the solution.
We propose a probabilistic sparsity prior formulated as a mixture of Gaussians, capable of modeling sparsity with respect to a generic basis.
We put forth both a supervised and an unsupervised training strategy to estimate the parameters of this network.
arXiv Detail & Related papers (2024-01-29T22:52:57Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by gradient descent (SGD)
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - Sharp global convergence guarantees for iterative nonconvex
optimization: A Gaussian process perspective [30.524043513721168]
We develop a general recipe for analyzing the convergence of iterative algorithms for a class of regression models.
deterministicly, we accurately capture both the convergence rate of the algorithm and the eventual error floor in the finite-sample regime.
We show sharp convergence rates for both higher-order algorithms based on alternating updates and first-order algorithms based on subgradient subgradients.
arXiv Detail & Related papers (2021-09-20T21:48:19Z) - Identification and Adaptation with Binary-Valued Observations under
Non-Persistent Excitation Condition [1.6897716547971817]
We propose an online projected Quasi-Newton type algorithm for estimation of parameter estimation of regression models with binary-valued observations.
We establish the strong consistency of the estimation algorithm and provide the convergence rate.
Convergence of adaptive predictors and their applications in adaptive control are also discussed.
arXiv Detail & Related papers (2021-07-08T03:57:50Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Self-Concordant Analysis of Generalized Linear Bandits with Forgetting [2.282313031205821]
We focus on self-concordant GLB (which include logistic regression) with achieved by the use of a Poisson window or exponential weights.
We propose a novel approach to address the potential approach to address the proposed approach to address the Generalized Bandits (GLB) problem.
arXiv Detail & Related papers (2020-11-02T08:36:39Z) - Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN)
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
arXiv Detail & Related papers (2020-08-19T12:35:55Z) - When Does Preconditioning Help or Hurt Generalization? [74.25170084614098]
We show how the textitimplicit bias of first and second order methods affects the comparison of generalization properties.
We discuss several approaches to manage the bias-variance tradeoff, and the potential benefit of interpolating between GD and NGD.
arXiv Detail & Related papers (2020-06-18T17:57:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.