Algorithms for Non-Stationary Generalized Linear Bandits
- URL: http://arxiv.org/abs/2003.10113v1
- Date: Mon, 23 Mar 2020 07:44:59 GMT
- Title: Algorithms for Non-Stationary Generalized Linear Bandits
- Authors: Yoan Russac (DI-ENS), Olivier Cappé (DI-ENS), Aurélien Garivier
(UMPA-ENSL)
- Abstract summary: For binary rewards, logistic regression is well-known to be preferable to standard linear modeling.
We propose two upper confidence bound based algorithms that make use of either a sliding window or a discounted maximum-likelihood estimator.
We provide theoretical guarantees on the behavior of these algorithms for general context sequences and in the presence of abrupt changes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The statistical framework of Generalized Linear Models (GLM) can be applied
to sequential problems involving categorical or ordinal rewards associated, for
instance, with clicks, likes or ratings. In the example of binary rewards,
logistic regression is well-known to be preferable to the use of standard
linear modeling. Previous works have shown how to deal with GLMs in contextual
online learning with bandit feedback when the environment is assumed to be
stationary. In this paper, we relax this latter assumption and propose two
upper confidence bound based algorithms that make use of either a sliding
window or a discounted maximum-likelihood estimator. We provide theoretical
guarantees on the behavior of these algorithms for general context sequences
and in the presence of abrupt changes. These results take the form of high
probability upper bounds for the dynamic regret that are of order
d^(2/3) G^(1/3) T^(2/3), where d, T and G are respectively the dimension of the
unknown parameter, the number of rounds and the number of breakpoints up to time T. The
empirical performance of the algorithms is illustrated in simulated
environments.
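The sliding-window idea from the abstract can be sketched in a few lines: fit a penalized logistic maximum-likelihood estimate using only the most recent rounds, so that the estimate tracks a parameter that changes abruptly. This is an illustrative sketch only; the function name, the plain gradient-ascent solver and all constants are assumptions, and the paper's actual algorithms additionally couple the estimator with upper confidence bounds for arm selection.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def windowed_logistic_mle(contexts, rewards, window, lam=1.0, lr=1.0, n_steps=300):
    """Penalized logistic MLE computed on only the last `window` rounds.

    Forgetting old rounds is the sliding-window route to tracking an
    abruptly changing parameter; the paper's second variant discounts
    past log-likelihood terms instead of dropping them.
    """
    X = np.asarray(contexts[-window:], dtype=float)
    y = np.asarray(rewards[-window:], dtype=float)
    theta = np.zeros(X.shape[1])
    for _ in range(n_steps):
        # gradient of the average penalized log-likelihood
        grad = (X.T @ (y - sigmoid(X @ theta)) - lam * theta) / len(y)
        theta += lr * grad
    return theta

# Demo: one abrupt change (breakpoint) at t = 200.
rng = np.random.default_rng(0)
theta_true = np.array([2.0, 0.0])
contexts, rewards = [], []
for t in range(400):
    if t == 200:
        theta_true = np.array([-2.0, 0.0])
    x = rng.uniform(-1.0, 1.0, size=2)
    contexts.append(x)
    rewards.append(float(rng.random() < sigmoid(x @ theta_true)))

theta_hat = windowed_logistic_mle(contexts, rewards, window=100)
# After the breakpoint, the window covers only post-change rounds, so the
# estimate follows the new parameter (first coordinate negative).
```

A full estimator on all 400 rounds would average the two regimes and sit near zero; the window is what restores tracking after the change.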
Related papers
- Online and Offline Robust Multivariate Linear Regression [0.3277163122167433]
We introduce two methods: (i) online gradient descent algorithms and their averaged versions and (ii) offline fixed-point algorithms.
Because the variance matrix of the noise is usually unknown, we propose to plug a robust estimate of it into the Mahalanobis-based gradient descent algorithms.
arXiv Detail & Related papers (2024-04-30T12:30:48Z) - Learning a Gaussian Mixture for Sparsity Regularization in Inverse
Problems [2.375943263571389]
In inverse problems, the incorporation of a sparsity prior yields a regularization effect on the solution.
We propose a probabilistic sparsity prior formulated as a mixture of Gaussians, capable of modeling sparsity with respect to a generic basis.
We put forth both a supervised and an unsupervised training strategy to estimate the parameters of this network.
arXiv Detail & Related papers (2024-01-29T22:52:57Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
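The model this summary refers to can be sketched concretely: a random feature map is drawn once and frozen, and only the top linear layer is trained with constant-stepsize SGD. The dimensions, target function and step size below are illustrative assumptions and say nothing about the high-dimensional regimes analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random Fourier-style feature map x -> cos(W x + b); W and b are drawn
# once at initialization and frozen, so only the top linear layer trains.
n, d, N = 500, 5, 200                    # samples, input dim, random features
X = rng.normal(size=(n, d))
y = X[:, 0] + 0.1 * rng.normal(size=n)   # noisy target

W = rng.normal(size=(d, N))
b = rng.uniform(0.0, 2.0 * np.pi, size=N)
Phi = np.cos(X @ W + b) / np.sqrt(N)

theta = np.zeros(N)
step = 0.5                               # constant step size throughout
for _ in range(50):                      # 50 SGD passes over the data
    for i in rng.permutation(n):
        err = Phi[i] @ theta - y[i]
        theta -= step * err * Phi[i]

train_mse = np.mean((Phi @ theta - y) ** 2)
```

In this linear-in-`theta` setting each SGD update is stable because the per-sample feature norm is bounded by 1, so a fixed step size suffices; double descent shows up when the ratio N/n is swept through 1, which this single configuration does not do.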
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - Sharp global convergence guarantees for iterative nonconvex
optimization: A Gaussian process perspective [30.524043513721168]
We develop a general recipe for analyzing the convergence of iterative algorithms for a class of regression models.
Our analysis accurately captures both the convergence rate of the algorithm and the eventual error floor in the finite-sample regime.
We show sharp convergence rates for both higher-order algorithms based on alternating updates and first-order algorithms based on subgradient descent.
arXiv Detail & Related papers (2021-09-20T21:48:19Z) - Identification and Adaptation with Binary-Valued Observations under
Non-Persistent Excitation Condition [1.6897716547971817]
We propose an online projected Quasi-Newton type algorithm for parameter estimation in regression models with binary-valued observations.
We establish the strong consistency of the estimation algorithm and provide the convergence rate.
Convergence of adaptive predictors and their applications in adaptive control are also discussed.
arXiv Detail & Related papers (2021-07-08T03:57:50Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
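The role of gradient clipping in this setting can be sketched as follows: truncating each stochastic gradient to a fixed norm bounds the effect of heavy-tailed noise, which is what makes high-probability guarantees possible without light-tailed (e.g. sub-Gaussian) assumptions. The objective, noise model and constants below are illustrative assumptions, not the paper's stepsize rules.

```python
import numpy as np

def clip(g, c):
    """Rescale gradient g so its Euclidean norm is at most c."""
    norm = np.linalg.norm(g)
    return g if norm <= c else g * (c / norm)

def clipped_sgd(grad_fn, x0, step, clip_level, n_iters, rng):
    """SGD whose stochastic gradients are norm-clipped at every step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = x - step * clip(grad_fn(x, rng), clip_level)
    return x

# Demo: minimize f(x) = ||x||^2 / 2 under Student-t gradient noise with
# df=2, i.e. noise whose variance is infinite.
rng = np.random.default_rng(0)
noisy_grad = lambda x, rng: x + rng.standard_t(df=2, size=x.shape)
x_final = clipped_sgd(noisy_grad, np.full(3, 10.0), step=0.05,
                      clip_level=5.0, n_iters=2000, rng=rng)
```

Without the clip, a single extreme noise draw can throw the iterate arbitrarily far; with it, every step is bounded by `step * clip_level`, so the iterate stays in a small neighborhood of the minimizer with high probability.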
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded in terms of the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Self-Concordant Analysis of Generalized Linear Bandits with Forgetting [2.282313031205821]
We focus on self-concordant GLB (which include logistic regression), with forgetting achieved by the use of a sliding window or exponential weights.
We propose a novel approach to the non-stationary Generalized Linear Bandits (GLB) problem.
arXiv Detail & Related papers (2020-11-02T08:36:39Z) - Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN)
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
arXiv Detail & Related papers (2020-08-19T12:35:55Z) - When Does Preconditioning Help or Hurt Generalization? [74.25170084614098]
We show how the implicit bias of first- and second-order methods affects the comparison of generalization properties.
We discuss several approaches to manage the bias-variance tradeoff, and the potential benefit of interpolating between GD and NGD.
arXiv Detail & Related papers (2020-06-18T17:57:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.