Self-Concordant Analysis of Generalized Linear Bandits with Forgetting
- URL: http://arxiv.org/abs/2011.00819v2
- Date: Thu, 4 Mar 2021 09:37:14 GMT
- Title: Self-Concordant Analysis of Generalized Linear Bandits with Forgetting
- Authors: Yoan Russac (DI-ENS, CNRS, PSL, VALDA), Louis Faury, Olivier Cappé (DI-ENS, VALDA), Aurélien Garivier (UMPA-ENSL)
- Abstract summary: We focus on self-concordant GLB (which include logistic and Poisson regression) with forgetting achieved by the use of a sliding window or exponential weights. We propose a novel confidence-based algorithm for the maximum-likelihood estimator with forgetting to address non-stationarity in the Generalized Linear Bandits (GLB) problem.
- Score: 2.282313031205821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextual sequential decision problems with categorical or numerical
observations are ubiquitous and Generalized Linear Bandits (GLB) offer a solid
theoretical framework to address them. In contrast to the case of linear
bandits, existing algorithms for GLB have two drawbacks undermining their
applicability. First, they rely on excessively pessimistic concentration bounds
due to the non-linear nature of the model. Second, they require either
non-convex projection steps or burn-in phases to enforce boundedness of the
estimators. Both of these issues are worsened when considering non-stationary
models, in which the GLB parameter may vary with time. In this work, we focus
on self-concordant GLB (which include logistic and Poisson regression) with
forgetting achieved either by the use of a sliding window or exponential
weights. We propose a novel confidence-based algorithm for the maximum-likelihood
estimator with forgetting and analyze its performance in abruptly changing
environments. These results as well as the accompanying numerical simulations
highlight the potential of the proposed approach to address non-stationarity in
GLB.
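Since the abstract centers on maximum-likelihood estimation with forgetting, a minimal Python sketch of the idea follows: past observations are discounted with exponential weights, the logistic MLE is refit by Newton steps, and arms are chosen optimistically. The class name, hyperparameters, and the exploration bonus (a generic weighted-design UCB surrogate) are illustrative assumptions, not the paper's exact self-concordant construction.

```python
# Minimal sketch of a logistic bandit with exponential forgetting.
# Hypothetical names throughout; the bonus below is a generic UCB
# surrogate, not the paper's self-concordant confidence width.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

class ForgettingLogisticUCB:
    def __init__(self, dim, gamma=0.99, reg=1.0, alpha=1.0):
        self.gamma = gamma   # discount factor (exponential weights)
        self.reg = reg       # ridge regularization
        self.alpha = alpha   # exploration scaling
        self.theta = np.zeros(dim)
        self.X, self.y, self.w = [], [], []  # contexts, rewards, weights

    def update(self, x, r):
        self.w = [wi * self.gamma for wi in self.w]  # discount old samples
        self.X.append(x); self.y.append(r); self.w.append(1.0)
        self._refit()

    def _refit(self, steps=20):
        X, y, w = map(np.asarray, (self.X, self.y, self.w))
        theta = self.theta
        for _ in range(steps):  # Newton iterations on the weighted NLL
            p = sigmoid(X @ theta)
            g = X.T @ (w * (p - y)) + self.reg * theta
            H = (X.T @ (w[:, None] * (p * (1 - p))[:, None] * X)
                 + self.reg * np.eye(len(theta)))
            theta = theta - np.linalg.solve(H, g)
        self.theta = theta

    def select(self, arms):
        # The exponentially weighted design matrix drives the bonus.
        V = self.reg * np.eye(len(self.theta))
        if self.X:
            X, w = np.asarray(self.X), np.asarray(self.w)
            V = V + X.T @ (w[:, None] * X)
        Vinv = np.linalg.inv(V)
        ucb = [sigmoid(a @ self.theta) + self.alpha * np.sqrt(a @ Vinv @ a)
               for a in arms]
        return int(np.argmax(ucb))
```

A sliding-window variant would instead drop observations older than a fixed horizon rather than discounting them geometrically.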
Related papers
- Constrained Sampling with Primal-Dual Langevin Monte Carlo [15.634831573546041]
This work considers the problem of sampling from a probability distribution known up to a normalization constant, while satisfying a set of statistical constraints specified by the expected values of general nonlinear functions.
We put forward a discrete-time primal-dual Langevin Monte Carlo algorithm (PD-LMC) that simultaneously constrains the target distribution and samples from it.
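As a rough sketch of the primal-dual idea, the toy below samples a standard Gaussian target subject to a single mean constraint E[X] = c: a Langevin step on the Lagrangian-tilted potential alternates with dual ascent on the constraint violation. The target, constraint, and step sizes are illustrative assumptions, not the paper's PD-LMC settings.

```python
# Hedged sketch of primal-dual Langevin sampling under a moment constraint
# E[g(X)] = c; target, constraint, and step sizes are illustrative.
import numpy as np

def grad_log_target(x):
    return -x  # standard Gaussian, known up to normalization

def g(x):
    return x   # constrain the mean: E[X] = c

def pd_lmc(c=1.0, n_steps=20000, step_x=1e-2, step_lam=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    x, lam = 0.0, 0.0
    samples = []
    for _ in range(n_steps):
        # Langevin step on the tilted potential -log pi(x) + lam * (g(x) - c).
        drift = grad_log_target(x) - lam  # d/dx of lam * g(x) is lam here
        x = x + step_x * drift + np.sqrt(2 * step_x) * rng.standard_normal()
        # Dual ascent on the constraint violation, estimated from the sample.
        lam = lam + step_lam * (g(x) - c)
        samples.append(x)
    return np.array(samples), lam

samples, lam = pd_lmc()
print("empirical mean of late samples:", samples[10000:].mean())  # drifts toward c
```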
arXiv Detail & Related papers (2024-11-01T13:26:13Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
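For intuition, here is a toy likelihood-ratio confidence set for a Bernoulli mean: parameters whose likelihood ratio against the running MLE stays above alpha are kept. This thresholding rule is a simplification for illustration, not the paper's exact confidence-sequence construction.

```python
# Toy likelihood-ratio confidence set for a Bernoulli mean (illustrative).
import numpy as np

def bernoulli_loglik(theta, data):
    theta = np.clip(theta, 1e-12, 1 - 1e-12)
    k = data.sum()
    return k * np.log(theta) + (len(data) - k) * np.log(1 - theta)

def lr_confidence_set(data, alpha=0.05):
    grid = np.linspace(0.001, 0.999, 999)
    ll_mle = bernoulli_loglik(data.mean(), data)   # maximum-likelihood fit
    ll = np.array([bernoulli_loglik(t, data) for t in grid])
    # Keep theta whose likelihood ratio to the MLE stays above alpha.
    return grid[ll - ll_mle >= np.log(alpha)]

rng = np.random.default_rng(0)
data = rng.binomial(1, 0.3, size=200)
cs = lr_confidence_set(data)
print(f"confidence set roughly [{cs.min():.3f}, {cs.max():.3f}]")
```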
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
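A toy illustration of the mechanism, under the assumption that the base optimizer is plain gradient descent-ascent (GDA) on the bilinear game f(x, y) = x*y, which spirals outward on its own; interpolating back toward an anchor after a few inner steps contracts it. The mixing weight and inner-step count are illustrative.

```python
# Linear interpolation (lookahead-style) stabilizing divergent GDA on f(x, y) = x*y.
import numpy as np

def gda_step(z, eta=0.5):
    x, y = z
    return np.array([x - eta * y, y + eta * x])  # simultaneous descent-ascent

def run(lam, inner=5, outer=50):
    z = np.array([1.0, 1.0])
    for _ in range(outer):
        z_fast = z.copy()
        for _ in range(inner):                # run the (unstable) base optimizer
            z_fast = gda_step(z_fast)
        z = (1 - lam) * z + lam * z_fast      # interpolate toward the anchor
    return np.linalg.norm(z)

print("plain GDA    (lam=1.0):", run(1.0))   # diverges
print("interpolated (lam=0.5):", run(0.5))   # contracts toward the equilibrium (0, 0)
```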
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Optimal variance-reduced stochastic approximation in Banach spaces [114.8734960258221]
We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space.
We establish non-asymptotic bounds for both the operator defect and the estimation error.
arXiv Detail & Related papers (2022-01-21T02:46:57Z) - BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery [97.79015388276483]
A structural equation model (SEM) is an effective framework to reason over causal relationships represented via a directed acyclic graph (DAG).
Recent advances enabled effective maximum-likelihood point estimation of DAGs from observational data.
We propose BCD Nets, a variational framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.
arXiv Detail & Related papers (2021-12-06T03:35:21Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to non-linear settings via deep learning with bias constraints.
A second motivation for bias-constrained estimation (BCE) arises in applications where multiple estimates of the same unknown are averaged for improved performance.
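A hedged sketch of the bias-constraint idea: minimize empirical risk plus a squared mean-residual penalty. A plain linear model fitted by gradient descent stands in for the paper's deep networks, and the penalty weight mu is an illustrative choice.

```python
# Empirical risk + squared-bias penalty on a linear model (illustrative).
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 3
X = rng.standard_normal((n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(n)

w = np.zeros(d)
mu, lr = 5.0, 0.05   # penalty weight and step size (illustrative)
for _ in range(2000):
    resid = X @ w - y
    grad_mse = X.T @ resid / n                     # gradient of the MSE term
    grad_bias = 2 * resid.mean() * X.mean(axis=0)  # gradient of (mean residual)^2
    w -= lr * (grad_mse + mu * grad_bias)
print("weights:", w, "mean residual:", (X @ w - y).mean())
```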
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Interior Point Solving for LP-based prediction+optimisation [14.028706088791473]
We investigate the use of the more principled logarithmic barrier term, as widely used in interior point solvers for linear programming.
Our approach performs as well as, if not better than, the state-of-the-art QPTL (Quadratic Programming task loss) formulation of Wilder et al. and the SPO approach of Elmachtoub and Grigas.
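As a sketch of why the log barrier helps in this setting, the toy below smooths a box-constrained LP, min c^T x over 0 <= x <= 1: the barrier makes the minimizer a differentiable function of c, the property needed for end-to-end prediction+optimisation. Problem data and step sizes are illustrative.

```python
# Log-barrier smoothing of a box-constrained LP (illustrative).
import numpy as np

def barrier_lp(c, mu=0.1, steps=5000, lr=0.01):
    x = np.full(len(c), 0.5)  # strictly interior start
    for _ in range(steps):
        # gradient of c @ x - mu * (log(x) + log(1 - x)).sum()
        grad = c - mu / x + mu / (1 - x)
        x = np.clip(x - lr * grad, 1e-6, 1 - 1e-6)
    return x

c = np.array([1.0, -2.0, 0.5])
print(barrier_lp(c, mu=0.5))    # smooth interior point
print(barrier_lp(c, mu=0.01))   # approaches the LP vertex [0, 1, 0]
```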
arXiv Detail & Related papers (2020-10-26T23:05:21Z) - Consistent Second-Order Conic Integer Programming for Learning Bayesian
Networks [2.7473982588529653]
We study the problem of learning the sparse DAG structure of a BN from continuous observational data.
The optimal solution to this mathematical program is known to have desirable statistical properties under certain conditions.
We propose a concrete early stopping criterion to terminate the branch-and-bound process in order to obtain a near-optimal solution.
arXiv Detail & Related papers (2020-05-29T00:13:15Z) - On the Convergence Rate of Projected Gradient Descent for a
Back-Projection based Objective [58.33065918353532]
We consider a back-projection (BP) based fidelity term as an alternative to the common least squares (LS) term.
We show that using the BP term, rather than the LS term, requires fewer iterations of optimization algorithms.
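A hedged numerical comparison, assuming a full-row-rank A so that the pseudo-inverse satisfies A^+ = A^T (A A^T)^{-1}: gradient descent on the BP objective ||A^+(Ax - y)||^2 is effectively preconditioned and drives the residual down far faster than on the LS objective ||Ax - y||^2 when A is ill-conditioned. Sizes and step sizes are illustrative.

```python
# Gradient descent on LS vs. back-projection objectives (illustrative).
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 50
A = rng.standard_normal((m, n)) * np.logspace(0, -3, m)[:, None]  # ill-conditioned
y = A @ rng.standard_normal(n)
AAt_inv = np.linalg.inv(A @ A.T)

def run(bp, steps=50):
    x = np.zeros(n)
    lr = 0.9 if bp else 0.9 / np.linalg.norm(A, 2) ** 2  # stable step sizes
    for _ in range(steps):
        r = A @ x - y
        grad = A.T @ (AAt_inv @ r) if bp else A.T @ r    # BP grad is preconditioned
        x = x - lr * grad
    return np.linalg.norm(A @ x - y)

print("LS residual after 50 iters:", run(bp=False))
print("BP residual after 50 iters:", run(bp=True))
```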
arXiv Detail & Related papers (2020-05-03T00:58:23Z) - Algorithms for Non-Stationary Generalized Linear Bandits [0.0]
Logistic regression is well known to be preferable to standard linear modeling.
We propose two upper confidence bound based algorithms that make use of either a sliding window or a maximum-likelihood estimator.
We provide theoretical guarantees on the behavior of these algorithms for general context sequences and in the presence of abrupt changes.
arXiv Detail & Related papers (2020-03-23T07:44:59Z) - On Low-rank Trace Regression under General Sampling Distribution [9.699586426043885]
We show that cross-validated estimators satisfy near-optimal error bounds under general assumptions.
We also show that the cross-validated estimator outperforms the theory-inspired approach of selecting the regularization parameter.
arXiv Detail & Related papers (2019-04-18T02:56:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.