Related papers: Statistical Inference under Adaptive Sampling with LinUCB

Statistical Inference under Adaptive Sampling with LinUCB

URL: http://arxiv.org/abs/2512.00222v1
Date: Fri, 28 Nov 2025 21:48:18 GMT
Title: Statistical Inference under Adaptive Sampling with LinUCB
Authors: Wei Fan, Kevin Tan, Yuting Wei,
Abstract summary: We show that the linear upper confidence bound (LinUCB) algorithm for linear bandits satisfies a property called stability.<n>We establish a central limit theorem for the LinUCB algorithm, establishing normality for the limiting distribution of the estimation error.
Score: 15.167069362020426
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adaptively collected data has become ubiquitous within modern practice. However, even seemingly benign adaptive sampling schemes can introduce severe biases, rendering traditional statistical inference tools inapplicable. This can be mitigated by a property called stability, which states that if the rate at which an algorithm takes actions converges to a deterministic limit, one can expect that certain parameters are asymptotically normal. Building on a recent line of work for the multi-armed bandit setting, we show that the linear upper confidence bound (LinUCB) algorithm for linear bandits satisfies this property. In doing so, we painstakingly characterize the behavior of the eigenvalues and eigenvectors of the random design feature covariance matrix in the setting where the action set is the unit ball, showing that it decomposes into a rank-one direction that locks onto the true parameter and an almost-isotropic bulk that grows at a predictable $\sqrt{T}$ rate. This allows us to establish a central limit theorem for the LinUCB algorithm, establishing asymptotic normality for the limiting distribution of the estimation error where the convergence occurs at a $T^{-1/4}$ rate. The resulting Wald-type confidence sets and hypothesis tests do not depend on the feature covariance matrix and are asymptotically tighter than existing nonasymptotic confidence sets. Numerical simulations corroborate our findings.

Related papers

Unregularized Linear Convergence in Zero-Sum Game from Preference Feedback [50.89125374999765]
We provide the first convergence guarantee for Optimistic Multiplicative Weights Update ($mathtOMWU$) in NLHF.<n>Our analysis identifies a novel marginal convergence behavior, where the probability of rarely played actions grows exponentially from exponentially small values.
arXiv Detail & Related papers (2025-12-31T12:08:29Z)
Avoiding the Price of Adaptivity: Inference in Linear Contextual Bandits via Stability [2.5782420501870296]
We argue that stability and statistical efficiency can coexist within a single contextual bandit method.<n>We show that our algorithm achieves regret guarantees that are minimax optimal up to logarithmic factors.
arXiv Detail & Related papers (2025-12-23T13:53:53Z)
Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations [57.179679246370114]
We identify the distribution of random perturbations that minimizes the estimator's variance as the perturbation stepsize tends to zero.<n>Our findings reveal that such desired perturbations can align directionally with the true gradient, instead of maintaining a fixed length.
arXiv Detail & Related papers (2025-10-22T19:06:39Z)
Pointwise confidence estimation in the non-linear $\ell^2$-regularized least squares [12.352761060862072]
We consider a high-probability non-asymptotic confidence estimation in the $ell2$-regularized non-linear least-squares setting with fixed design.<n>We show a pointwise confidence bound, meaning that it holds for the prediction on any given fixed test input $x$.
arXiv Detail & Related papers (2025-06-08T11:23:49Z)
C-Learner: Constrained Learning for Causal Inference [4.370964009390564]
We propose a novel debiasing approach that achieves the best weighting of both worlds, producing stable plug-in estimates.<n>Our constrained learning framework solves for the best plug-in estimator under the constraint that the first-order error with respect to the plugged-in quantity is zero.
arXiv Detail & Related papers (2024-05-15T16:38:28Z)
Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences. Our method is especially suitable for problems with well-specified likelihoods. We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
Near-Optimal Non-Parametric Sequential Tests and Confidence Sequences with Possibly Dependent Observations [44.71254888821376]
We provide the first type-I-error and expected-rejection-time guarantees under general non-data generating processes. We show how to apply our results to inference on parameters defined by estimating equations, such as average treatment effects.
arXiv Detail & Related papers (2022-12-29T18:37:08Z)
A lower confidence sequence for the changing mean of non-negative right heavy-tailed observations with bounded mean [9.289846887298854]
A confidence sequence produces an adapted sequence of sets for a predictable parameter sequence with a time-parametric coverage guarantee. This work constructs a non-asymptotic lower CS for the running average conditional expectation whose slack converges to zero.
arXiv Detail & Related papers (2022-10-20T09:50:05Z)
Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner. We propose a family of online debiasing estimators to correct these distributional anomalies in at least squares estimation. We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically. This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression. We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation. Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle. We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.