Statistical Inference for Online Decision Making via Stochastic Gradient
Descent
- URL: http://arxiv.org/abs/2010.07341v1
- Date: Wed, 14 Oct 2020 18:25:18 GMT
- Title: Statistical Inference for Online Decision Making via Stochastic Gradient
Descent
- Authors: Haoyu Chen, Wenbin Lu, Rui Song
- Abstract summary: We propose a completely online algorithm that makes decisions and updates the decision rule via stochastic gradient descent.
It is not only efficient but also supports all kinds of parametric reward models.
The proposed algorithm and theoretical results are tested by simulations and a real data application to news article recommendation.
- Score: 31.103438051597887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online decision making aims to learn the optimal decision rule by making
personalized decisions and updating the decision rule recursively. It has
become easier than before with the help of big data, but new challenges also
come along. Since the decision rule should be updated once per step, an offline
update which uses all the historical data is inefficient in computation and
storage. To this end, we propose a completely online algorithm that can make
decisions and update the decision rule online via stochastic gradient descent.
It is not only efficient but also supports all kinds of parametric reward
models. Focusing on the statistical inference of online decision making, we
establish the asymptotic normality of the parameter estimator produced by our
algorithm and the online inverse probability weighted value estimator we used
to estimate the optimal value. Online plugin estimators for the variance of the
parameter and value estimators are also provided and shown to be consistent, so
that interval estimation and hypothesis testing are possible using our method. The
proposed algorithm and theoretical results are tested by simulations and a real
data application to news article recommendation.
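To make the pipeline in the abstract concrete, here is a minimal, hypothetical sketch of a fully online loop: an exploration-aware decision at each step, one stochastic gradient step on the reward-model parameters, and a running inverse probability weighted (IPW) value estimate. The two-action linear reward model, the epsilon-greedy exploration rate, the step-size schedule, and the crude running-variance proxy at the end are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, eps = 3, 5000, 0.1          # context dimension, horizon, exploration rate (assumed)

# One coefficient block per action: an assumed two-action linear reward model.
theta_hat = np.zeros(2 * d)
theta_true = np.array([0.5, -0.2, 0.3, -0.1, 0.4, 0.2])   # used only to simulate rewards
ipw_sum = ipw_sq = 0.0

def features(x, a):
    """Stack the context into the coefficient block of action a in {0, 1}."""
    phi = np.zeros(2 * d)
    phi[a * d:(a + 1) * d] = x
    return phi

for t in range(1, T + 1):
    x = rng.normal(size=d)

    # Decision: greedy under the current estimate, with epsilon-greedy exploration.
    greedy = int(np.argmax([features(x, a) @ theta_hat for a in (0, 1)]))
    a = int(rng.integers(2)) if rng.random() < eps else greedy
    prop = eps / 2 + (1 - eps) * (a == greedy)   # probability of the chosen action

    # Observe a reward (simulated here from the assumed linear model).
    r = features(x, a) @ theta_true + rng.normal(scale=0.5)

    # One SGD step on the squared-error loss with a Robbins-Monro step size.
    phi = features(x, a)
    theta_hat += 0.5 * t ** (-0.67) * (r - phi @ theta_hat) * phi

    # Online IPW term for the value of the currently estimated optimal rule.
    w = float(a == greedy) / prop * r
    ipw_sum += w
    ipw_sq += w ** 2

value_hat = ipw_sum / T
# Crude running-variance proxy for an interval estimate; the paper's online plugin
# variance estimator is more careful about the adaptively collected data.
var_hat = (ipw_sq / T - value_hat ** 2) / T
print(f"IPW value estimate: {value_hat:.3f} +/- {1.96 * np.sqrt(var_hat):.3f}")
```

The paper specifies the decision rule, step sizes, and plugin variance estimators so that the parameter and value estimators are asymptotically normal; this sketch only mirrors the computational flow.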
Related papers
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it; a generic baseline-corrected estimator is sketched at the end of this list.
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - Online Estimation via Offline Estimation: An Information-Theoretic Framework [75.80823630681323]
Motivated by connections between estimation and interactive decision making, we ask: is it possible to convert offline estimation algorithms into online estimation algorithms?
We introduce a new framework, Oracle-Efficient Online Estimation (OEOE), where the learner can only interact with the data stream indirectly through a sequence of offline estimators produced by a black-box algorithm operating on the stream.
arXiv Detail & Related papers (2024-04-15T20:19:18Z) - Online Tensor Inference [0.0]
Traditional offline learning, involving the storage and utilization of all data in each computational iteration, becomes impractical for high-dimensional tensor data.
Existing low-rank tensor methods lack the capability for statistical inference in an online fashion.
Our approach employs Stochastic Gradient Descent (SGD) to enable efficient real-time data processing without extensive memory requirements.
arXiv Detail & Related papers (2023-12-28T16:37:48Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Low-rank extended Kalman filtering for online learning of neural
networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
arXiv Detail & Related papers (2023-05-31T03:48:49Z) - Online Statistical Inference for Contextual Bandits via Stochastic
Gradient Descent [10.108468796986074]
We study the online statistical inference of model parameters in a contextual bandit framework of decision-making.
We propose a general framework for online and adaptive data collection environment that can update decision rules via weighted gradient descent.
arXiv Detail & Related papers (2022-12-30T18:57:08Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z) - Fast and Robust Online Inference with Stochastic Gradient Descent via
Random Scaling [0.9806910643086042]
We develop a new method of online inference for a vector of parameters estimated by the Polyak-Ruppert averaging procedure of stochastic gradient descent algorithms.
Our approach is fully operational with online data and is rigorously underpinned by a functional central limit theorem.
arXiv Detail & Related papers (2021-06-06T15:38:37Z) - Online Covariance Matrix Estimation in Stochastic Gradient Descent [10.153224593032677]
Stochastic gradient descent (SGD) is widely used for parameter estimation, especially for huge data sets and online learning.
This paper aims to quantify the statistical uncertainty of SGD-based estimates in an online setting; a generic plugin illustration is sketched after this list.
arXiv Detail & Related papers (2020-02-10T17:46:10Z)
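The first related entry above studies baseline corrections for off-policy value estimation in contextual bandits. The sketch below shows the generic control-variate version of that idea on simulated logged bandit feedback: subtracting a baseline b from the rewards and adding it back keeps the inverse propensity scoring (IPS) estimator unbiased for any constant b, and b can be chosen to reduce variance. Contexts are dropped for brevity, and the variance-minimizing coefficient shown is the textbook control-variate choice, which need not coincide with that paper's closed-form optimum.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 10000, 3                          # logged rounds, number of actions

pi_log = np.array([0.5, 0.3, 0.2])       # logging policy (context-free, an assumption)
pi_tgt = np.array([0.2, 0.3, 0.5])       # target policy to evaluate
mean_reward = np.array([0.2, 0.5, 0.8])  # unknown in practice, used to simulate logs

actions = rng.choice(k, size=n, p=pi_log)
rewards = rng.binomial(1, mean_reward[actions]).astype(float)
w = pi_tgt[actions] / pi_log[actions]    # importance weights

# Plain IPS estimate of the target policy's value.
ips = np.mean(w * rewards)

# Baseline-corrected IPS: mean(w * (r - b) + b) is unbiased for any constant b.
# Variance-minimizing control-variate coefficient b = Cov(w*r, w) / Var(w);
# estimating it from the same data adds a small, typically negligible bias.
b = (np.mean(w * rewards * w) - np.mean(w * rewards) * np.mean(w)) \
    / (np.mean(w * w) - np.mean(w) ** 2)
ips_cv = np.mean(w * (rewards - b) + b)

true_value = pi_tgt @ mean_reward
print(f"true value {true_value:.3f}  IPS {ips:.3f}  baseline-corrected IPS {ips_cv:.3f}")
```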