Statistical Inference for Online Decision Making via Stochastic Gradient
Descent
- URL: http://arxiv.org/abs/2010.07341v1
- Date: Wed, 14 Oct 2020 18:25:18 GMT
- Title: Statistical Inference for Online Decision Making via Stochastic Gradient
Descent
- Authors: Haoyu Chen, Wenbin Lu, Rui Song
- Abstract summary: We propose a completely online algorithm that makes decisions and updates the decision rule via stochastic gradient descent.
It is not only efficient but also supports all kinds of parametric reward models.
The proposed algorithm and theoretical results are tested by simulations and a real data application to news article recommendation.
- Score: 31.103438051597887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online decision making aims to learn the optimal decision rule by making
personalized decisions and updating the decision rule recursively. It has
become easier than before with the help of big data, but new challenges also
come along. Since the decision rule should be updated once per step, an offline
update which uses all the historical data is inefficient in computation and
storage. To this end, we propose a completely online algorithm that can make
decisions and update the decision rule online via stochastic gradient descent.
It is not only efficient but also supports all kinds of parametric reward
models. Focusing on the statistical inference of online decision making, we
establish the asymptotic normality of the parameter estimator produced by our
algorithm and the online inverse probability weighted value estimator we used
to estimate the optimal value. Online plugin estimators for the variance of the
parameter and value estimators are also provided and shown to be consistent, so
that interval estimation and hypothesis testing are possible using our method. The
proposed algorithm and theoretical results are tested by simulations and a real
data application to news article recommendation.
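To make the pipeline in the abstract concrete, here is a minimal, hypothetical sketch of a fully online loop: an exploration-aware decision at each step, one stochastic gradient step on the reward-model parameters, and a running inverse probability weighted (IPW) value estimate. The two-action linear reward model, the epsilon-greedy exploration rate, the step-size schedule, and the crude running-variance proxy at the end are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, eps = 3, 5000, 0.1          # context dimension, horizon, exploration rate (assumed)

# One coefficient block per action: an assumed two-action linear reward model.
theta_hat = np.zeros(2 * d)
theta_true = np.array([0.5, -0.2, 0.3, -0.1, 0.4, 0.2])   # used only to simulate rewards
ipw_sum = ipw_sq = 0.0

def features(x, a):
    """Stack the context into the coefficient block of action a in {0, 1}."""
    phi = np.zeros(2 * d)
    phi[a * d:(a + 1) * d] = x
    return phi

for t in range(1, T + 1):
    x = rng.normal(size=d)

    # Decision: greedy under the current estimate, with epsilon-greedy exploration.
    greedy = int(np.argmax([features(x, a) @ theta_hat for a in (0, 1)]))
    a = int(rng.integers(2)) if rng.random() < eps else greedy
    prop = eps / 2 + (1 - eps) * (a == greedy)   # probability of the chosen action

    # Observe a reward (simulated here from the assumed linear model).
    r = features(x, a) @ theta_true + rng.normal(scale=0.5)

    # One SGD step on the squared-error loss with a Robbins-Monro step size.
    phi = features(x, a)
    theta_hat += 0.5 * t ** (-0.67) * (r - phi @ theta_hat) * phi

    # Online IPW term for the value of the currently estimated optimal rule.
    w = float(a == greedy) / prop * r
    ipw_sum += w
    ipw_sq += w ** 2

value_hat = ipw_sum / T
# Crude running-variance proxy for an interval estimate; the paper's online plugin
# variance estimator is more careful about the adaptively collected data.
var_hat = (ipw_sq / T - value_hat ** 2) / T
print(f"IPW value estimate: {value_hat:.3f} +/- {1.96 * np.sqrt(var_hat):.3f}")
```

The paper specifies the decision rule, step sizes, and plugin variance estimators so that the parameter and value estimators are asymptotically normal; this sketch only mirrors the computational flow.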
Related papers
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it; a generic baseline-corrected estimator is sketched at the end of this list.
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - Online Estimation via Offline Estimation: An Information-Theoretic Framework [75.80823630681323]
Motivated by connections between estimation and interactive decision making, we ask: is it possible to convert offline estimation algorithms into online estimation algorithms?
We introduce a new framework, Oracle-Efficient Online Estimation (OEOE), where the learner can only interact with the data stream indirectly through a sequence of offline estimators produced by a black-box algorithm operating on the stream.
arXiv Detail & Related papers (2024-04-15T20:19:18Z) - Online Tensor Inference [0.0]
Traditional offline learning, involving the storage and utilization of all data in each computational iteration, becomes impractical for high-dimensional tensor data.
Existing low-rank tensor methods lack the capability for statistical inference in an online fashion.
Our approach employs Stochastic Gradient Descent (SGD) to enable efficient real-time data processing without extensive memory requirements.
arXiv Detail & Related papers (2023-12-28T16:37:48Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Low-rank extended Kalman filtering for online learning of neural
networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
arXiv Detail & Related papers (2023-05-31T03:48:49Z) - Online Statistical Inference for Contextual Bandits via Stochastic
Gradient Descent [10.108468796986074]
We study the online statistical inference of model parameters in a contextual bandit framework of decision-making.
We propose a general framework for online and adaptive data collection environment that can update decision rules via weighted gradient descent.
arXiv Detail & Related papers (2022-12-30T18:57:08Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z) - Fast and Robust Online Inference with Stochastic Gradient Descent via
Random Scaling [0.9806910643086042]
We develop a new method of online inference for a vector of parameters estimated by the Polyak-Ruppert averaging procedure of stochastic gradient descent algorithms.
Our approach is fully operational with online data and is rigorously underpinned by a functional central limit theorem.
arXiv Detail & Related papers (2021-06-06T15:38:37Z) - Online Covariance Matrix Estimation in Stochastic Gradient Descent [10.153224593032677]
Stochastic gradient descent (SGD) is widely used for parameter estimation, especially for huge data sets and online learning.
This paper aims to quantify the statistical uncertainty of SGD-based estimates in an online setting; a generic plugin illustration is sketched after this list.
arXiv Detail & Related papers (2020-02-10T17:46:10Z)
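The first related entry above studies baseline corrections for off-policy value estimation in contextual bandits. The sketch below shows the generic control-variate version of that idea on simulated logged bandit feedback: subtracting a baseline b from the rewards and adding it back keeps the inverse propensity scoring (IPS) estimator unbiased for any constant b, and b can be chosen to reduce variance. Contexts are dropped for brevity, and the variance-minimizing coefficient shown is the textbook control-variate choice, which need not coincide with that paper's closed-form optimum.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 10000, 3                          # logged rounds, number of actions

pi_log = np.array([0.5, 0.3, 0.2])       # logging policy (context-free, an assumption)
pi_tgt = np.array([0.2, 0.3, 0.5])       # target policy to evaluate
mean_reward = np.array([0.2, 0.5, 0.8])  # unknown in practice, used to simulate logs

actions = rng.choice(k, size=n, p=pi_log)
rewards = rng.binomial(1, mean_reward[actions]).astype(float)
w = pi_tgt[actions] / pi_log[actions]    # importance weights

# Plain IPS estimate of the target policy's value.
ips = np.mean(w * rewards)

# Baseline-corrected IPS: mean(w * (r - b) + b) is unbiased for any constant b.
# Variance-minimizing control-variate coefficient b = Cov(w*r, w) / Var(w);
# estimating it from the same data adds a small, typically negligible bias.
b = (np.mean(w * rewards * w) - np.mean(w * rewards) * np.mean(w)) \
    / (np.mean(w * w) - np.mean(w) ** 2)
ips_cv = np.mean(w * (rewards - b) + b)

true_value = pi_tgt @ mean_reward
print(f"true value {true_value:.3f}  IPS {ips:.3f}  baseline-corrected IPS {ips_cv:.3f}")
```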