Constructing unbiased gradient estimators with finite variance for
conditional stochastic optimization
- URL: http://arxiv.org/abs/2206.01991v1
- Date: Sat, 4 Jun 2022 13:28:44 GMT
- Title: Constructing unbiased gradient estimators with finite variance for
conditional stochastic optimization
- Authors: Takashi Goda, Wataru Kitade
- Abstract summary: We study stochastic gradient descent for solving conditional stochastic optimization problems.
The gradient of such a parametric nested expectation is again expressed as a nested expectation.
We show under some conditions that a multilevel Monte Carlo gradient estimator is unbiased.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study stochastic gradient descent for solving conditional stochastic
optimization problems, in which an objective to be minimized is given by a
parametric nested expectation with an outer expectation taken with respect to
one random variable and an inner conditional expectation with respect to the
other random variable. The gradient of such a parametric nested expectation is
again expressed as a nested expectation, which makes it hard for the standard
nested Monte Carlo estimator to be unbiased. In this paper, we show under some
conditions that a multilevel Monte Carlo gradient estimator is unbiased and has
finite variance and finite expected computational cost, so that the standard
theory from stochastic optimization for a parametric (non-nested) expectation
directly applies. We also discuss a special case for which yet another unbiased
gradient estimator with finite variance and cost can be constructed.
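In one common formulation of conditional stochastic optimization, the objective is $F(\theta)=\mathbb{E}_{\xi}[f(\mathbb{E}_{\eta|\xi}[g(\theta,\xi,\eta)])]$ for a smooth outer function $f$, so that (under standard regularity) $\nabla F(\theta)=\mathbb{E}_{\xi}[f'(\mathbb{E}_{\eta|\xi}[g])\,\mathbb{E}_{\eta|\xi}[\nabla_{\theta}g]]$ is again a nested expectation, and the plug-in (nested Monte Carlo) estimator is biased because the inner sample means enter nonlinearly. The following is a minimal Python sketch of the generic randomized ("single-term") multilevel Monte Carlo debiasing idea the abstract alludes to: draw a random level, form an antithetic fine-minus-coarse correction of the plug-in estimator, and reweight by the level probability. The toy integrands, the level distribution, and its exponent are illustrative assumptions, not the paper's construction or conditions.

import numpy as np

rng = np.random.default_rng(0)

# --- toy problem (illustrative assumption, not the paper's model) ----------------
def sample_xi():
    # outer random variable xi
    return rng.normal()

def sample_eta(xi, n):
    # n inner conditional samples eta | xi
    return xi + rng.normal(size=n)

def g(theta, xi, eta):
    # inner integrand and its derivative w.r.t. the (scalar) parameter theta
    return (theta - eta) ** 2, 2.0 * (theta - eta)

def f_prime(y):
    # derivative of the smooth outer function f(y) = y^2 / 2
    return y

# --- level-l plug-in estimator and its antithetic fine-minus-coarse correction ---
def level_term(theta, xi, level, n0=2):
    n = n0 * 2 ** level
    val, grad = g(theta, xi, sample_eta(xi, n))

    def plug_in(v, d):
        # biased nested-MC estimate of f'(E[g]) * E[dg/dtheta] from one batch
        return f_prime(v.mean()) * d.mean()

    fine = plug_in(val, grad)
    if level == 0:
        return fine
    # antithetic coarse estimator: average of the two half-batch plug-ins
    half = n // 2
    coarse = 0.5 * (plug_in(val[:half], grad[:half]) + plug_in(val[half:], grad[half:]))
    return fine - coarse

# --- unbiased single-term randomized MLMC gradient estimator ---------------------
def mlmc_gradient(theta, max_level=20, decay=1.5):
    # level probabilities p_l ~ 2^(-decay*l); decay is assumed to lie between the
    # cost growth rate and the correction-variance decay rate, so that both the
    # variance and the expected cost per draw stay finite
    p = 2.0 ** (-decay * np.arange(max_level + 1))
    p /= p.sum()
    level = rng.choice(max_level + 1, p=p)
    return level_term(theta, sample_xi(), level) / p[level]

# --- plain SGD driven by the unbiased gradient estimates --------------------------
theta, step = 3.0, 0.05
for _ in range(2000):
    theta -= step * mlmc_gradient(theta)
print("theta after SGD:", theta)   # the toy objective is minimized at theta = 0

When the tail of the level distribution decays between the correction-variance and cost growth rates, each draw is unbiased with finite variance and finite expected cost, which is exactly the property that lets the standard stochastic optimization theory for non-nested expectations apply, as the abstract states.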
Related papers
- Learning Parametric Distributions from Samples and Preferences [19.879505582147807]
We show that preference-based M-estimators achieve lower variance than sample-only M-estimators. We propose an estimator achieving an estimation error scaling of $\mathcal{O}(1/n)$, a significant improvement over the $\Theta(1/\sqrt{n})$ rate attainable with samples alone.
arXiv Detail & Related papers (2025-05-29T15:33:43Z) - Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - Leveraging Nested MLMC for Sequential Neural Posterior Estimation with
Intractable Likelihoods [0.8287206589886881]
SNPE techniques are proposed for dealing with simulation-based models with intractable likelihoods.
In this paper, we propose a nested APT method to estimate the involved nested expectation.
Since the nested estimators for the loss function and its gradient are biased, we make use of unbiased multi-level Monte Carlo (MLMC) estimators.
arXiv Detail & Related papers (2024-01-30T06:29:41Z) - Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients [0.8749675983608172]
We present an unbiased method for posterior means based on kinetic Langevin dynamics.
Our proposed estimator is unbiased, attains finite variance, and satisfies a central limit theorem.
Our results demonstrate that in large-scale applications, the unbiased algorithm we present can be 2-3 orders of magnitude more efficient than the "gold-standard" randomized Hamiltonian Monte Carlo.
arXiv Detail & Related papers (2023-11-08T21:19:52Z) - Scalable method for Bayesian experimental design without integrating
over posterior distribution [0.0]
We address the computational efficiency of solving A-optimal Bayesian experimental design problems.
A-optimality is a widely used and easy-to-interpret criterion for Bayesian experimental design.
This study presents a novel likelihood-free approach to A-optimal experimental design.
arXiv Detail & Related papers (2023-06-30T12:40:43Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent algorithm and provide an improved analysis under a more nuanced condition on the gradient noise; a generic sketch of gradient clipping is given after this list.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - High-probability Bounds for Non-Convex Stochastic Optimization with
Heavy Tails [55.561406656549686]
We consider non-convex stochastic optimization using first-order algorithms for which the gradient estimates may have heavy tails.
We show that a combination of gradient clipping, momentum, and normalized gradient descent converges to critical points in high probability with the best-known rates for smooth losses.
arXiv Detail & Related papers (2021-06-28T00:17:01Z) - Nearest Neighbour Based Estimates of Gradients: Sharp Nonasymptotic
Bounds and Applications [0.6445605125467573]
Gradient estimation is of crucial importance in statistics and learning theory.
We consider here the classic regression setup, where a real-valued, square-integrable random variable $Y$ is to be predicted.
We prove nonasymptotic bounds improving upon those obtained for alternative estimation methods.
arXiv Detail & Related papers (2020-06-26T15:19:43Z) - Unbiased MLMC stochastic gradient-based optimization of Bayesian
experimental designs [4.112293524466434]
The gradient of the expected information gain with respect to experimental design parameters is given by a nested expectation.
We introduce an unbiased Monte Carlo estimator for the gradient of the expected information gain with finite expected squared $\ell_2$-norm and finite expected computational cost per sample.
arXiv Detail & Related papers (2020-05-18T01:02:31Z) - SUMO: Unbiased Estimation of Log Marginal Probability for Latent
Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
arXiv Detail & Related papers (2020-04-01T11:49:30Z) - Estimating Gradients for Discrete Random Variables by Sampling without
Replacement [93.09326095997336]
We derive an unbiased estimator for expectations over discrete random variables based on sampling without replacement.
We show that our estimator can be derived as the Rao-Blackwellization of three different estimators.
arXiv Detail & Related papers (2020-02-14T14:15:18Z) - Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO) for this purpose.
arXiv Detail & Related papers (2020-01-19T12:00:33Z)
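For the heavy-tailed streaming entry above, "clipped gradient descent" rescales each stochastic gradient to a bounded norm before the usual SGD update, which tames heavy-tailed gradient noise. The sketch below is a generic Python illustration on assumed toy data, not the cited paper's algorithm, step-size schedule, or analysis.

import numpy as np

def clipped_sgd_step(theta, grad, step_size, clip_threshold):
    # rescale the stochastic gradient to norm <= clip_threshold, then take an SGD step
    norm = np.linalg.norm(grad)
    if norm > clip_threshold:
        grad = grad * (clip_threshold / norm)
    return theta - step_size * grad

# toy usage: streaming mean estimation from heavy-tailed (Student-t) samples
rng = np.random.default_rng(1)
theta = np.zeros(3)
for t in range(1, 5001):
    x = rng.standard_t(df=2.1, size=3)    # heavy-tailed sample from the stream
    grad = theta - x                       # gradient of 0.5 * ||theta - x||^2
    theta = clipped_sgd_step(theta, grad, step_size=1.0 / t, clip_threshold=5.0)
print("streaming mean estimate:", theta)   # true mean of the toy stream is 0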