U-Statistics for Importance-Weighted Variational Inference
- URL: http://arxiv.org/abs/2302.13918v1
- Date: Mon, 27 Feb 2023 16:08:43 GMT
- Title: U-Statistics for Importance-Weighted Variational Inference
- Authors: Javier Burroni, Kenta Takatsu, Justin Domke, Daniel Sheldon
- Abstract summary: We propose the use of U-statistics to reduce variance for estimation in importance-weighted variational inference.
We find empirically that U-statistic variance reduction can lead to modest to significant improvements in inference performance on a range of models.
- Score: 29.750633016889655
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose the use of U-statistics to reduce variance for gradient estimation
in importance-weighted variational inference. The key observation is that,
given a base gradient estimator that requires $m > 1$ samples and a total of $n
> m$ samples to be used for estimation, lower variance is achieved by averaging
the base estimator on overlapping batches of size $m$ than disjoint batches, as
currently done. We use classical U-statistic theory to analyze the variance
reduction, and propose novel approximations with theoretical guarantees to
ensure computational efficiency. We find empirically that U-statistic variance
reduction can lead to modest to significant improvements in inference
performance on a range of models, with little computational cost.
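The abstract's key observation can be checked numerically. Below is a toy sketch (not the authors' implementation; the lognormal "weights" and the log-mean base estimator are illustrative assumptions) comparing the variance of a base estimator averaged over disjoint batches versus over all overlapping size-$m$ batches, i.e., the complete U-statistic:

```python
# Toy numerical check of the key observation: averaging a base estimator
# over all C(n, m) overlapping batches (a complete U-statistic) has lower
# variance than averaging it over n/m disjoint batches of the same samples.
import itertools
import math
import random
import statistics

def base_estimator(batch):
    # Hypothetical IWAE-style base estimator requiring m > 1 samples:
    # the log of the average importance weight in the batch.
    return math.log(sum(batch) / len(batch))

def disjoint_average(samples, m):
    # Current practice: split the n samples into n/m disjoint batches.
    batches = [samples[i:i + m] for i in range(0, len(samples), m)]
    return sum(base_estimator(b) for b in batches) / len(batches)

def u_statistic(samples, m):
    # U-statistic: average the base estimator over all overlapping batches.
    vals = [base_estimator(b) for b in itertools.combinations(samples, m)]
    return sum(vals) / len(vals)

random.seed(0)
n, m, reps = 6, 2, 5000
dis, ust = [], []
for _ in range(reps):
    weights = [random.lognormvariate(0.0, 1.0) for _ in range(n)]
    dis.append(disjoint_average(weights, m))
    ust.append(u_statistic(weights, m))

# Both estimators are unbiased for the same quantity; the U-statistic
# typically shows a noticeably smaller variance across repetitions.
print("disjoint batches:        ", statistics.variance(dis))
print("overlapping (U-statistic):", statistics.variance(ust))
```

Note that the complete U-statistic enumerates all $\binom{n}{m}$ batches, which is what motivates the paper's computationally efficient approximations for larger $n$.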
Related papers
- STATE: A Robust ATE Estimator of Heavy-Tailed Metrics for Variance Reduction in Online Controlled Experiments [22.32661807469984]
We develop a novel framework that integrates the Student's t-distribution with machine learning tools to fit heavy-tailed metrics.
By adopting a variational EM method to optimize the log-likelihood function, we can infer a robust solution that largely eliminates the negative impact of outliers.
Both simulations on synthetic data and long-term empirical results on Meituan experiment platform demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-07-23T09:35:59Z) - A Correlation-induced Finite Difference Estimator [6.054123928890574]
We first provide a sample-driven method via the bootstrap technique to estimate the optimal perturbation, and then propose an efficient FD estimator based on correlated samples at the estimated optimal perturbation.
Numerical results confirm the efficiency of our estimators and align well with the theory presented, especially in scenarios with small sample sizes.
arXiv Detail & Related papers (2024-05-09T09:27:18Z) - Policy Gradient with Active Importance Sampling [55.112959067035916]
Policy gradient (PG) methods significantly benefit from IS, enabling the effective reuse of previously collected samples.
However, IS is employed in RL as a passive tool for re-weighting historical samples.
We look for the best behavioral policy from which to collect samples to reduce the policy gradient variance.
arXiv Detail & Related papers (2024-05-09T09:08:09Z) - Nearest Neighbor Sampling for Covariate Shift Adaptation [7.940293148084844]
We propose a new covariate shift adaptation method which avoids estimating the weights.
The basic idea is to directly work on unlabeled target data, labeled according to the $k$-nearest neighbors in the source dataset.
Our experiments show that it achieves drastic reduction in the running time with remarkable accuracy.
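The basic idea of that summary can be sketched in a few lines (a hypothetical 1-D example, not the paper's code): each unlabeled target point receives the majority label of its $k$ nearest neighbors in the source dataset, with no importance weights estimated.

```python
# Hedged sketch of k-nearest-neighbor labeling for covariate shift:
# label target points directly from the source data, avoiding weight
# estimation. Source points are (feature, label) pairs; features are 1-D
# here purely for illustration.
def knn_label(x, source, k=3):
    # Find the k source points closest to x and return their majority label.
    neighbors = sorted(source, key=lambda pair: abs(pair[0] - x))[:k]
    labels = [y for _, y in neighbors]
    return max(set(labels), key=labels.count)

# Toy source dataset: two well-separated clusters with labels 0 and 1.
source = [(0.0, 0), (0.2, 0), (0.4, 0), (1.0, 1), (1.2, 1), (1.4, 1)]
target = [0.1, 1.3]  # unlabeled target points under covariate shift
preds = [knn_label(x, source) for x in target]
print(preds)  # [0, 1]
```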
arXiv Detail & Related papers (2023-12-15T17:28:09Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman minimization divergence.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Optimal Off-Policy Evaluation from Multiple Logging Policies [77.62012545592233]
We study off-policy evaluation from multiple logging policies, each generating a dataset of fixed size, i.e., stratified sampling.
We find the OPE estimator for multiple loggers with minimum variance for any instance, i.e., the efficient one.
arXiv Detail & Related papers (2020-10-21T13:43:48Z) - Optimal Variance Control of the Score Function Gradient Estimator for Importance Weighted Bounds [12.75471887147565]
This paper introduces novel results for the score function gradient estimator of the importance weighted variational bound (IWAE).
We prove that in the limit of large $K$ one can choose the control variate such that the Signal-to-Noise ratio (SNR) of the estimator grows as $\sqrt{K}$.
arXiv Detail & Related papers (2020-08-05T08:41:46Z) - SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
arXiv Detail & Related papers (2020-04-01T11:49:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.