Generalizing Gaussian Smoothing for Random Search
- URL: http://arxiv.org/abs/2211.14721v1
- Date: Sun, 27 Nov 2022 04:42:05 GMT
- Title: Generalizing Gaussian Smoothing for Random Search
- Authors: Katelyn Gao and Ozan Sener
- Abstract summary: Gaussian smoothing (GS) is a derivative-free optimization algorithm that estimates the gradient of an objective using perturbations of the current benchmarks.
We propose to choose a distribution for perturbations that minimizes the error of such distributions with provably smaller MSE.
- Score: 23.381986209234164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaussian smoothing (GS) is a derivative-free optimization (DFO) algorithm
that estimates the gradient of an objective using perturbations of the current
parameters sampled from a standard normal distribution. We generalize it to
sampling perturbations from a larger family of distributions. Based on an
analysis of DFO for non-convex functions, we propose to choose a distribution
for perturbations that minimizes the mean squared error (MSE) of the gradient
estimate. We derive three such distributions with provably smaller MSE than
Gaussian smoothing. We conduct evaluations of the three sampling distributions
on linear regression, reinforcement learning, and DFO benchmarks in order to
validate our claims. Our proposal improves on GS with the same computational
complexity, and are usually competitive with and often outperform Guided ES and
Orthogonal ES, two computationally more expensive algorithms that adapt the
covariance matrix of normally distributed perturbations.
Related papers
- Effect of Random Learning Rate: Theoretical Analysis of SGD Dynamics in Non-Convex Optimization via Stationary Distribution [6.144680854063938]
We consider a variant of the gradient descent (SGD) with a random learning rate to reveal its convergence properties.
We demonstrate that a distribution of a parameter updated by Poisson SGD converges to a stationary distribution under weak assumptions.
arXiv Detail & Related papers (2024-06-23T06:52:33Z) - On the Computation of the Gaussian Rate-Distortion-Perception Function [10.564071872770146]
We study the computation of the rate-distortion-perception function (RDPF) for a multivariate Gaussian source under mean squared error (MSE) distortion.
We provide the associated algorithmic realization, as well as the convergence and the rate of convergence characterization.
We corroborate our results with numerical simulations and draw connections to existing results.
arXiv Detail & Related papers (2023-11-15T18:34:03Z) - Noise-Free Sampling Algorithms via Regularized Wasserstein Proximals [3.4240632942024685]
We consider the problem of sampling from a distribution governed by a potential function.
This work proposes an explicit score based MCMC method that is deterministic, resulting in a deterministic evolution for particles.
arXiv Detail & Related papers (2023-08-28T23:51:33Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimates the intractable marginal likelihood of deep generative models.
We present a parameteric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer number of steps for sampling.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by gradient descent (SGD)
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - Gaussianization Flows [113.79542218282282]
We propose a new type of normalizing flow model that enables both efficient iteration of likelihoods and efficient inversion for sample generation.
Because of this guaranteed expressivity, they can capture multimodal target distributions without compromising the efficiency of sample generation.
arXiv Detail & Related papers (2020-03-04T08:15:06Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.