The Hessian Estimation Evolution Strategy
- URL: http://arxiv.org/abs/2003.13256v2
- Date: Tue, 9 Jun 2020 07:30:53 GMT
- Title: The Hessian Estimation Evolution Strategy
- Authors: Tobias Glasmachers, Oswin Krause
- Abstract summary: We present a novel black box optimization algorithm called Hessian Estimation Evolution Strategy.
The algorithm updates the covariance matrix of its sampling distribution by directly estimating the curvature of the objective function.
- Score: 3.756550107432323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel black box optimization algorithm called Hessian Estimation
Evolution Strategy. The algorithm updates the covariance matrix of its sampling
distribution by directly estimating the curvature of the objective function.
This algorithm design is targeted at twice continuously differentiable
problems. For this, we extend the cumulative step-size adaptation algorithm of
the CMA-ES to mirrored sampling. We demonstrate that our approach to covariance
matrix adaptation is efficient by evaluating it on the BBOB/COCO testbed. We
also show that the algorithm is surprisingly robust when its core assumption of
a twice continuously differentiable objective function is violated. The
approach yields a new evolution strategy with competitive performance, and at
the same time it also offers an interesting alternative to the usual covariance
matrix update mechanism.
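The abstract describes the mechanism only in words. Below is a minimal, hypothetical Python sketch of that idea: curvature is estimated along mirrored sampling directions with a second-order finite difference and used to rescale a factor of the covariance matrix. The rank-one update, the damping exponent, and all names and constants here are illustrative assumptions, not the authors' exact algorithm (which couples this with mirrored offspring sampling and cumulative step-size adaptation).

```python
# Hypothetical sketch of Hessian-estimation-style covariance adaptation.
# Not the authors' exact update rule; constants and names are illustrative.
import numpy as np

def mirrored_curvature(f, mean, d, f_mean, sigma):
    """Second-order finite difference along direction d using a mirrored pair:
    f(m + sigma*d) + f(m - sigma*d) - 2 f(m) ≈ sigma^2 * d^T H d."""
    return (f(mean + sigma * d) + f(mean - sigma * d) - 2.0 * f_mean) / sigma**2

def adapt_covariance_factor(f, mean, A, sigma, rng, damping=0.25, eps=1e-12):
    """One illustrative adaptation step for the covariance factor A (C = A A^T).

    Directions of large estimated curvature are shrunk and flat directions are
    expanded, so C drifts toward the inverse Hessian (up to a global scale,
    which step-size adaptation would handle in a full evolution strategy).
    """
    n = mean.shape[0]
    f_mean = f(mean)
    z = rng.standard_normal(n)
    z /= np.linalg.norm(z)          # unit direction in whitened coordinates
    d = A @ z                       # corresponding direction in parameter space
    h = max(mirrored_curvature(f, mean, d, f_mean, sigma), eps)
    s = h ** (-damping)             # damped rescaling factor
    # Rank-one multiplicative update: scales A along z by s and leaves the
    # orthogonal complement untouched.
    return A @ (np.eye(n) + (s - 1.0) * np.outer(z, z))

# Toy check on a convex quadratic: C = A A^T should approach the inverse Hessian.
rng = np.random.default_rng(0)
H = np.diag([100.0, 1.0])
f = lambda x: 0.5 * x @ H @ x
mean, A, sigma = np.zeros(2), np.eye(2), 0.1
for _ in range(300):
    A = adapt_covariance_factor(f, mean, A, sigma, rng)
print(np.round(A @ A.T, 3))         # roughly diag(0.01, 1.0), up to sampling noise
```

On a quadratic with Hessian H, the fixed point of this toy update is A^T H A = I, i.e. the sampling covariance equals H^{-1}, which is the behaviour the abstract attributes to the full algorithm.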
Related papers
- A Mirror Descent-Based Algorithm for Corruption-Tolerant Distributed Gradient Descent [57.64826450787237]
We show how to analyze the behavior of distributed gradient descent algorithms in the presence of adversarial corruptions.
We show how to use ideas from (lazy) mirror descent to design a corruption-tolerant distributed optimization algorithm.
Experiments based on linear regression, support vector classification, and softmax classification on the MNIST dataset corroborate our theoretical findings.
arXiv Detail & Related papers (2024-07-19T08:29:12Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables.
We present a novel approach that addresses the limited posterior expressiveness of the fully-factorized Gaussian assumption.
We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z) - Distributed Evolution Strategies for Black-box Stochastic Optimization [42.90600124972943]
This work concerns the evolutionary approaches to distributed black-box optimization.
Each worker can individually solve an approximation of the problem.
We propose two alternative simulation schemes which significantly improve the robustness of the approach.
arXiv Detail & Related papers (2022-04-09T11:18:41Z) - Accelerating Stochastic Probabilistic Inference [1.599072005190786]
Stochastic Variational Inference (SVI) has become increasingly attractive thanks to its ability to find good posterior approximations of probabilistic models.
Almost all the state-of-the-art SVI algorithms are based on first-order optimization and often suffer from poor convergence rates.
We bridge the gap between second-order methods and variational inference by proposing a second-order based variational inference approach.
arXiv Detail & Related papers (2022-03-15T01:19:12Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study stochastic optimization of areas under precision-recall curves (AUPRC), which are widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - Covariance Matrix Adaptation Evolution Strategy Assisted by Principal Component Analysis [4.658166900129066]
We use principal component analysis (PCA), a dimensionality reduction method, to reduce the dimension during the iterations of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES); a hypothetical sketch of this idea is given after this list.
arXiv Detail & Related papers (2021-05-08T12:43:38Z) - Variance Reduction on Adaptive Stochastic Mirror Descent [23.451652399157002]
We prove that variance reduction reduces SFO complexity of most adaptive mirror descent algorithms and accelerates their convergence.
We check the validity of our claims using experiments in deep learning.
arXiv Detail & Related papers (2020-12-26T15:15:51Z) - Convergence Analysis of the Hessian Estimation Evolution Strategy [3.756550107432323]
Hessian Estimation Evolution Strategies (HE-ESs) update the covariance matrix of their sampling distribution by directly estimating the curvature of the objective function.
We prove two strong guarantees for the (1+4)-HE-ES, a minimal elitist member of the family.
arXiv Detail & Related papers (2020-09-06T13:34:25Z) - Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z) - Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization [71.03797261151605]
Adaptivity is an important yet under-studied property in modern optimization theory.
Our algorithm is proved to achieve the best-available convergence rate for non-PL objectives while simultaneously outperforming existing algorithms for PL objectives.
arXiv Detail & Related papers (2020-02-13T05:42:27Z)
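The PCA-assisted CMA-ES entry above states only that PCA reduces the dimension during the iterations. As a rough interpretation (not the paper's algorithm), the hypothetical sketch below projects the search into the subspace of the top-k principal components of recent sample points, runs one simplified ES generation there (mean recombination only, fixed step size, no covariance adaptation), and lifts the result back; all names and parameters are illustrative.

```python
# Hypothetical sketch: one simplified evolution-strategy generation performed in
# the subspace spanned by the top-k principal components of recent search points.
# An interpretation of the summary above, not the paper's algorithm.
import numpy as np

def pca_basis(samples, k):
    """Top-k principal directions of the rows of `samples`, as a (k, n) matrix."""
    centered = samples - samples.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

def reduced_es_generation(f, mean, sigma, samples, k, rng, lam=20, mu=5):
    """Sample lam candidates in the k-dimensional PCA subspace, evaluate them in
    the full space, and recombine the mu best into a new mean."""
    B = pca_basis(samples, k)                     # (k, n) projection matrix
    y = B @ mean                                  # mean in reduced coordinates
    cand = y + sigma * rng.standard_normal((lam, k))
    full = mean + (cand - y) @ B                  # lift candidates back to R^n
    order = np.argsort([f(x) for x in full])
    new_mean = full[order[:mu]].mean(axis=0)      # recombine the mu best
    return new_mean, full                         # `full` can seed the next PCA fit
```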