Asymptotic convergence rates for averaging strategies
- URL: http://arxiv.org/abs/2108.04707v1
- Date: Tue, 10 Aug 2021 14:09:46 GMT
- Title: Asymptotic convergence rates for averaging strategies
- Authors: Laurent Meunier, Iskander Legheraba, Yann Chevaleyre, Olivier Teytaud
- Abstract summary: Parallel black box optimization consists in estimating the optimum of a function using $\lambda$ parallel evaluations of $f$.
Averaging the $\mu$ best individuals among the $\lambda$ evaluations is known to provide better estimates of the optimum of a function than just picking up the best.
- Score: 10.639022684335293
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parallel black box optimization consists in estimating the optimum of a
function using $\lambda$ parallel evaluations of $f$. Averaging the $\mu$ best
individuals among the $\lambda$ evaluations is known to provide better
estimates of the optimum of a function than just picking up the best. In
continuous domains, this averaging is typically just based on (possibly
weighted) arithmetic means. Previous theoretical results were based on
quadratic objective functions. In this paper, we extend the results to a wide
class of functions, containing three times continuously differentiable
functions with a unique optimum. We prove formal rates of convergence and show
they are indeed better than pure random search asymptotically in $\lambda$. We
validate our theoretical findings with experiments on some standard black box
functions.
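As a concrete illustration of the averaging strategy in the abstract, here is a minimal Python sketch (function names and parameters are illustrative, not the paper's code): sample $\lambda$ points at random, keep the $\mu$ points with the lowest objective values, and return their arithmetic mean.

```python
import random

def f(x):
    # Hypothetical quadratic test objective with optimum at the origin.
    return sum(xi * xi for xi in x)

def mu_best_average(f, dim, lam, mu, radius=1.0, seed=0):
    """Sample lam points uniformly in [-radius, radius]^dim and
    average the mu points with the lowest objective values."""
    rng = random.Random(seed)
    points = [[rng.uniform(-radius, radius) for _ in range(dim)]
              for _ in range(lam)]
    best = sorted(points, key=f)[:mu]
    return [sum(p[i] for p in best) / mu for i in range(dim)]

# The averaged estimate tends to sit closer to the optimum than the
# single best sampled point, which is the effect the paper quantifies.
estimate = mu_best_average(f, dim=3, lam=2000, mu=50)
```

Weighted variants simply replace the uniform mean with a weighted one.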
Related papers
- Faster Convergence with Multiway Preferences [99.68922143784306]
We consider the sign-function-based comparison feedback model and analyze the convergence rates with batched and multiway comparisons.
Our work is the first to study the problem of convex optimization with multiway preferences and analyze the optimal convergence rates.
arXiv Detail & Related papers (2023-12-19T01:52:13Z) - Zero-Order One-Point Estimate with Distributed Stochastic Gradient-Tracking Technique [23.63073074337495]
We consider a distributed multi-agent optimization problem where each agent holds a local objective function that is smooth and convex.
We extend the distributed gradient-tracking method to the bandit setting, where we do not have an estimate of the gradient.
We analyze the convergence of this novel technique for smooth and convex objectives using approximation tools.
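The one-point zero-order estimate mentioned above can be sketched as follows; this is a generic single-evaluation gradient estimator under illustrative parameters, not the paper's exact scheme.

```python
import random

def one_point_gradient(f, x, delta, rng):
    """One-point zero-order estimate g = (d / delta) * f(x + delta * u) * u,
    with u uniform on the unit sphere: a single evaluation of f per estimate,
    unbiased for a smoothed version of f but high-variance."""
    d = len(x)
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = sum(ui * ui for ui in u) ** 0.5
    u = [ui / norm for ui in u]
    fx = f([xi + delta * ui for xi, ui in zip(x, u)])
    return [(d / delta) * fx * ui for ui in u]

def averaged_gradient(f, x, n=20000, delta=0.1, seed=0):
    # Averaging many one-point estimates recovers the gradient direction.
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n):
        g = one_point_gradient(f, x, delta, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n for t in total]
```

The high variance of the single-sample estimate is why such analyses need smoothing and averaging arguments.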
arXiv Detail & Related papers (2022-10-11T17:04:45Z) - Understanding the Effect of Stochasticity in Policy Optimization [86.7574122154668]
We show that the preferability of optimization methods depends critically on whether exact gradients are used.
Second, to explain these findings we introduce the concept of committal rate for policy optimization.
Third, we show that in the absence of external oracle information, there is an inherent trade-off between exploiting geometry to accelerate convergence versus achieving optimality almost surely.
arXiv Detail & Related papers (2021-10-29T06:35:44Z) - Submodular + Concave [53.208470310734825]
It has been well established that first order optimization methods can converge to the maximal objective value of concave functions.
In this work, we initiate the study of maximizing functions of the form $F(x) = G(x) + C(x)$ over a solvable convex body.
This class of functions is an extension of both concave and continuous DR-submodular functions for which no guarantee is known.
arXiv Detail & Related papers (2021-06-09T01:59:55Z) - Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z) - Regret Bounds for Gaussian-Process Optimization in Large Domains [40.92207267407271]
We provide upper bounds on the suboptimality (Bayesian simple regret) of the solution found by optimization strategies.
These regret bounds illuminate the relationship between the number of evaluations, the domain size, and the optimality of the retrieved function value.
In particular, they show that even when the number of evaluations is far too small to find the global optimum, we can find nontrivial function values.
arXiv Detail & Related papers (2021-04-29T05:19:03Z) - Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z) - What do you Mean? The Role of the Mean Function in Bayesian Optimisation [0.03305438525880806]
We show that the rate of convergence can depend sensitively on the choice of mean function.
We find that for design dimensions $\geq 5$ using a constant mean function equal to the worst observed quality value is consistently the best choice.
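A minimal 1-D sketch of that recommendation, assuming an RBF kernel and nearly noiseless observations (all names and hyperparameters here are illustrative, not the paper's setup):

```python
import numpy as np

def gp_posterior_mean(X, y, Xq, lengthscale=0.5, noise=1e-6):
    """GP posterior mean with an RBF kernel and a *constant* prior mean
    set to the worst (largest, for minimisation) observed value.
    Far from the data the posterior reverts to that constant."""
    mean_const = y.max()  # worst observed quality value
    def k(A, B):
        return np.exp(-((A[:, None] - B[None, :]) ** 2)
                      / (2.0 * lengthscale ** 2))
    K = k(X, X) + noise * np.eye(len(X))
    alpha = np.linalg.solve(K, y - mean_const)
    return mean_const + k(Xq, X) @ alpha
```

Near the observations the posterior interpolates the data; far away it falls back to the pessimistic constant, which discourages over-exploration.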
arXiv Detail & Related papers (2020-04-17T17:10:17Z) - Incorporating Expert Prior in Bayesian Optimisation via Space Warping [54.412024556499254]
In large search spaces, the algorithm passes through several low-function-value regions before reaching the optimum of the function.
One approach to shortening this cold-start phase is to use prior knowledge that can accelerate the optimisation.
In this paper, we represent the prior knowledge about the function optimum through a prior distribution.
The prior distribution is then used to warp the search space so that it expands around the high-probability region of the function optimum and shrinks around the low-probability regions.
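One generic way to realise such a warp is to push uniform samples through the inverse CDF of the prior over the optimum's location; the Gaussian prior below is an illustrative assumption, not the paper's construction.

```python
from statistics import NormalDist

def warp(u, prior=NormalDist(mu=0.3, sigma=0.05), low=0.0, high=1.0):
    """Map a uniform sample u in [0, 1] through the inverse CDF of a
    prior over the optimum's location, then clip to the search box.
    Uniform samples of u then concentrate where the prior puts high
    probability: the space is effectively expanded there and shrunk
    in low-probability regions."""
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf well-defined
    x = prior.inv_cdf(u)
    return min(max(x, low), high)
```

A uniform grid over `u` thus becomes a non-uniform grid over `x`, dense near the prior mean.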
arXiv Detail & Related papers (2020-03-27T06:18:49Z) - Efficient Rollout Strategies for Bayesian Optimization [15.050692645517998]
Most acquisition functions are myopic, meaning that they only consider the impact of the next function evaluation.
We show that a combination of quasi-Monte Carlo, common random numbers, and control variates significantly reduces the computational burden of rollout.
We then formulate a policy-search based approach that removes the need to optimize the rollout acquisition function.
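A toy sketch of the common-random-numbers idea (not the paper's rollout estimator): reusing the same noise draws when Monte Carlo estimates of two candidates are compared removes the shared-noise variance from the difference.

```python
import random

def mc_value(candidate_mean, noise_draws):
    # Toy one-step rollout value: candidate mean plus sampled noise.
    return sum(candidate_mean + e for e in noise_draws) / len(noise_draws)

def compare(mean_a, mean_b, n=1000, common=True, seed=0):
    """Estimate the value difference between two candidates.
    With common random numbers the same draws are reused for both,
    so the shared noise cancels exactly in the difference; with
    independent draws the difference estimate is noisy."""
    rng = random.Random(seed)
    noise_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise_b = noise_a if common else [rng.gauss(0.0, 1.0) for _ in range(n)]
    return mc_value(mean_a, noise_a) - mc_value(mean_b, noise_b)
```

In real rollout the cancellation is only partial, but the variance reduction is what makes the comparisons affordable.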
arXiv Detail & Related papers (2020-02-24T20:54:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.