Finding Global Minima via Kernel Approximations
- URL: http://arxiv.org/abs/2012.11978v1
- Date: Tue, 22 Dec 2020 12:59:30 GMT
- Title: Finding Global Minima via Kernel Approximations
- Authors: Alessandro Rudi and Ulysse Marteau-Ferey and Francis Bach
- Abstract summary: We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
- Score: 90.42048080064849
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We consider the global minimization of smooth functions based solely on
function evaluations. Algorithms that achieve the optimal number of function
evaluations for a given precision level typically rely on explicitly
constructing an approximation of the function which is then minimized with
algorithms that have exponential running-time complexity. In this paper, we
consider an approach that jointly models the function to approximate and finds
a global minimum. This is done by using infinite sums of square smooth
functions and has strong links with polynomial sum-of-squares hierarchies.
Leveraging recent representation properties of reproducing kernel Hilbert
spaces, the infinite-dimensional optimization problem can be solved by
subsampling in time polynomial in the number of function evaluations, and with
theoretical guarantees on the obtained minimum.
Given $n$ samples, the computational cost is $O(n^{3.5})$ in time, $O(n^2)$
in space, and we achieve a convergence rate to the global optimum that is
$O(n^{-m/d + 1/2 + 3/d})$ where $m$ is the degree of differentiability of the
function and $d$ the number of dimensions. The rate is nearly optimal in the
case of Sobolev functions and more generally makes the proposed method
particularly suitable for functions that have a large number of derivatives.
Indeed, when $m$ is in the order of $d$, the convergence rate to the global
optimum does not suffer from the curse of dimensionality, which affects only
the worst-case constants (that we track explicitly through the paper).
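For instance, plugging $m = 3d$ into the rate above gives an exponent of $-3 + 1/2 + 3/d$, so the error decays roughly as $n^{-5/2}$ for large $d$; this is the sense in which the rate itself escapes the curse of dimensionality.
The abstract sketches the core construction: model $f(x) - c$ as a sum of squares of RKHS functions, i.e. as $\langle \phi(x), A\,\phi(x) \rangle$ for a positive semidefinite operator $A$, enforce this only at the $n$ sampled points, and maximize the candidate lower bound $c$ with a trace penalty $\lambda\,\mathrm{Tr}(A)$. The snippet below is a minimal, naive sketch of that idea with a toy objective, a Gaussian kernel, and a generic off-the-shelf SDP solver (cvxpy); it is not the authors' code, and the paper's dedicated solver is what attains the stated $O(n^{3.5})$ cost.
```python
import numpy as np
import cvxpy as cp

# Toy setup (hypothetical): sampled points, function values, Gaussian kernel.
rng = np.random.default_rng(0)
n, d = 50, 2
X = rng.uniform(-1.0, 1.0, size=(n, d))
f_vals = np.sum(X**2, axis=1) + 0.1 * np.cos(5 * X[:, 0])  # toy objective to minimize

sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq_dists / (2 * 0.5**2)) + 1e-8 * np.eye(n)  # kernel matrix + jitter
R = np.linalg.cholesky(K)  # K = R @ R.T, so row i of R plays the role of phi(x_i)

# Maximize the lower bound c, penalizing Tr(A), subject to
# f(x_i) = c + <phi(x_i), A phi(x_i)> at every sampled point.
A = cp.Variable((n, n), PSD=True)  # finite-dimensional stand-in for the PSD operator
c = cp.Variable()
lam = 1e-3
constraints = [f_vals[i] == c + cp.quad_form(R[i], A) for i in range(n)]
problem = cp.Problem(cp.Maximize(c - lam * cp.trace(A)), constraints)
problem.solve()
print("estimated global minimum:", c.value)
```
A generic SDP solver scales worse than the dedicated scheme analysed in the paper, but the sketch shows how the subsampled equality constraints and the trace regularization combine to produce a lower bound $c$ that serves as the estimate of the global minimum.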
Related papers
- Adaptive approximation of monotone functions [0.0]
We prove that GreedyBox achieves an optimal sample complexity for any function $f$, up to logarithmic factors.
Perhaps as expected, the $L^p(\mu)$ error of GreedyBox decreases much faster for piecewise-$C^2$ functions than predicted by the algorithm.
arXiv Detail & Related papers (2023-09-14T08:56:31Z) - Convergence Rates for Non-Log-Concave Sampling and Log-Partition
Estimation [0.0]
It is known that for $m$-times differentiable functions in $d$ dimensions, the optimal rate for algorithms using $n$ function evaluations is $n^{-m/d}$.
We show that similar rates are possible for sampling and log-partition computation, and examine whether they can be realized in polynomial time with a rate independent of $d$.
arXiv Detail & Related papers (2023-03-06T15:53:44Z) - Deterministic Nonsmooth Nonconvex Optimization [94.01526844386977]
We show that randomization is necessary to obtain a dimension-free algorithm.
Our algorithm yields the first deterministic dimension-free algorithm for optimizing ReLU networks.
arXiv Detail & Related papers (2023-02-16T13:57:19Z) - Lower Bounds on the Worst-Case Complexity of Efficient Global
Optimization [11.523746174066702]
We derive a unified lower bound for the complexity of efficient global optimization in terms of the metric entropy of a ball in its corresponding reproducing kernel Hilbert space (RKHS).
We show that this lower bound nearly matches the upper bound attained by non-adaptive search algorithms for the commonly used squared exponential kernel and the Matérn kernel.
arXiv Detail & Related papers (2022-09-20T11:57:13Z) - TURF: A Two-factor, Universal, Robust, Fast Distribution Learning
Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution in $\ell_1$ distance to within essentially a constant factor of its closest $t$-piece degree-$d$ polynomial approximation.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z) - Low Regret Binary Sampling Method for Efficient Global Optimization of
Univariate Functions [0.0]
We study the cumulative regret of the algorithm instead of the simple regret between our best query and the optimal value of the objective function.
Although our approach has regret results similar to those of traditional lower-bounding algorithms, it has a major computational cost advantage.
arXiv Detail & Related papers (2022-01-18T18:11:48Z) - Parallel Quasi-concave set optimization: A new frontier that scales
without needing submodularity [14.93584434176082]
We study a class of quasi-concave set functions induced as a dual class to monotone linkage functions.
We show a potential for widespread applications via an example of diverse feature subset selection with exact global maxi-min guarantees.
arXiv Detail & Related papers (2021-08-19T15:50:41Z) - Submodular + Concave [53.208470310734825]
It has been well established that first order optimization methods can converge to the maximal objective value of concave functions.
In this work, we initiate the study of maximizing functions of the form $F(x) = G(x) + C(x)$ over a convex body.
This class of functions is an extension of both concave and continuous DR-submodular functions for which no guarantee is known.
arXiv Detail & Related papers (2021-06-09T01:59:55Z) - Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound of order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z) - Reinforcement Learning with General Value Function Approximation:
Provably Efficient Approach via Bounded Eluder Dimension [124.7752517531109]
We establish a provably efficient reinforcement learning algorithm with general value function approximation.
We show that our algorithm achieves a regret bound of $\widetilde{O}(\mathrm{poly}(dH)\sqrt{T})$ where $d$ is a complexity measure.
Our theory generalizes recent progress on RL with linear value function approximation and does not make explicit assumptions on the model of the environment.
arXiv Detail & Related papers (2020-05-21T17:36:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.