Self-adjusting Population Sizes for the $(1, \lambda)$-EA on Monotone
Functions
- URL: http://arxiv.org/abs/2204.00531v2
- Date: Wed, 12 Jul 2023 10:58:24 GMT
- Title: Self-adjusting Population Sizes for the $(1, \lambda)$-EA on Monotone
Functions
- Authors: Marc Kaufmann, Maxime Larcher, Johannes Lengler, Xun Zou
- Abstract summary: We study the $(1,\lambda)$-EA with mutation rate $c/n$ for $c\le 1$, where the population size is adaptively controlled with the $(1:s+1)$-success rule.
Hevia Fajardo and Sudholt showed that this setup with $c=1$ is efficient on OneMax for $s<1$, but inefficient if $s \ge 18$.
- Score: 7.111443975103329
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the $(1,\lambda)$-EA with mutation rate $c/n$ for $c\le 1$, where
the population size is adaptively controlled with the $(1:s+1)$-success rule.
Recently, Hevia Fajardo and Sudholt have shown that this setup with $c=1$ is
efficient on OneMax for $s<1$, but inefficient if $s \ge 18$. Surprisingly,
the hardest part is not close to the optimum, but rather at linear distance. We
show that this behavior is not specific to OneMax. If $s$ is small, then the
algorithm is efficient on all monotone functions, and if $s$ is large, then it
needs superpolynomial time on all monotone functions. In the former case, for
$c<1$ we show a $O(n)$ upper bound for the number of generations and $O(n\log
n)$ for the number of function evaluations, and for $c=1$ we show $O(n\log n)$
generations and $O(n^2\log\log n)$ evaluations. We also show formally that
optimization is always fast, regardless of $s$, if the algorithm starts in
proximity of the optimum. All results also hold in a dynamic environment where
the fitness function changes in each generation.
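
As a concrete illustration of the mechanism described in the abstract, the following Python sketch implements a $(1,\lambda)$-EA whose offspring population size $\lambda$ is controlled by the $(1:s+1)$-success rule, run here on OneMax. The update factor $F$, the rounding of the real-valued $\lambda$, the lower cap at $1$, and the initial values are illustrative assumptions and need not match the exact setup analyzed in the paper.

```python
import random

def onemax(x):
    """OneMax: number of one-bits in the bit string."""
    return sum(x)

def mutate(x, c):
    """Standard bit mutation: flip each bit independently with probability c/n."""
    p = c / len(x)
    return [bit ^ (random.random() < p) for bit in x]

def one_comma_lambda_ea(n, f=onemax, c=1.0, s=0.5, F=1.5, max_evals=10**7):
    """(1,lambda)-EA with lambda adapted by the (1:s+1)-success rule.

    Success (best offspring strictly better than the parent): lambda <- lambda / F.
    Failure: lambda <- lambda * F**(1/s).  The parent is always replaced by the
    best offspring (comma selection), even if that offspring is worse.
    """
    x = [random.randint(0, 1) for _ in range(n)]
    lam, evals = 1.0, 0
    while f(x) < n and evals < max_evals:
        k = max(1, round(lam))                  # offspring count (rounded lambda)
        offspring = [mutate(x, c) for _ in range(k)]
        evals += k
        best = max(offspring, key=f)
        if f(best) > f(x):
            lam = max(1.0, lam / F)             # success: shrink the population
        else:
            lam = lam * F ** (1.0 / s)          # failure: grow the population
        x = best                                # unconditional (comma) replacement
    return x, evals

if __name__ == "__main__":
    _, evals = one_comma_lambda_ea(n=100, s=0.5)
    print("function evaluations:", evals)
```

Intuitively, a small $s$ (here $0.5$) lets $\lambda$ grow quickly after failures, so comma selection rarely loses much fitness; this corresponds to the regime the paper proves efficient, whereas large $s$ (e.g. $s \ge 18$) keeps $\lambda$ small and falls into the inefficient regime.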
Related papers
- LevAttention: Time, Space, and Streaming Efficient Algorithm for Heavy Attentions [54.54897832889028]
We show that for any $K$, there is a "universal set" $U \subset [n]$ of size independent of $n$, such that for any $Q$ and any row $i$, the large attention scores $A_{i,j}$ in row $i$ of $A$ all have $j \in U$.
We empirically show the benefits of our scheme for vision transformers, showing how to train new models that use our universal set while training as well.
arXiv Detail & Related papers (2024-10-07T19:47:13Z) - Proven Runtime Guarantees for How the MOEA/D Computes the Pareto Front From the Subproblem Solutions [9.044970217182117]
The decomposition-based multi-objective evolutionary algorithm (MOEA/D) does not directly optimize a given multi-objective function $f$, but instead optimizes $N + 1$ single-objective subproblems of $f$ in a co-evolutionary manner (a minimal decomposition sketch appears after this list).
We analyze for the first time how the MOEA/D with only standard mutation operators computes the whole Pareto front of the OneMinMax benchmark.
Our overall bound for power-law suggests that the MOEA/D performs best for $N = O(n^{\beta - 1})$, resulting in an $O(n\ldots)$
arXiv Detail & Related papers (2024-05-02T05:32:19Z) - Dueling Optimization with a Monotone Adversary [35.850072415395395]
We study the problem of dueling optimization with a monotone adversary, which is a generalization of (noiseless) dueling convex optimization.
The goal is to design an online algorithm to find a minimizer $\mathbf{x}^*$ for a function $f\colon X \to \mathbb{R}^d$.
arXiv Detail & Related papers (2023-11-18T23:55:59Z) - A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and
Minimizing the Maximum of Smooth Functions [44.655316553524855]
We design algorithms for minimizing $\max_{i\in[n]} f_i(x)$ over a $d$-dimensional Euclidean or simplex domain.
When each $f_i$ is $1$-Lipschitz and $1$-smooth, our method computes an $\epsilon$-approximate solution.
arXiv Detail & Related papers (2023-11-17T22:07:18Z) - Adaptive approximation of monotone functions [0.0]
We prove that GreedyBox achieves an optimal sample complexity for any function $f$, up to logarithmic factors.
Perhaps as expected, the $L^p(\mu)$ error of GreedyBox decreases much faster for piecewise-$C^2$ functions than predicted by the algorithm.
arXiv Detail & Related papers (2023-09-14T08:56:31Z) - An Optimal Algorithm for Strongly Convex Min-min Optimization [79.11017157526815]
Existing optimal first-order methods require $\mathcal{O}(\sqrt{\max\{\kappa_x,\kappa_y\}}\log 1/\epsilon)$ computations of both $\nabla_x f(x,y)$ and $\nabla_y f(x,y)$.
We propose a new algorithm that only requires $\mathcal{O}(\sqrt{\kappa_x}\log 1/\epsilon)$ computations of $\nabla_x f(x,y)$.
arXiv Detail & Related papers (2022-12-29T19:26:12Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Optimal Mutation Rates for the $(1+\lambda)$ EA on OneMax [1.0965065178451106]
We extend the analysis of optimal mutation rates to two variants of the OneMax problem.
We compute for all population sizes $\lambda \in \{2^i \mid 0 \le i \le 18\}$ which mutation rates minimize the expected running time.
Among other things, our results provide a lower bound against which common evolutionary approaches can be measured.
arXiv Detail & Related papers (2020-06-20T01:23:14Z) - On the Complexity of Minimizing Convex Finite Sums Without Using the
Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$ upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both compatible with oracles, and achieves complexity bounds of $\tilde{O}(n^2+n\sqrt{L/\mu})\log(1/\epsilon)$ and $O(n\sqrt{L/\epsilon})$, for $\mu>0$ and $\mu=0$, respectively.
arXiv Detail & Related papers (2020-02-09T03:39:46Z)
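
For the MOEA/D entry above, the following is a minimal, heavily simplified sketch of the decomposition idea on the OneMinMax benchmark: the bi-objective problem is split into $N+1$ scalarized subproblems that are solved co-evolutionarily, one solution per subproblem, with standard bit mutation. The Tchebycheff scalarization, the neighbourhood update, and the fixed generation budget are illustrative assumptions, not necessarily the variant analyzed in that paper.

```python
import random

def one_min_max(x):
    """Bi-objective OneMinMax: (number of zeros, number of ones), both to be maximized."""
    ones = sum(x)
    return len(x) - ones, ones

def moead_oneminmax(n, N, generations=1000):
    """Decomposition-style EA on OneMinMax: N+1 scalarized subproblems, one
    solution each, improved co-evolutionarily with standard bit mutation (rate 1/n)."""
    weights = [i / N for i in range(N + 1)]   # weight of the 'ones' objective

    def g(i, x):
        """Tchebycheff distance of x to the ideal point (n, n) for subproblem i
        (smaller is better)."""
        zeros, ones = one_min_max(x)
        w = weights[i]
        return max((1 - w) * (n - zeros), w * (n - ones))

    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(N + 1)]
    for _ in range(generations):
        for i in range(N + 1):
            child = [b ^ (random.random() < 1.0 / n) for b in pop[i]]
            # A child created for subproblem i may also improve neighbouring subproblems.
            for j in (i - 1, i, i + 1):
                if 0 <= j <= N and g(j, child) < g(j, pop[j]):
                    pop[j] = child
    return pop
```

With this scalarization, the optimum of subproblem $i$ has roughly $i\,n/N$ one-bits, so the subproblem solutions spread over the Pareto front; how the MOEA/D recovers the whole front from such subproblem solutions is the question the cited paper addresses.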
This list is automatically generated from the titles and abstracts of the papers in this site.