Proven Runtime Guarantees for How the MOEA/D Computes the Pareto Front From the Subproblem Solutions
- URL: http://arxiv.org/abs/2405.01014v2
- Date: Fri, 3 May 2024 05:55:45 GMT
- Title: Proven Runtime Guarantees for How the MOEA/D Computes the Pareto Front From the Subproblem Solutions
- Authors: Benjamin Doerr, Martin S. Krejca, Noé Weeks
- Abstract summary: The decomposition-based multi-objective evolutionary algorithm (MOEA/D) does not directly optimize a given multi-objective function $f$, but instead optimizes $N + 1$ single-objective subproblems of $f$ in a co-evolutionary manner.
We analyze for the first time how the MOEA/D with only standard mutation operators computes the whole Pareto front of the OneMinMax benchmark.
Our overall bound for power-law mutation suggests that the MOEA/D performs best for $N = O(n^{\beta - 1})$, resulting in an $O(n^{\beta} \log n)$ bound.
- Score: 9.044970217182117
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The decomposition-based multi-objective evolutionary algorithm (MOEA/D) does not directly optimize a given multi-objective function $f$, but instead optimizes $N + 1$ single-objective subproblems of $f$ in a co-evolutionary manner. It maintains an archive of all non-dominated solutions found and outputs it as an approximation of the Pareto front. Once the MOEA/D has found all optima of the subproblems (the $g$-optima), it may still miss Pareto optima of $f$. The algorithm is then tasked to find the remaining Pareto optima directly by mutating the $g$-optima. In this work, we analyze for the first time how the MOEA/D with only standard mutation operators computes the whole Pareto front of the OneMinMax benchmark when the $g$-optima are a strict subset of the Pareto front. For standard bit mutation, we prove an expected runtime of $O(n N \log n + n^{n/(2N)} N \log n)$ function evaluations. Especially for the second, more interesting phase, when the algorithm starts with all $g$-optima, we prove an $\Omega(n^{(1/2)(n/N + 1)} \sqrt{N} 2^{-n/N})$ expected runtime. This runtime is super-polynomial if $N = o(n)$, since this leaves large gaps between the $g$-optima, which require costly mutations to cover. For power-law mutation with exponent $\beta \in (1, 2)$, we prove an expected runtime of $O\left(n N \log n + n^{\beta} \log n\right)$ function evaluations. The $O\left(n^{\beta} \log n\right)$ term stems from the second phase of starting with all $g$-optima, and it is independent of the number of subproblems $N$. This leads to a huge speedup compared to the lower bound for standard bit mutation. In general, our overall bound for power-law mutation suggests that the MOEA/D performs best for $N = O(n^\beta \log n)$ bound is obtained when $N = O(n^{\beta - 1})$. In contrast to standard bit mutation, smaller values of $N$ are better for power-law mutation, as it is capable of easily creating missing solutions.
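The second phase analyzed above lends itself to a small illustration. The following is a minimal, hypothetical sketch (not the authors' implementation): OneMinMax on bit strings, an archive of non-dominated solutions seeded with $N + 1$ evenly spaced Pareto optima standing in for the $g$-optima, and the two mutation operators studied in the paper. All parameter values and helper names are illustrative assumptions.

```python
# Illustrative sketch (not the paper's code): OneMinMax, the two mutation
# operators analyzed in the paper, and a non-dominated archive. The MOEA/D
# bookkeeping of the N + 1 subproblems is omitted; only the "second phase",
# where missing Pareto optima must be created by mutation alone, is shown.
import random


def oneminmax(x):
    """Bi-objective values (#zeros, #ones); every bit string is Pareto-optimal."""
    ones = sum(x)
    return (len(x) - ones, ones)


def standard_bit_mutation(x):
    """Flip each bit independently with probability 1/n."""
    n = len(x)
    return [b ^ (random.random() < 1.0 / n) for b in x]


def power_law_mutation(x, beta=1.5):
    """Flip k uniformly chosen bits, with k drawn from a power law P(k) ~ k^(-beta)."""
    n = len(x)
    weights = [k ** (-beta) for k in range(1, n // 2 + 1)]
    k = random.choices(range(1, n // 2 + 1), weights=weights)[0]
    y = x[:]
    for i in random.sample(range(n), k):
        y[i] ^= 1
    return y


def covered_ones_values(archive):
    """On OneMinMax the Pareto front is parametrized by the number of ones."""
    return {sum(x) for x in archive}


n, N = 20, 4
# Hypothetical g-optima: N + 1 evenly spaced front points (ones = 0, 5, 10, 15, 20).
archive = [[1] * (i * n // N) + [0] * (n - i * n // N) for i in range(N + 1)]

evaluations = 0
while len(covered_ones_values(archive)) < n + 1:
    parent = random.choice(archive)
    child = power_law_mutation(parent)          # or standard_bit_mutation(parent)
    evaluations += 1
    if sum(child) not in covered_ones_values(archive):
        archive.append(child)                    # new point on the Pareto front

print("Pareto front of OneMinMax covered after", evaluations, "evaluations")
```

On OneMinMax every bit string is Pareto-optimal, so covering the front reduces to covering every possible number of ones; the gaps of width roughly $n/N$ between the seeded points are exactly what the mutation operator has to bridge, which is why the choice between standard bit mutation and power-law mutation matters so much in the bounds above.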
Related papers
- A Formal Perspective on Byte-Pair Encoding [100.75374173565548]
Byte-Pair Encoding (BPE) is a popular algorithm used for tokenizing data in NLP, despite being devised initially as a compression method.
We provide a faster implementation of BPE which improves the runtime complexity from $\mathcal{O}\left(N M\right)$ to $\mathcal{O}\left(N \log M\right)$, where $N$ is the sequence length and $M$ is the merge count.
arXiv Detail & Related papers (2023-06-29T10:29:23Z)
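For context on the BPE entry above, here is a minimal sketch of the naive merging baseline, assuming greedy merging of the most frequent adjacent pair: each of the $M$ merge steps rescans the length-$N$ sequence, which is where the $\mathcal{O}\left(N M\right)$ cost comes from. The paper's faster $\mathcal{O}\left(N \log M\right)$ implementation is not reproduced here.

```python
# Naive BPE sketch (illustrative, not the paper's implementation): each of the
# M merge steps scans the whole sequence, giving O(N * M) work overall.
from collections import Counter


def naive_bpe(sequence, num_merges):
    seq = list(sequence)
    merges = []
    for _ in range(num_merges):                      # M merge steps
        pair_counts = Counter(zip(seq, seq[1:]))     # O(N) scan for pair frequencies
        if not pair_counts:
            break
        best = pair_counts.most_common(1)[0][0]      # most frequent adjacent pair
        merges.append(best)
        merged, i = [], 0
        while i < len(seq):                          # O(N) rewrite of the sequence
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == best:
                merged.append(seq[i] + seq[i + 1])
                i += 2
            else:
                merged.append(seq[i])
                i += 1
        seq = merged
    return seq, merges


tokens, rules = naive_bpe("abracadabra", num_merges=3)
print(rules, tokens)
```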
- How the Move Acceptance Hyper-Heuristic Copes With Local Optima: Drastic Differences Between Jumps and Cliffs [6.793248433673384]
We show that the Move Acceptance Hyper-Heuristic (MAHH) leaves the local optimum of the multimodal cliff benchmark with remarkable efficiency.
We also bound the time in which the MAHH with the global bit-wise mutation operator instead of the local one-bit operator optimizes jump functions.
This suggests that combining several ways to cope with local optima can be a fruitful approach.
arXiv Detail & Related papers (2023-04-20T15:57:33Z)
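For the MAHH entry above, the following is a minimal sketch of one common formulation of the Move Acceptance Hyper-Heuristic (an assumption, not taken from the paper): in each step it mutates the current solution and accepts the offspring either unconditionally (with probability $p$) or only if the fitness strictly improves. The two mutation operators are the global bit-wise and local one-bit operators mentioned in the summary; all constants are illustrative.

```python
# Minimal sketch of a Move Acceptance Hyper-Heuristic (one common formulation;
# parameter choices are illustrative, not the paper's).
import random


def mahh(fitness, x, p=0.01, steps=10_000, global_mutation=True):
    """Maximize `fitness` over bit strings, mixing two acceptance operators."""
    n = len(x)
    best = x[:]
    for _ in range(steps):
        y = x[:]
        if global_mutation:                      # bit-wise mutation: flip each bit w.p. 1/n
            y = [b ^ (random.random() < 1.0 / n) for b in y]
        else:                                    # local mutation: flip one random bit
            i = random.randrange(n)
            y[i] ^= 1
        accept_all_moves = random.random() < p   # choose the acceptance operator
        if accept_all_moves or fitness(y) > fitness(x):
            x = y                                # "all moves" accepts even worsenings
        if fitness(x) > fitness(best):
            best = x[:]
    return best


onemax = sum                                     # simple smoke-test fitness
print(sum(mahh(onemax, [0] * 50)))
```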
- Near-Optimal Non-Convex Stochastic Optimization under Generalized Smoothness [21.865728815935665]
Two recent works established the $O(\epsilon^{-3})$ sample complexity to obtain an $O(\epsilon)$-stationary point.
However, both require a large batch size on the order of $\mathrm{poly}(\epsilon^{-1})$, which is not only computationally burdensome but also unsuitable for streaming applications.
In this work, we solve the prior two problems simultaneously by revisiting a simple variant of the STORM algorithm.
arXiv Detail & Related papers (2023-02-13T00:22:28Z)
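For the entry above, a minimal sketch of a STORM-style recursive momentum estimator (the paper's specific variant, batch sizes, and parameter schedules are not reproduced; all constants and the toy objective are illustrative assumptions): each step reuses the previous gradient estimate and corrects it with a gradient difference evaluated on the same fresh sample, which is what avoids the large batches mentioned above.

```python
# Sketch of STORM-style recursive momentum (illustrative constants; not the
# paper's exact variant or parameter schedule).
import numpy as np


def storm(grad, x0, steps=2_000, lr=0.05, a=0.1, rng=np.random.default_rng(0)):
    """grad(x, xi) returns a stochastic gradient of f at x for random sample xi."""
    x = np.asarray(x0, dtype=float)
    xi = rng.standard_normal(x.shape)
    d = grad(x, xi)                              # initial estimate: a plain stochastic gradient
    for _ in range(steps):
        x_new = x - lr * d
        xi = rng.standard_normal(x.shape)        # fresh sample, used at BOTH x_new and x
        # Recursive momentum: variance-reduced update without large batches.
        d = grad(x_new, xi) + (1.0 - a) * (d - grad(x, xi))
        x = x_new
    return x


# Toy problem: noisy gradients of f(x) = 0.5 * ||x||^2.
noisy_grad = lambda x, xi: x + 0.1 * xi
print(np.linalg.norm(storm(noisy_grad, np.ones(10))))
```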
- Optimal Query Complexities for Dynamic Trace Estimation [59.032228008383484]
We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly.
We provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $\epsilon$ error with $\delta$ failure probability.
Our lower bounds (1) give the first tight bounds for Hutchinson's estimator in the matrix-vector product model with Frobenius norm error even in the static setting, and (2) are the first unconditional lower bounds for dynamic trace estimation.
arXiv Detail & Related papers (2022-09-30T04:15:44Z)
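For the trace-estimation entry above, a minimal sketch of Hutchinson's estimator in the static setting (illustrative only; the paper's binary-tree procedure for the dynamic setting is not reproduced): the trace is estimated from matrix-vector queries with random sign vectors.

```python
# Hutchinson's trace estimator (static setting, illustrative): with m
# matrix-vector queries, tr(A) is estimated as the average of z^T A z over
# random +/-1 vectors z.
import numpy as np


def hutchinson_trace(matvec, n, num_queries, rng=np.random.default_rng(0)):
    total = 0.0
    for _ in range(num_queries):
        z = rng.choice([-1.0, 1.0], size=n)      # Rademacher query vector
        total += z @ matvec(z)                   # one matrix-vector query
    return total / num_queries


A = np.diag(np.arange(1.0, 101.0))               # true trace = 5050
print(hutchinson_trace(lambda v: A @ v, 100, num_queries=500))
```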
- Self-adjusting Population Sizes for the $(1, \lambda)$-EA on Monotone Functions [7.111443975103329]
We study the $(1,\lambda)$-EA with mutation rate $c/n$ for $c \le 1$, where the population size is adaptively controlled with the $(1:s+1)$-success rule.
We show that this setup with $c=1$ is efficient on OneMax for $s < 1$, but inefficient if $s \ge 18$.
arXiv Detail & Related papers (2022-04-01T15:46:12Z)
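For the entry above, a minimal sketch of the $(1,\lambda)$-EA with a $(1:s+1)$-success rule in one common formulation (an assumption, not the paper's exact setup; constants are illustrative): after a successful generation the offspring population size $\lambda$ shrinks by a factor $F$, otherwise it grows by $F^{1/s}$, so $\lambda$ stabilizes when roughly one in $s+1$ generations improves the fitness.

```python
# Sketch of the (1, lambda)-EA with the (1 : s+1)-success rule, in one common
# formulation (illustrative constants; not the paper's exact setup).
import random


def one_comma_lambda_ea(n, c=1.0, s=0.5, F=1.5, generations=20_000):
    x = [random.randint(0, 1) for _ in range(n)]
    lam = 1.0
    for _ in range(generations):
        offspring = []
        for _ in range(max(1, round(lam))):
            y = [b ^ (random.random() < c / n) for b in x]   # mutation rate c/n
            offspring.append(y)
        best = max(offspring, key=sum)                       # OneMax fitness = number of ones
        success = sum(best) > sum(x)
        x = best                                             # comma selection: always replace
        lam = max(lam / F, 1.0) if success else lam * F ** (1.0 / s)
        if sum(x) == n:
            break
    return sum(x)


print(one_comma_lambda_ea(50))
```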
- Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z)
- Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private optimizer is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z)
- An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of a given order $\ell \ge 1$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}(1+\log n)^{\ell-1}}$, where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z)
- Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions $d = 1, 2$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
- Optimal Mutation Rates for the $(1+\lambda)$ EA on OneMax [1.0965065178451106]
We extend the analysis of optimal mutation rates to two variants of the OneMax problem.
We compute for all population sizes $\lambda \in \{2^i \mid 0 \le i \le 18\}$ which mutation rates minimize the expected running time.
Among other things, our results provide a lower bound against which we can measure common evolutionary approaches.
arXiv Detail & Related papers (2020-06-20T01:23:14Z)
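For the entry above, a minimal sketch of the $(1+\lambda)$ EA with standard bit mutation on OneMax; the mutation rate is the quantity tuned in the paper, and the default $1/n$ below is only an illustrative choice.

```python
# Sketch of the (1 + lambda) EA on OneMax with a configurable mutation rate
# (the quantity tuned in the paper); the default rate 1/n is only illustrative.
import random


def one_plus_lambda_ea(n, lam, mutation_rate=None):
    p = mutation_rate if mutation_rate is not None else 1.0 / n
    x = [random.randint(0, 1) for _ in range(n)]
    evaluations = 0
    while sum(x) < n:
        offspring = [[b ^ (random.random() < p) for b in x] for _ in range(lam)]
        evaluations += lam
        best = max(offspring, key=sum)
        if sum(best) >= sum(x):                  # plus selection: never accept a worse point
            x = best
    return evaluations


print(one_plus_lambda_ea(n=100, lam=8))
```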
- A Rigorous Runtime Analysis of the $(1 + (\lambda, \lambda))$ GA on Jump Functions [8.34061303235504]
We conduct the first runtime analysis of this algorithm on a multimodal problem class, the jump functions benchmark.
For the isolated problem of leaving the local optimum of jump functions, we determine provably optimal parameters that lead to a runtime of $(n/k)^{k/2} e^{\Theta(k)}$.
arXiv Detail & Related papers (2020-04-14T17:54:12Z)
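For the last entry, a minimal sketch of one standard formulation of the $(1 + (\lambda, \lambda))$ GA (an assumption, not the paper's exact parametrization): a mutation phase that flips the same number $\ell \sim \mathrm{Bin}(n, \lambda/n)$ of bits in each of $\lambda$ offspring, followed by a biased crossover phase with bias $c = 1/\lambda$ toward the mutation winner. The OneMax smoke test at the end is illustrative; the paper analyzes jump functions.

```python
# Sketch of the (1 + (lambda, lambda)) GA (one standard formulation with
# p = lambda/n and crossover bias c = 1/lambda; parameters are illustrative).
import random


def one_plus_lambda_lambda_ga(fitness, n, lam):
    p, c = lam / n, 1.0 / lam
    x = [random.randint(0, 1) for _ in range(n)]
    for _ in range(50_000):
        # Mutation phase: flip exactly ell random bits in each of lam offspring.
        ell = sum(random.random() < p for _ in range(n))     # ell ~ Bin(n, lam/n)
        mutants = []
        for _ in range(lam):
            y = x[:]
            for i in random.sample(range(n), ell):
                y[i] ^= 1
            mutants.append(y)
        x_prime = max(mutants, key=fitness)
        # Crossover phase: take each bit from x_prime with probability c, else from x.
        crossed = [[xp if random.random() < c else xb for xb, xp in zip(x, x_prime)]
                   for _ in range(lam)]
        y_best = max(crossed, key=fitness)
        if fitness(y_best) >= fitness(x):                    # elitist selection
            x = y_best
        if fitness(x) == n:
            break
    return x


print(sum(one_plus_lambda_lambda_ga(sum, n=50, lam=4)))      # OneMax as a smoke test
```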
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.