Hyper-Heuristics Can Profit From Global Variation Operators
- URL: http://arxiv.org/abs/2407.14237v1
- Date: Fri, 19 Jul 2024 12:10:05 GMT
- Title: Hyper-Heuristics Can Profit From Global Variation Operators
- Authors: Benjamin Doerr, Johannes F. Lutzeyer,
- Abstract summary: We show that the Move Acceptance Hyper-Heuristic (MAHH) leaves the local optimum of the multimodal CLIFF benchmark with remarkable efficiency.
We also show that replacing the local one-bit mutation operator in the MAHH with the global bit-wise mutation operator, commonly used in EAs, yields a runtime of $\min\{1, O(\frac{e\ln(n)}{m})^m\} \, O(n^m)$ on JUMP functions.
- Score: 12.774575491521926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent work, Lissovoi, Oliveto, and Warwicker (Artificial Intelligence (2023)) proved that the Move Acceptance Hyper-Heuristic (MAHH) leaves the local optimum of the multimodal CLIFF benchmark with remarkable efficiency. The $O(n^3)$ runtime of the MAHH, for almost all cliff widths $d\ge 2,$ is significantly better than the $\Theta(n^d)$ runtime of simple elitist evolutionary algorithms (EAs) on CLIFF. In this work, we first show that this advantage is specific to the CLIFF problem and does not extend to the JUMP benchmark, the most prominent multi-modal benchmark in the theory of randomized search heuristics. We prove that for any choice of the MAHH selection parameter $p$, the expected runtime of the MAHH on a JUMP function with gap size $m = O(n^{1/2})$ is at least $\Omega(n^{2m-1} / (2m-1)!)$. This is significantly slower than the $O(n^m)$ runtime of simple elitist EAs. Encouragingly, we also show that replacing the local one-bit mutation operator in the MAHH with the global bit-wise mutation operator, commonly used in EAs, yields a runtime of $\min\{1, O(\frac{e\ln(n)}{m})^m\} \, O(n^m)$ on JUMP functions. This is at least as good as the runtime of simple elitist EAs. For larger values of $m$, this result proves an asymptotic performance gain over simple EAs. As our proofs reveal, the MAHH profits from its ability to walk through the valley of lower objective values in moderate-size steps, always accepting inferior solutions. This is the first time that such an optimization behavior is proven via mathematical means. Generally, our result shows that combining two ways of coping with local optima, global mutation and accepting inferior solutions, can lead to considerable performance gains.
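The abstract describes the MAHH only in words, so a small worked example may help fix ideas. The sketch below is a minimal, illustrative Python implementation, not the authors' code: it assumes the standard JUMP_m definition, a simple MAHH loop that with probability p accepts any offspring and otherwise accepts only strict improvements, and illustrative parameter values; all names in it are made up for this example.

```python
import random

def jump(x, m):
    """Standard JUMP_m benchmark: OneMax shifted upwards outside the gap,
    with a fitness valley of width m just below the all-ones optimum."""
    n, ones = len(x), sum(x)
    if ones <= n - m or ones == n:
        return m + ones
    return n - ones  # inside the valley, fitness points away from the optimum

def one_bit(x):
    """Local mutation: flip exactly one uniformly chosen bit."""
    y = x[:]
    y[random.randrange(len(y))] ^= 1
    return y

def bit_wise(x):
    """Global mutation: flip each bit independently with probability 1/n."""
    n = len(x)
    return [b ^ (random.random() < 1 / n) for b in x]

def mahh(n=50, m=2, p=0.05, mutate=bit_wise, budget=10**6):
    """Illustrative MAHH loop: each step mutates the current search point and,
    with probability p, accepts the offspring unconditionally (ALLMOVES);
    otherwise it accepts only strict improvements (ONLYIMPROVING)."""
    x = [random.randint(0, 1) for _ in range(n)]
    for t in range(budget):
        if sum(x) == n:           # the all-ones string is the global optimum
            return t
        y = mutate(x)
        if random.random() < p or jump(y, m) > jump(x, m):
            x = y
    return None                   # budget exhausted

if __name__ == "__main__":
    print("bit-wise mutation, iterations to optimum:", mahh(mutate=bit_wise))
    print("one-bit  mutation, iterations to optimum:", mahh(mutate=one_bit))
```

With the local one-bit operator, the only way across the fitness valley is a sequence of accept-anything steps, whereas the global bit-wise operator can occasionally jump over the gap directly; this contrast mirrors the abstract's point that combining global mutation with the acceptance of inferior solutions can pay off.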
Related papers
- How the Move Acceptance Hyper-Heuristic Copes With Local Optima: Drastic Differences Between Jumps and Cliffs [6.793248433673384]
We show that the Move Acceptance Hyper-Heuristic (MAHH) leaves the local optimum of the multimodal cliff benchmark with remarkable efficiency.
We also show that the MAHH with the global bit-wise mutation operator instead of the local one-bit operator optimizes jump functions in a time at least as good as that of simple elitist EAs.
This suggests that combining several ways to cope with local optima can be a fruitful approach.
arXiv Detail & Related papers (2023-04-20T15:57:33Z) - TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z) - Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z) - An Extended Jump Function Benchmark for the Analysis of Randomized Search Heuristics [4.38301148531795]
We propose an extended class $\textsc{Jump}_{k,\delta}$ of jump functions that contain a valley of low fitness of width $\delta$.
We show that the new class allows experiments with wider fitness valleys, especially when they lie further away from the global optimum.
arXiv Detail & Related papers (2021-05-07T07:21:10Z) - When Non-Elitism Meets Time-Linkage Problems [19.798298260257432]
We analyze the influence of non-elitism by comparing the performance of the elitist $(1+\lambda)$ EA and its non-elitist counterpart, the $(1,\lambda)$ EA.
We prove that, with probability $1$, the $(1,\lambda)$ EA can reach the global optimum, and that its expected runtime is $O(n^{3+c}\log n)$ with $\lambda = c\log_{\frac{e}{e-1}} n$ for the constant $c \ge 1$.
arXiv Detail & Related papers (2021-04-14T13:03:42Z) - Provably Breaking the Quadratic Error Compounding Barrier in Imitation Learning, Optimally [58.463668865380946]
We study the statistical limits of Imitation Learning in episodic Markov Decision Processes (MDPs) with a state space $\mathcal{S}$.
We establish an upper bound of $O(|\mathcal{S}| H^{3/2} / N)$ for the suboptimality using the Mimic-MD algorithm in Rajaraman et al. (2020).
We show that the minimax suboptimality grows as $\Omega(H^{3/2}/N)$ when $|\mathcal{S}| \geq 3$, while the unknown-transition setting suffers from a larger sharp rate.
arXiv Detail & Related papers (2021-02-25T15:50:19Z) - Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z) - Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization [133.53164856723782]
We propose a new accelerated zeroth-order momentum (Acc-ZOM) method for black-box mini-optimization, where only function values can be obtained.
Meanwhile, we propose an accelerated zeroth-order momentum descent ascent (Acc-ZOMDA) method for black-box minimax optimization, where only function values can be obtained.
In particular, our Acc-MDA can obtain a lower gradient complexity of $\tilde{O}(\kappa_y^{2.5}\epsilon^{-3})$ with a batch size $O(\kappa_y^4)$.
arXiv Detail & Related papers (2020-08-18T22:19:29Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that, for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Optimal Mutation Rates for the $(1+\lambda)$ EA on OneMax [1.0965065178451106]
We extend the analysis of optimal mutation rates to two variants of the OneMax problem.
We compute for all population sizes $\lambda \in \{2^i \mid 0 \le i \le 18\}$ which mutation rates minimize the expected running time.
Our results provide a lower bound against which we can measure common evolutionary approaches.
arXiv Detail & Related papers (2020-06-20T01:23:14Z) - Does Comma Selection Help To Cope With Local Optima [9.853329403413701]
We show that the $(\mu,\lambda)$ EA does not lead to a runtime advantage over the $(\mu+\lambda)$ EA.
This is the first runtime result for a non-elitist algorithm on a multi-modal problem that is tight apart from lower-order terms (a generic sketch contrasting plus and comma selection follows below).
arXiv Detail & Related papers (2020-04-02T21:39:33Z)
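As noted in the comma-selection entry above, here is a generic, self-contained sketch contrasting plus (elitist) and comma (non-elitist) selection. The OneMax objective, population sizes, and generation budget are illustrative choices for this example and are not taken from any of the listed papers.

```python
import random

def one_max(x):
    """OneMax: the number of ones in the bit string."""
    return sum(x)

def mutate(x):
    """Standard bit-wise mutation with rate 1/n."""
    n = len(x)
    return [b ^ (random.random() < 1 / n) for b in x]

def evolve(n=100, mu=5, lam=20, elitist=True, generations=2000, f=one_max):
    """Generic (mu + lambda) vs. (mu, lambda) EA.

    elitist=True : plus selection, next population = best mu of parents + offspring
    elitist=False: comma selection, next population = best mu of offspring only
    """
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(mu)]
    for _ in range(generations):
        offspring = [mutate(random.choice(pop)) for _ in range(lam)]
        pool = pop + offspring if elitist else offspring
        pop = sorted(pool, key=f, reverse=True)[:mu]
    return max(f(x) for x in pop)

if __name__ == "__main__":
    print("plus  selection, best fitness:", evolve(elitist=True))
    print("comma selection, best fitness:", evolve(elitist=False))
```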