Enhancing Parameter Control Policies with State Information
- URL: http://arxiv.org/abs/2507.08368v1
- Date: Fri, 11 Jul 2025 07:31:48 GMT
- Title: Enhancing Parameter Control Policies with State Information
- Authors: Gianluca Covini, Denis Antipov, Carola Doerr
- Abstract summary: We propose four new benchmarks for which we derive optimal or close-to-optimal control policies. We consider how additional information about the current state of the algorithm can help to make better choices of parameters.
- Score: 0.44241702149260353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parameter control and dynamic algorithm configuration study how to dynamically choose suitable configurations of a parametrized algorithm during the optimization process. Despite being an intensively researched topic in evolutionary computation, optimal control policies are known only for very few cases, limiting the development of automated approaches to achieve them. With this work we propose four new benchmarks for which we derive optimal or close-to-optimal control policies. More precisely, we consider the optimization of the LeadingOnes function via RLS$_{k}$, a local search algorithm allowing for a dynamic choice of the mutation strength $k$. The benchmarks differ in which information the algorithm can exploit to set its parameters and to select offspring. In existing running time results, the exploitable information is typically limited to the quality of the current-best solution. In this work, we consider how additional information about the current state of the algorithm can help to make better choices of parameters, and how these choices affect the performance. Namely, we allow the algorithm to use information about the current OneMax value, and we find that this allows much better parameter choices, especially in marginal states. Although those states are rarely visited by the algorithm, such policies yield a notable speed-up in terms of expected runtime. This makes the proposed benchmarks a challenging but promising testing ground for analyzing parameter control methods in rich state spaces and their ability to find optimal policies by capturing the performance improvements that correct parameter choices yield.
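To make the setting concrete, here is a minimal Python sketch of RLS$_k$ maximizing LeadingOnes, in which the mutation-strength policy receives both the LeadingOnes value and the OneMax value of the current search point. The two policies shown are illustrative placeholders of ours (a greedy quality-only rule and a hypothetical OneMax-aware refinement), not the optimal or close-to-optimal policies derived in the paper.

```python
import random
from math import comb

def leading_ones(x):
    lo = 0
    for b in x:
        if b == 0:
            break
        lo += 1
    return lo

def one_max(x):
    return sum(x)

def quality_only_policy(n, lo, om):
    # Greedily maximize the one-step improvement probability
    # C(n - lo - 1, k - 1) / C(n, k) of RLS_k at LeadingOnes value lo.
    return max(range(1, n - lo + 1),
               key=lambda k: comb(n - lo - 1, k - 1) / comb(n, k))

def onemax_aware_policy(n, lo, om):
    # Hypothetical refinement: when the suffix already holds many ones
    # (high OneMax), flip a single bit to avoid destroying them.
    return 1 if om >= n - 2 else quality_only_policy(n, lo, om)

def rls_k(n, policy, rng):
    x = [rng.randint(0, 1) for _ in range(n)]
    evals = 0
    while leading_ones(x) < n:
        lo, om = leading_ones(x), one_max(x)
        y = x[:]
        for pos in rng.sample(range(n), policy(n, lo, om)):
            y[pos] ^= 1  # flip k distinct, uniformly chosen bits
        evals += 1
        if leading_ones(y) >= lo:  # accept non-worsening offspring
            x = y
    return evals

rng = random.Random(42)
print(rls_k(64, quality_only_policy, rng), rls_k(64, onemax_aware_policy, rng))
```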
Related papers
- Multi-parameter Control for the $(1+(λ,λ))$-GA on OneMax via Deep Reinforcement Learning [4.482691140663255]
We show how well state-of-the-art deep reinforcement learning techniques can approximate good control policies. We derive a simple control policy that consistently outperforms the default theory-recommended setting.
arXiv Detail & Related papers (2025-05-19T11:18:41Z)
- Using Automated Algorithm Configuration for Parameter Control [0.7742297876120562]
Dynamic Algorithm Configuration (DAC) tackles the question of how to automatically learn policies to control parameters of algorithms in a data-driven fashion.
We propose a new DAC benchmark: controlling the key parameter $\lambda$ in the $(1+(\lambda,\lambda))$ Genetic Algorithm for solving OneMax problems.
On sufficiently large problem sizes, our approach consistently outperforms the default parameter control policy of the benchmark, which is derived from previous theoretical work.
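As a concrete baseline for this kind of benchmark, the sketch below implements a standard formulation of the $(1+(\lambda,\lambda))$ Genetic Algorithm on OneMax together with the theory-derived dynamic policy $\lambda = \sqrt{n/(n - f(x))}$, the sort of default control policy a learned configuration would have to outperform. The acceptance rule and the rounding of $\lambda$ are our simplifications.

```python
import random

def one_max(x):
    return sum(x)

def mutate(x, ell, rng):
    y = x[:]
    for pos in rng.sample(range(len(x)), ell):
        y[pos] ^= 1
    return y

def crossover(x, xp, c, rng):
    # Biased uniform crossover: take each bit from the mutant xp with prob. c.
    return [xp[i] if rng.random() < c else x[i] for i in range(len(x))]

def one_plus_ll_ga(n, lam_policy, rng):
    x = [rng.randint(0, 1) for _ in range(n)]
    evals = 0
    while one_max(x) < n:
        lam = max(1, round(lam_policy(n, one_max(x))))
        p, c = lam / n, 1 / lam
        ell = sum(rng.random() < p for _ in range(n))  # ell ~ Bin(n, lam/n)
        mutants = [mutate(x, ell, rng) for _ in range(lam)]
        xp = max(mutants, key=one_max)
        offspring = [crossover(x, xp, c, rng) for _ in range(lam)]
        y = max(offspring, key=one_max)
        evals += 2 * lam
        if one_max(y) >= one_max(x):
            x = y
    return evals

# Theory-recommended fitness-dependent setting for OneMax:
theory_policy = lambda n, f: (n / (n - f)) ** 0.5
print(one_plus_ll_ga(128, theory_policy, random.Random(1)))
```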
arXiv Detail & Related papers (2023-02-23T20:57:47Z)
- Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration [32.055812915031666]
We extend this benchmark by analyzing optimal control policies that can select the parameters only from a given portfolio of possible values.
We show how to compute optimal parameter portfolios of a given size.
We demonstrate the usefulness of our benchmarks by analyzing the behavior of the DDQN reinforcement learning approach for dynamic algorithm configuration.
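The core computation behind such portfolio benchmarks can be sketched compactly. For RLS$_k$ on LeadingOnes, the probability of improving from LeadingOnes value $i$ with mutation strength $k$ is $\binom{n-i-1}{k-1}/\binom{n}{k}$; since the choices at different fitness levels affect only that level's waiting time, the best policy over a portfolio picks the maximizing $k$ at each level, and small portfolios can be found by brute force. The runtime formula below uses the standard LeadingOnes argument (each level is reached with probability $1/2$, stated here without proof) and is our simplification, not the paper's exact derivation.

```python
from itertools import combinations
from math import comb

def p_improve(n, i, k):
    # Prob. that flipping k distinct random bits improves LeadingOnes from i.
    return comb(n - i - 1, k - 1) / comb(n, k)

def expected_runtime(n, portfolio):
    total = 0.0
    for i in range(n):
        p = max(p_improve(n, i, k) for k in portfolio)
        if p == 0.0:
            return float("inf")  # portfolio cannot make progress at level i
        total += 0.5 / p         # level i reached w.p. 1/2, waiting time 1/p
    return total

def best_portfolio(n, size):
    # Brute force over all portfolios of the given size (fine for small n).
    return min(combinations(range(1, n + 1), size),
               key=lambda K: expected_runtime(n, K))

n = 30
for s in (1, 2, 3):
    K = best_portfolio(n, s)
    print(s, K, round(expected_runtime(n, K), 1))
```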
arXiv Detail & Related papers (2022-02-07T15:00:30Z)
- Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs [113.8752163061151]
We study episodic reinforcement learning (RL) in non-stationary linear kernel Markov decision processes (MDPs). We propose the periodically restarted optimistic policy optimization algorithm (PROPO). PROPO features two mechanisms: sliding-window-based policy evaluation and periodic-restart-based policy improvement.
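PROPO itself targets non-stationary linear kernel MDPs; as a far simpler illustration of its two mechanisms, the sketch below combines sliding-window value estimation with periodic restarts on a drifting two-armed bandit. All names and constants here are ours.

```python
import random
from collections import deque

def run(horizon=5000, window=200, restart_every=1000, seed=0):
    rng = random.Random(seed)
    history = [deque(maxlen=window) for _ in range(2)]  # sliding windows
    total = 0.0
    for t in range(horizon):
        if t % restart_every == 0:      # periodic restart: forget old data
            for h in history:
                h.clear()
        means = [sum(h) / len(h) if h else float("inf") for h in history]
        arm = means.index(max(means))   # greedy on windowed estimates
        drift = t / horizon             # arm 1 slowly overtakes arm 0
        p = 0.6 - 0.3 * drift if arm == 0 else 0.3 + 0.4 * drift
        reward = 1.0 if rng.random() < p else 0.0
        history[arm].append(reward)
        total += reward
    return total / horizon

print(run())
```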
arXiv Detail & Related papers (2021-10-18T02:33:20Z)
- Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information [78.78486761923855]
In many real-world problems, we want to infer some property of an expensive black-box function $f$, given a budget of $T$ function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to $f$ than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z)
- Lazy Parameter Tuning and Control: Choosing All Parameters Randomly From a Power-Law Distribution [8.34061303235504]
Most evolutionary algorithms have multiple parameters and their values drastically affect the performance.
We propose a lazy but effective solution, namely choosing all parameter values in each iteration randomly from a suitably scaled power-law distribution.
We prove a performance guarantee that is comparable to the best performance known for static parameters.
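The scheme itself fits in a few lines. The sketch below re-draws the mutation strength of an RLS-style hill climber on OneMax from a truncated power-law distribution in every iteration; the exponent $\beta = 1.5$ and the cap at $n/2$ are common heavy-tailed choices, not values prescribed by this summary.

```python
import random

def power_law_sample(upper, beta, rng):
    # Discrete power law on {1, ..., upper} with weights k^(-beta).
    weights = [k ** (-beta) for k in range(1, upper + 1)]
    return rng.choices(range(1, upper + 1), weights=weights)[0]

def one_max(x):
    return sum(x)

def heavy_tailed_rls(n, beta=1.5, seed=0):
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    evals = 0
    while one_max(x) < n:
        k = power_law_sample(n // 2, beta, rng)  # lazy: re-drawn every step
        y = x[:]
        for pos in rng.sample(range(n), k):
            y[pos] ^= 1
        evals += 1
        if one_max(y) >= one_max(x):
            x = y
    return evals

print(heavy_tailed_rls(100))
```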
arXiv Detail & Related papers (2021-04-14T09:17:18Z)
- On the Optimality of Batch Policy Optimization Algorithms [106.89498352537682]
Batch policy optimization considers leveraging existing data for policy construction before interacting with an environment.
We show that any confidence-adjusted index algorithm is minimax optimal, whether it be optimistic, pessimistic or neutral.
We introduce a new weighted-minimax criterion that considers the inherent difficulty of optimal value prediction.
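A confidence-adjusted index is easy to state in code: rank each action by its batch mean plus $\alpha$ times a confidence width, where $\alpha > 0$ is optimistic, $\alpha < 0$ pessimistic, and $\alpha = 0$ neutral. The toy below, our instantiation on offline bandit data, shows how the sign of $\alpha$ changes which action is selected.

```python
import math
import random

def choose(batch, alpha):
    # batch: dict mapping each action to its list of observed rewards.
    def index(arm):
        r = batch[arm]
        width = math.sqrt(2 * math.log(100) / len(r))  # confidence width
        return sum(r) / len(r) + alpha * width
    return max(batch, key=index)

rng = random.Random(3)
batch = {"a": [rng.gauss(0.5, 1) for _ in range(50)],   # well-sampled arm
         "b": [rng.gauss(0.6, 1) for _ in range(5)]}    # under-sampled arm
for alpha in (+1.0, 0.0, -1.0):  # optimistic, neutral, pessimistic
    print(alpha, choose(batch, alpha))
```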
arXiv Detail & Related papers (2021-04-06T05:23:20Z)
- Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm [97.66038345864095]
We propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG).
Specifically, we first formulate hyperparameter optimization as an $\mathcal{A}$-based constrained optimization problem, where $\mathcal{A}$ is a black-box optimization algorithm.
Then, we use the average zeroth-order hyper-gradients to update hyperparameters.
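The averaging step can be sketched directly: perturb the hyperparameters along random Gaussian directions, take one-point finite differences of the black-box validation objective, and average. The toy objective F below stands in for an actual hyperparameter-to-loss pipeline; everything here is our simplification of the HOZOG idea.

```python
import random

rng = random.Random(0)

def zo_hypergrad(F, h, mu=1e-2, q=8):
    # Average q one-point finite-difference estimates along Gaussian directions.
    d = len(h)
    grad = [0.0] * d
    base = F(h)
    for _ in range(q):
        u = [rng.gauss(0, 1) for _ in range(d)]
        fd = (F([h[i] + mu * u[i] for i in range(d)]) - base) / mu
        for i in range(d):
            grad[i] += fd * u[i] / q
    return grad

# Toy stand-in for the black-box "train, then measure validation loss" map.
F = lambda h: (h[0] - 0.3) ** 2 + (h[1] - 0.7) ** 2
h, lr = [0.0, 0.0], 0.1
for _ in range(200):
    g = zo_hypergrad(F, h)
    h = [h[i] - lr * g[i] for i in range(2)]
print([round(v, 2) for v in h])  # should end up near [0.3, 0.7]
```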
arXiv Detail & Related papers (2021-02-17T21:03:05Z)
- Leveraging Benchmarking Data for Informed One-Shot Dynamic Algorithm Selection [0.9281671380673306]
A key challenge in the application of evolutionary algorithms is the selection of an algorithm instance that best suits the problem at hand.
In this work, we analyze how such prior benchmarking data can be used to infer informed dynamic algorithm selection schemes for the solution of pseudo-Boolean optimization problems.
arXiv Detail & Related papers (2021-02-12T12:27:02Z)
- Incorporating Expert Prior in Bayesian Optimisation via Space Warping [54.412024556499254]
In large search spaces, the algorithm passes through several low-function-value regions before reaching the optimum.
One approach to shorten this cold-start phase is to use prior knowledge that can accelerate the optimisation.
In this paper, we represent the prior knowledge about the function optimum through a prior distribution.
The prior distribution is then used to warp the search space so that it expands around high-probability regions of the function optimum and shrinks around low-probability regions.
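One simple way to realize such a warp (a simplification of ours, not the paper's exact construction) is to optimise in warped coordinates $u \in [0,1]$ and map them through the inverse CDF of the prior over the optimum's location, so that uniform exploration in $u$ concentrates evaluations where the prior puts mass:

```python
import random
from statistics import NormalDist

prior = NormalDist(mu=0.7, sigma=0.1)  # expert belief: optimum near 0.7

def warp(u):
    # Map warped coordinate u in [0, 1] through the prior's inverse CDF,
    # clamping to keep inv_cdf and the domain well defined.
    u = min(max(u, 1e-9), 1 - 1e-9)
    return min(1.0, max(0.0, prior.inv_cdf(u)))

def f(x):  # black-box objective on [0, 1], optimum at 0.72
    return -(x - 0.72) ** 2

rng = random.Random(7)
# Random search in warped space stands in for the BO inner loop here.
best_u = max((rng.random() for _ in range(30)), key=lambda u: f(warp(u)))
print(round(warp(best_u), 3))
```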
arXiv Detail & Related papers (2020-03-27T06:18:49Z)
- Kalman meets Bellman: Improving Policy Evaluation through Value Tracking [59.691919635037216]
Policy evaluation is a key process in Reinforcement Learning (RL).
We devise an optimization method, called Kalman Optimization for Value Approximation (KOVA).
KOVA minimizes a regularized objective function that concerns both parameter and noisy return uncertainties.
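In the simplest linear case, value tracking with a Kalman filter treats the value-function weights as a latent state and observed returns as noisy measurements. The sketch below is a simplified, our-notation illustration of that idea, not the full KOVA objective:

```python
import numpy as np

def kova_step(w, P, phi, ret, obs_noise=1.0, process_noise=1e-4):
    # One Kalman update of linear value weights w with covariance P,
    # given features phi of a state and a noisy observed return ret.
    P = P + process_noise * np.eye(len(w))  # parameters may drift over time
    s = phi @ P @ phi + obs_noise           # innovation variance
    K = P @ phi / s                          # Kalman gain
    w = w + K * (ret - phi @ w)              # correct with observed return
    P = P - np.outer(K, phi) @ P
    return w, P

rng = np.random.default_rng(0)
d, w_true = 3, np.array([1.0, -2.0, 0.5])
w, P = np.zeros(d), np.eye(d)
for _ in range(500):
    phi = rng.normal(size=d)                     # random state features
    ret = phi @ w_true + rng.normal(scale=1.0)   # noisy Monte Carlo return
    w, P = kova_step(w, P, phi, ret)
print(np.round(w, 2))  # approaches w_true
```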
arXiv Detail & Related papers (2020-02-17T13:30:43Z)