Concentration Tail-Bound Analysis of Coevolutionary and Bandit Learning Algorithms
- URL: http://arxiv.org/abs/2405.04480v2
- Date: Sat, 11 May 2024 00:41:19 GMT
- Title: Concentration Tail-Bound Analysis of Coevolutionary and Bandit Learning Algorithms
- Authors: Per Kristian Lehre, Shishen Lin,
- Abstract summary: This paper considers the problem of deriving concentration tail-bounds on the runtime/regret of algorithms.
It provides a novel drift theorem that gives precise exponential tail-bounds given positive, weak, zero and even negative drift.
Our drift theorem can be used to prove a strong concentration of the runtime/regret of algorithms in AI.
- Score: 1.104960878651584
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Runtime analysis, as a branch of the theory of AI, studies how the number of iterations algorithms take before finding a solution (its runtime) depends on the design of the algorithm and the problem structure. Drift analysis is a state-of-the-art tool for estimating the runtime of randomised algorithms, such as evolutionary and bandit algorithms. Drift refers roughly to the expected progress towards the optimum per iteration. This paper considers the problem of deriving concentration tail-bounds on the runtime/regret of algorithms. It provides a novel drift theorem that gives precise exponential tail-bounds given positive, weak, zero and even negative drift. Previously, such exponential tail bounds were missing in the case of weak, zero, or negative drift. Our drift theorem can be used to prove a strong concentration of the runtime/regret of algorithms in AI. For example, we prove that the regret of the \rwab bandit algorithm is highly concentrated, while previous analyses only considered the expected regret. This means that the algorithm obtains the optimum within a given time frame with high probability, i.e. a form of algorithm reliability. Moreover, our theorem implies that the time needed by the co-evolutionary algorithm RLS-PD to obtain a Nash equilibrium in a \bilinear max-min-benchmark problem is highly concentrated. However, we also prove that the algorithm forgets the Nash equilibrium, and the time until this occurs is highly concentrated. This highlights a weakness in the RLS-PD which should be addressed by future work.
Related papers
- Agnostic Learning of Arbitrary ReLU Activation under Gaussian Marginals [10.850101961203752]
We consider the problem of learning an arbitrarily-biased ReLU activation (or neuron) over Gaussian marginals with the squared loss objective.
Our main result is a time statistical query (SQ) algorithm that gives the first constant factor approximation for arbitrary bias.
Our algorithm presents an interesting departure from existing algorithms, which are all based on gradient descent.
arXiv Detail & Related papers (2024-11-21T17:43:51Z) - A Mirror Descent-Based Algorithm for Corruption-Tolerant Distributed Gradient Descent [57.64826450787237]
We show how to analyze the behavior of distributed gradient descent algorithms in the presence of adversarial corruptions.
We show how to use ideas from (lazy) mirror descent to design a corruption-tolerant distributed optimization algorithm.
Experiments based on linear regression, support vector classification, and softmax classification on the MNIST dataset corroborate our theoretical findings.
arXiv Detail & Related papers (2024-07-19T08:29:12Z) - Decentralized Sum-of-Nonconvex Optimization [42.04181488477227]
We consider the optimization problem of the sum-of-non function, i.e., a guarantee function that is the average non-consensus number.
We propose an accelerated decentralized first-order algorithm by techniques of gradient, tracking into the rate, and multi-consensus.
arXiv Detail & Related papers (2024-02-04T05:48:45Z) - Online Robust Reinforcement Learning with Model Uncertainty [24.892994430374912]
We develop a sample-based approach to estimate the unknown uncertainty set and design a robust Q-learning and robust TDC algorithm.
For the robust Q-learning algorithm, we prove that it converges to the optimal robust Q function, and for the robust TDC algorithm, we prove that it converges to some stationary points.
Our approach can be readily extended to robustify many other algorithms, e.g., TD, SARSA, and other GTD algorithms.
arXiv Detail & Related papers (2021-09-29T16:17:47Z) - Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime.
In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem.
We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z) - On the Last Iterate Convergence of Momentum Methods [32.60494006038902]
We prove for the first time that for any constant momentum factor, there exists a Lipschitz and convex function for which the last iterate suffers from an error.
We show that in the setting with convex and smooth functions, our new SGDM algorithm automatically converges at a rate of $O(fraclog TT)$.
arXiv Detail & Related papers (2021-02-13T21:16:16Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - Efficient Computation of Expectations under Spanning Tree Distributions [67.71280539312536]
We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models.
Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms.
arXiv Detail & Related papers (2020-08-29T14:58:26Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - Run2Survive: A Decision-theoretic Approach to Algorithm Selection based
on Survival Analysis [75.64261155172856]
survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime.
We leverage such models as a basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive.
In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
arXiv Detail & Related papers (2020-07-06T15:20:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.