Leveraging Memory Effects and Gradient Information in Consensus-Based
Optimization: On Global Convergence in Mean-Field Law
- URL: http://arxiv.org/abs/2211.12184v2
- Date: Fri, 15 Sep 2023 17:14:37 GMT
- Title: Leveraging Memory Effects and Gradient Information in Consensus-Based
Optimization: On Global Convergence in Mean-Field Law
- Authors: Konstantin Riedl
- Abstract summary: We present a versatile, flexible and customizable consensus-based optimization (CBO) method suitable for nonconvex and nonsmooth global optimization in high dimensions.
We prove that the dynamics converges to a global minimizer of the objective function in mean-field law.
We present numerical evidence for the superiority of this CBO variant in applications such as machine learning and compressed sensing, which widen the scope of applications of CBO.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we study consensus-based optimization (CBO), a versatile,
flexible and customizable optimization method suitable for performing nonconvex
and nonsmooth global optimizations in high dimensions. CBO is a multi-particle
metaheuristic, which is effective in various applications and at the same time
amenable to theoretical analysis thanks to its minimalistic design. The
underlying dynamics, however, is flexible enough to incorporate different
mechanisms widely used in evolutionary computation and machine learning, as we
show by analyzing a variant of CBO which makes use of memory effects and
gradient information. We rigorously prove that this dynamics converges to a
global minimizer of the objective function in mean-field law for a vast class
of functions under minimal assumptions on the initialization of the method. The
proof in particular reveals how to leverage additional forces in the dynamics,
advantageous in some applications, without losing provable global
convergence. To demonstrate the benefit of the herein investigated memory
effects and gradient information in certain applications, we present numerical
evidence for the superiority of this CBO variant in applications such as
machine learning and compressed sensing, which en passant widen the scope of
applications of CBO.
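For intuition, the following is a minimal, illustrative Python sketch of one discrete-time iteration of a CBO variant that combines the consensus drift with memory (personal-best positions) and gradient forces. The parameter names (alpha, lam, kappa, nu, sigma) and the memory update rule are simplified assumptions for illustration and do not reproduce the paper's exact stochastic dynamics.

```python
import numpy as np

def cbo_step(X, Y, f, grad_f, alpha=30.0, lam=1.0, kappa=0.5, nu=0.1,
             sigma=0.8, dt=0.01, rng=None):
    """One illustrative CBO iteration with memory and gradient forces.

    X      : (N, d) current particle positions
    Y      : (N, d) personal-best ("memory") positions of the particles
    f      : objective, maps (N, d) -> (N,)
    grad_f : gradient of f, maps (N, d) -> (N, d)
    """
    rng = rng or np.random.default_rng()

    # Consensus point: Gibbs-weighted average of the memory positions.
    fY = f(Y)
    w = np.exp(-alpha * (fY - fY.min()))            # shifted for numerical stability
    v = (w[:, None] * Y).sum(axis=0) / w.sum()

    # Drift toward the consensus point, toward the personal best,
    # and along the negative gradient; plus scaled exploration noise.
    drift = -lam * (X - v) - kappa * (X - Y) - nu * grad_f(X)
    noise = sigma * np.sqrt(dt) * np.linalg.norm(X - v, axis=1, keepdims=True) \
            * rng.standard_normal(X.shape)
    X_new = X + dt * drift + noise

    # Memory update: keep the better of the old memory and the new position.
    improved = f(X_new) < fY
    Y_new = np.where(improved[:, None], X_new, Y)
    return X_new, Y_new

# Example usage on a shifted sphere function in d = 20 dimensions.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, N = 20, 200
    f = lambda Z: np.sum((Z - 1.0) ** 2, axis=1)
    grad_f = lambda Z: 2.0 * (Z - 1.0)
    X = rng.uniform(-3, 3, size=(N, d))
    Y = X.copy()
    for _ in range(2000):
        X, Y = cbo_step(X, Y, f, grad_f, rng=rng)
    print("best objective:", f(Y).min())            # should approach 0
```

In the paper itself these forces enter as drift and diffusion terms of a system of stochastic differential equations, and the convergence statement concerns the associated mean-field law rather than this simple discretization.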
Related papers
- MACE: An Efficient Model-Agnostic Framework for Counterfactual
Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - AGGLIO: Global Optimization for Locally Convex Functions [5.221860952360943]
This paper presents AGGLIO, a stage-wise, graduated optimization technique that offers provable global convergence for problems whose objectives are only locally convex.
AGGLIO can be readily implemented using point as well as mini-batch SGD updates and offers provable convergence, supported by experiments.
arXiv Detail & Related papers (2021-11-06T18:15:56Z) - Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of energy-based models.
PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients [99.13839450032408]
It is desirable to design a universal framework for adaptive algorithms that solves general problems.
In particular, our novel framework provides convergence analysis support for adaptive gradient methods in the nonconvex setting.
arXiv Detail & Related papers (2021-06-15T15:16:28Z) - Optimization-Inspired Learning with Architecture Augmentations and
Control Mechanisms for Low-Level Vision [74.9260745577362]
This paper proposes a unified optimization-inspired learning framework to aggregate Generative, Discriminative, and Corrective (GDC) principles.
We construct three propagative modules to effectively solve the optimization models with flexible combinations.
Experiments across varied low-level vision tasks validate the efficacy and adaptability of GDC.
arXiv Detail & Related papers (2020-12-10T03:24:53Z) - Efficient Learning of Generative Models via Finite-Difference Score
Matching [111.55998083406134]
We present a generic strategy to efficiently approximate any-order directional derivative with finite difference.
Our approximation only involves function evaluations, which can be executed in parallel, and no gradient computations.
arXiv Detail & Related papers (2020-07-07T10:05:01Z) - Multi-Fidelity Bayesian Optimization via Deep Neural Networks [19.699020509495437]
In many applications, the objective function can be evaluated at multiple fidelities to enable a trade-off between the cost and accuracy.
We propose Deep Neural Network Multi-Fidelity Bayesian Optimization (DNN-MFBO) that can flexibly capture all kinds of complicated relationships between the fidelities.
We show the advantages of our method in both synthetic benchmark datasets and real-world applications in engineering design.
arXiv Detail & Related papers (2020-07-06T23:28:40Z) - Causal Bayesian Optimization [8.958125394444679]
We study the problem of globally optimizing a variable of interest that is part of a causal model in which a sequence of interventions can be performed.
Our approach combines ideas from causal inference, uncertainty quantification and sequential decision making.
We show how knowing the causal graph significantly improves the ability to reason about optimal decision making strategies.
arXiv Detail & Related papers (2020-05-24T13:20:50Z) - Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast
Convergence [30.393999722555154]
We propose a variant of the classical Polyak step-size (Polyak, 1987) commonly used in the subgradient method.
The proposed stochastic Polyak step-size (SPS) is an attractive choice for setting the learning rate of stochastic gradient descent (see the sketch after this list).
arXiv Detail & Related papers (2020-02-24T20:57:23Z)