Leveraging Memory Effects and Gradient Information in Consensus-Based
Optimization: On Global Convergence in Mean-Field Law
- URL: http://arxiv.org/abs/2211.12184v2
- Date: Fri, 15 Sep 2023 17:14:37 GMT
- Title: Leveraging Memory Effects and Gradient Information in Consensus-Based
Optimization: On Global Convergence in Mean-Field Law
- Authors: Konstantin Riedl
- Abstract summary: We present a versatile, flexible and customizable consensus-based optimization (CBO) method suitable for nonconvex and nonsmooth global optimization in high dimensions.
We prove that the dynamics converges to a global minimizer of the objective function in mean-field law.
We present numerical evidence for the superiority of this CBO variant in applications such as machine learning and compressed sensing, which widen the scope of applications of CBO.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we study consensus-based optimization (CBO), a versatile,
flexible and customizable optimization method suitable for performing nonconvex
and nonsmooth global optimizations in high dimensions. CBO is a multi-particle
metaheuristic, which is effective in various applications and at the same time
amenable to theoretical analysis thanks to its minimalistic design. The
underlying dynamics, however, is flexible enough to incorporate different
mechanisms widely used in evolutionary computation and machine learning, as we
show by analyzing a variant of CBO which makes use of memory effects and
gradient information. We rigorously prove that this dynamics converges to a
global minimizer of the objective function in mean-field law for a vast class
of functions under minimal assumptions on the initialization of the method. The
proof in particular reveals how to leverage additional forces in the dynamics,
advantageous in some applications, without losing provable global
convergence. To demonstrate the benefit of the herein investigated memory
effects and gradient information in certain applications, we present numerical
evidence for the superiority of this CBO variant in applications such as
machine learning and compressed sensing, which en passant widen the scope of
applications of CBO.
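For intuition, the following is a minimal, illustrative Python sketch of one discrete-time iteration of a CBO variant that combines the consensus drift with memory (personal-best positions) and gradient forces. The parameter names (alpha, lam, kappa, nu, sigma) and the memory update rule are simplified assumptions for illustration and do not reproduce the paper's exact stochastic dynamics.

```python
import numpy as np

def cbo_step(X, Y, f, grad_f, alpha=30.0, lam=1.0, kappa=0.5, nu=0.1,
             sigma=0.8, dt=0.01, rng=None):
    """One illustrative CBO iteration with memory and gradient forces.

    X      : (N, d) current particle positions
    Y      : (N, d) personal-best ("memory") positions of the particles
    f      : objective, maps (N, d) -> (N,)
    grad_f : gradient of f, maps (N, d) -> (N, d)
    """
    rng = rng or np.random.default_rng()

    # Consensus point: Gibbs-weighted average of the memory positions.
    fY = f(Y)
    w = np.exp(-alpha * (fY - fY.min()))            # shifted for numerical stability
    v = (w[:, None] * Y).sum(axis=0) / w.sum()

    # Drift toward the consensus point, toward the personal best,
    # and along the negative gradient; plus scaled exploration noise.
    drift = -lam * (X - v) - kappa * (X - Y) - nu * grad_f(X)
    noise = sigma * np.sqrt(dt) * np.linalg.norm(X - v, axis=1, keepdims=True) \
            * rng.standard_normal(X.shape)
    X_new = X + dt * drift + noise

    # Memory update: keep the better of the old memory and the new position.
    improved = f(X_new) < fY
    Y_new = np.where(improved[:, None], X_new, Y)
    return X_new, Y_new

# Example usage on a shifted sphere function in d = 20 dimensions.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, N = 20, 200
    f = lambda Z: np.sum((Z - 1.0) ** 2, axis=1)
    grad_f = lambda Z: 2.0 * (Z - 1.0)
    X = rng.uniform(-3, 3, size=(N, d))
    Y = X.copy()
    for _ in range(2000):
        X, Y = cbo_step(X, Y, f, grad_f, rng=rng)
    print("best objective:", f(Y).min())            # should approach 0
```

In the paper itself these forces enter as drift and diffusion terms of a system of stochastic differential equations, and the convergence statement concerns the associated mean-field law rather than this simple discretization.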
Related papers
- MACE: An Efficient Model-Agnostic Framework for Counterfactual
Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - AGGLIO: Global Optimization for Locally Convex Functions [5.221860952360943]
This paper presents AGGLIO, a stage-wise, graduated optimization technique that offers provable global convergence for problems whose objectives are only locally convex.
AGGLIO can be readily implemented using point as well as mini-batch SGD updates and offers provable convergence, supported by experiments.
arXiv Detail & Related papers (2021-11-06T18:15:56Z) - Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of energy-based models.
PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients [99.13839450032408]
It is desirable to design a universal framework for adaptive algorithms that solves general problems.
In particular, our novel framework provides convergence analysis support for adaptive gradient methods in the nonconvex setting.
arXiv Detail & Related papers (2021-06-15T15:16:28Z) - Optimization-Inspired Learning with Architecture Augmentations and
Control Mechanisms for Low-Level Vision [74.9260745577362]
This paper proposes a unified optimization-inspired learning framework to aggregate Generative, Discriminative, and Corrective (GDC) principles.
We construct three propagative modules to effectively solve the optimization models with flexible combinations.
Experiments across varied low-level vision tasks validate the efficacy and adaptability of GDC.
arXiv Detail & Related papers (2020-12-10T03:24:53Z) - Efficient Learning of Generative Models via Finite-Difference Score
Matching [111.55998083406134]
We present a generic strategy to efficiently approximate any-order directional derivative with finite difference.
Our approximation only involves function evaluations, which can be executed in parallel, and no gradient computations.
arXiv Detail & Related papers (2020-07-07T10:05:01Z) - Multi-Fidelity Bayesian Optimization via Deep Neural Networks [19.699020509495437]
In many applications, the objective function can be evaluated at multiple fidelities to enable a trade-off between the cost and accuracy.
We propose Deep Neural Network Multi-Fidelity Bayesian Optimization (DNN-MFBO) that can flexibly capture all kinds of complicated relationships between the fidelities.
We show the advantages of our method in both synthetic benchmark datasets and real-world applications in engineering design.
arXiv Detail & Related papers (2020-07-06T23:28:40Z) - Causal Bayesian Optimization [8.958125394444679]
We study the problem of globally optimizing a variable of interest that is part of a causal model in which a sequence of interventions can be performed.
Our approach combines ideas from causal inference, uncertainty quantification and sequential decision making.
We show how knowing the causal graph significantly improves the ability to reason about optimal decision making strategies.
arXiv Detail & Related papers (2020-05-24T13:20:50Z) - Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast
Convergence [30.393999722555154]
We propose a variant of the classical Polyak step-size (Polyak, 1987) commonly used in the subgradient method.
The proposed stochastic Polyak step-size (SPS) is an attractive choice for setting the learning rate of stochastic gradient descent (see the sketch after this list).
arXiv Detail & Related papers (2020-02-24T20:57:23Z)