Learning Complexity of Simulated Annealing
- URL: http://arxiv.org/abs/2003.02981v2
- Date: Mon, 29 Jun 2020 21:17:37 GMT
- Title: Learning Complexity of Simulated Annealing
- Authors: Avrim Blum, Chen Dan, Saeed Seddighin
- Abstract summary: We study the criterion under which the temperature changes, namely the cooling schedule.
We provide positive results both in terms of sample complexity and simulation complexity.
We present polynomial-time algorithms with provable guarantees for the learning problem.
- Score: 23.678288517578764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Simulated annealing is an effective and general means of optimization. It is
inspired by metallurgy, where the temperature of a material determines its
thermodynamic behavior. Likewise, in simulated annealing, the actions that the
algorithm takes depend entirely on the value of a variable that captures the
notion of temperature. Typically, simulated annealing starts with a high
temperature, which makes the algorithm's behavior largely random, and gradually
cools the temperature down so that the algorithm becomes more stable.
A key component that plays a crucial role in the performance of simulated
annealing is the criterion under which the temperature changes, namely the
cooling schedule. Motivated by this, we study the following question in this
work: "Given enough samples from the instances of a specific class of
optimization problems, can we design optimal (or approximately optimal) cooling
schedules that minimize the runtime or maximize the success rate of the
algorithm on average when the underlying problem is drawn uniformly at random
from the same class?"
We provide positive results both in terms of sample complexity and simulation
complexity. For sample complexity, we show that $\tilde O(\sqrt{m})$ samples
suffice to find an approximately optimal cooling schedule of length $m$. We
complement this result by giving a lower bound of $\tilde \Omega(m^{1/3})$ on
the sample complexity of any learning algorithm that provides an almost optimal
cooling schedule. These results are general and rely on no assumptions. For
simulation complexity, however, we make additional assumptions to measure the
success rate of an algorithm. To this end, we introduce the monotone stationary
graph that models the performance of simulated annealing. Based on this model,
we present polynomial time algorithms with provable guarantees for the learning
problem.
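To make the role of the cooling schedule concrete, the following is a minimal sketch, not the paper's algorithm or its monotone stationary graph model: a geometric cooling schedule, a standard Metropolis-style simulated annealing loop, and a naive selection of the best candidate schedule by averaging outcomes over sampled instances, in the spirit of the learning question above. The function names (`geometric_schedule`, `simulated_annealing`, `pick_schedule`), the toy objective family, and the candidate schedules are illustrative assumptions, not from the paper.

```python
import math
import random

def geometric_schedule(t0, alpha, m):
    """A common heuristic cooling schedule of length m: T_i = t0 * alpha**i."""
    return [t0 * alpha ** i for i in range(m)]

def simulated_annealing(objective, neighbor, x0, schedule, rng):
    """Minimize `objective`; worse moves are accepted with probability exp(-delta/T)."""
    x, fx = x0, objective(x0)
    best_x, best_f = x, fx
    for temp in schedule:
        y = neighbor(x, rng)
        fy = objective(y)
        delta = fy - fx
        # Always accept improvements; accept worse moves with Boltzmann probability,
        # which shrinks as the temperature cools.
        if delta <= 0 or rng.random() < math.exp(-delta / max(temp, 1e-12)):
            x, fx = y, fy
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f

def pick_schedule(instances, candidate_schedules, neighbor, rng, trials=5):
    """Naive learner: pick the candidate schedule with the best average final value
    over the sampled instances (a stand-in for the learning problem in the paper)."""
    def avg_value(schedule):
        total = 0.0
        for objective, x0 in instances:
            for _ in range(trials):
                total += simulated_annealing(objective, neighbor, x0, schedule, rng)[1]
        return total / (len(instances) * trials)
    return min(candidate_schedules, key=avg_value)

if __name__ == "__main__":
    rng = random.Random(0)
    # Toy instance family: 1-D multimodal objectives with random phase shifts.
    def make_instance(shift):
        return (lambda x: (x - 2.0) ** 2 + math.sin(5.0 * (x + shift)), 0.0)
    instances = [make_instance(rng.uniform(0.0, math.pi)) for _ in range(10)]
    neighbor = lambda x, r: x + r.gauss(0.0, 0.5)
    candidates = [geometric_schedule(t0, a, 150) for t0 in (1.0, 5.0) for a in (0.95, 0.99)]
    best = pick_schedule(instances, candidates, neighbor, rng)
    print("chosen schedule: T0 =", best[0], ", cooling ratio =", best[1] / best[0])
```

The paper's results concern how many sampled instances are needed before such an empirically chosen schedule is approximately optimal for the whole class; the brute-force candidate search above is only for illustration.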
Related papers
- Quantum Simulation-Based Optimization of a Cooling System [0.0]
Quantum algorithms promise up to exponential speedups for specific tasks relevant to numerical simulations.
However, these advantages quickly vanish when considering data input and output on quantum computers.
The recently introduced Quantum Simulation-Based Optimization (QuSO) treats simulations as subproblems within a larger optimization problem.
arXiv Detail & Related papers (2025-04-21T21:58:21Z) - Parallel simulation for sampling under isoperimetry and score-based diffusion models [56.39904484784127]
As data size grows, reducing the iteration cost becomes an important goal.
Inspired by the success of the parallel simulation of the initial value problem in scientific computation, we propose parallel Picard methods for sampling tasks.
Our work highlights the potential advantages of simulation methods in scientific computation for dynamics-based sampling and diffusion models.
arXiv Detail & Related papers (2024-12-10T11:50:46Z) - Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Not All Semantics are Created Equal: Contrastive Self-supervised
Learning with Automatic Temperature Individualization [51.41175648612714]
We propose a new robust contrastive loss inspired by distributionally robust optimization (DRO).
We show that our algorithm automatically learns a suitable $\tau$ for each sample.
Our method outperforms prior strong baselines on unimodal and bimodal datasets.
arXiv Detail & Related papers (2023-05-19T19:25:56Z) - Uniform Generalization Bound on Time and Inverse Temperature for
Gradient Descent Algorithm and its Application to Analysis of Simulated
Annealing [0.0]
We propose a novel uniform generalization bound on the time and inverse temperature for gradient Langevin dynamics.
As an application, an evaluation of the effectiveness of simulated annealing is presented.
arXiv Detail & Related papers (2022-05-25T01:21:27Z) - An Algebraically Converging Stochastic Gradient Descent Algorithm for
Global Optimization [14.336473214524663]
A key component in the algorithm is the randomness based on the value of the objective function.
We prove the convergence of the algorithm with an algebraic rate under suitable tuning of the parameters.
We present several numerical examples to demonstrate the efficiency and robustness of the algorithm.
arXiv Detail & Related papers (2022-04-12T16:27:49Z) - Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z) - Adaptive Sequential SAA for Solving Two-stage Stochastic Linear Programs [1.6181085766811525]
We present adaptive sequential SAA (sample average approximation) algorithms to solve large-scale two-stage linear programs.
The proposed algorithm can be stopped in finite time to return a solution endowed with a probabilistic guarantee on quality.
arXiv Detail & Related papers (2020-12-07T14:58:16Z) - A Simulated Annealing Algorithm for Joint Stratification and Sample
Allocation Designs [0.0]
This study combines simulated annealing with delta evaluation to solve the joint stratification and sample allocation problem.
In this problem, atomic strata are partitioned into mutually exclusive and collectively exhaustive strata.
The Bell number of possible solutions is enormous even for a moderate number of atomic strata, and the evaluation time of each solution adds a further layer of complexity.
arXiv Detail & Related papers (2020-11-25T20:27:49Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z) - CoolMomentum: A Method for Stochastic Optimization by Langevin Dynamics
with Simulated Annealing [23.87373187143897]
Deep learning applications require global optimization of non-convex objective functions, which have multiple local minima.
We show that a coordinate learning algorithm can be used to resolve the same problem in physical simulations.
arXiv Detail & Related papers (2020-05-29T14:44:24Z) - Active Model Estimation in Markov Decision Processes [108.46146218973189]
We study the problem of efficient exploration in order to learn an accurate model of an environment, modeled as a Markov decision process (MDP).
We show that our Markov-based algorithm outperforms both our original algorithm and the maximum entropy algorithm in the small sample regime.
arXiv Detail & Related papers (2020-03-06T16:17:24Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper, we analyze a variant of an optimistic adaptive gradient algorithm for nonconcave min-max problems.
Our experiments show that the predicted advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z) - A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.