Related papers: A Dynamical System View of Langevin-Based Non-Convex Sampling

URL: http://arxiv.org/abs/2210.13867v1
Date: Tue, 25 Oct 2022 09:43:36 GMT
Title: A Dynamical System View of Langevin-Based Non-Convex Sampling
Authors: Mohammad Reza Karimi, Ya-Ping Hsieh, Andreas Krause
Abstract summary: Non-convex sampling is a key challenge in machine learning, central to non-convex optimization in deep learning as well as to approximate probabilistic inference. Existing guarantees typically only hold for the averaged iterates rather than the more desirable last iterates. We develop a new framework that lifts the above issues by harnessing several tools from the theory of dynamical systems.
Score: 84.61544861851907
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Non-convex sampling is a key challenge in machine learning, central to non-convex optimization in deep learning as well as to approximate probabilistic inference. Despite its significance, theoretically there remain many important challenges: Existing guarantees (1) typically only hold for the averaged iterates rather than the more desirable last iterates, (2) lack convergence metrics that capture the scales of the variables such as Wasserstein distances, and (3) mainly apply to elementary schemes such as stochastic gradient Langevin dynamics. In this paper, we develop a new framework that lifts the above issues by harnessing several tools from the theory of dynamical systems. Our key result is that, for a large class of state-of-the-art sampling schemes, their last-iterate convergence in Wasserstein distances can be reduced to the study of their continuous-time counterparts, which is much better understood. Coupled with standard assumptions of MCMC sampling, our theory immediately yields the last-iterate Wasserstein convergence of many advanced sampling schemes such as proximal, randomized mid-point, and Runge-Kutta integrators. Beyond existing methods, our framework also motivates more efficient schemes that enjoy the same rigorous guarantees.
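As a rough illustration of the kind of scheme the abstract refers to, the sketch below runs the unadjusted Langevin algorithm (the Euler-Maruyama discretization of the Langevin diffusion, using exact rather than stochastic gradients) on a one-dimensional non-convex target and keeps only the last iterate of each chain. It is not taken from the paper: the double-well potential, the step size, the number of chains, and the function names are all assumptions chosen for illustration.

```python
import numpy as np


def grad_potential(x):
    # Gradient of the (assumed) double-well potential V(x) = (x^2 - 1)^2 / 4,
    # which is non-convex with minima at x = -1 and x = +1.
    return x * (x**2 - 1.0)


def langevin_last_iterates(n_chains=2000, n_steps=5000, step=0.01, seed=0):
    # Run n_chains independent chains in parallel and return the last iterate
    # of each one; the target density is proportional to exp(-V(x)).
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_chains)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_chains)
        # One Euler-Maruyama step of dX = -grad V(X) dt + sqrt(2) dB.
        x = x - step * grad_potential(x) + np.sqrt(2.0 * step) * noise
    return x


samples = langevin_last_iterates()
print("mean:", samples.mean(), "second moment:", (samples**2).mean())
```

Because the target is symmetric and bimodal, the empirical mean of the last iterates should sit near zero. Comparing their histogram with exp(-V(x)) gives an informal check of last-iterate behavior, but not a Wasserstein bound.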

Related papers

Beyond Discreteness: Finite-Sample Analysis of Straight-Through Estimator for Quantization [2.8948274245812327]
This work presents the first finite-sample analysis of the straight-through estimator (STE) in the context of neural network quantization. Specifically, by analyzing the quantization-aware training of a two-layer neural network with binary weights and activations, we derive the sample complexity bound in terms of the data dimensionality. In the presence of label noise, we uncover an intriguing recurrence property of the STE-gradient method, where the iterate repeatedly escapes from and returns to the optimal binary weights.
arXiv Detail & Related papers (2025-05-23T17:11:22Z)
Nonlinear Principal Component Analysis with Random Bernoulli Features for Process Monitoring [4.910232130159309]
We propose a novel random Bernoulli principal component analysis method to efficiently capture nonlinear patterns in the process. We design four fast process monitoring methods based on random Bernoulli principal component analysis to extend its nonlinear capabilities.
arXiv Detail & Related papers (2025-03-16T11:20:27Z)
A sparse PAC-Bayesian approach for high-dimensional quantile prediction [0.0]
This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction. It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation. Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.
arXiv Detail & Related papers (2024-09-03T08:01:01Z)
Efficient, Multimodal, and Derivative-Free Bayesian Inference With Fisher-Rao Gradient Flows [10.153270126742369]
We study efficient approximate sampling for probability distributions known up to normalization constants. We specifically focus on a problem class arising in Bayesian inference for large-scale inverse problems in science and engineering applications.
arXiv Detail & Related papers (2024-06-25T04:07:22Z)
Non-Stationary Long-Term Dynamics via Selected Incomplete Dual Bases [0.0]
We propose an SU(2) coherent state basis and derive equations of motion for both time-independent and time-dependent Hamiltonians. We evaluate this method through numerical simulations of a seven-qubit system. Our conclusion suggests that the selected incomplete dual basis method can efficiently capture both short-term and long-term dynamics.
arXiv Detail & Related papers (2023-06-12T20:21:29Z)
PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium [58.26573117273626]
We consider the non-AL equilibrium nonconptotic objective function in two-player zero-sum continuous games. Our novel insights into the particle-based algorithms for continuous distribution strategies are presented.
arXiv Detail & Related papers (2023-03-02T05:08:15Z)
Quantum open system identification via global optimization: Optimally accurate Markovian models of open systems from time-series data [2.0971479389679333]
We show how to identify quantum systems using open data. We show that optimization using moment/sum-of-squares approaches can provide accurate damping system.
arXiv Detail & Related papers (2022-03-31T16:38:08Z)
Optimization on manifolds: A symplectic approach [127.54402681305629]
We propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems. Our class of (accelerated) algorithms is not only simple and efficient but also applicable to a broad range of contexts.
arXiv Detail & Related papers (2021-07-23T13:43:34Z)
High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability. Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level. We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
Nonlinear Two-Time-Scale Stochastic Approximation: Convergence and Finite-Time Performance [1.52292571922932]
We study the convergence and finite-time performance of nonlinear two-time-scale stochastic approximation. In particular, we show that the method converges in expectation at a rate $\mathcal{O}(1/k^{2/3})$, where $k$ is the number of iterations.
arXiv Detail & Related papers (2020-11-03T17:43:39Z)
Achieving fast high-fidelity optimal control of many-body quantum dynamics [0.0]
We demonstrate the efficiency of a recent exact-gradient optimal control methodology by applying it to a challenging many-body problem. We observe fidelities in the range 0.99-0.9999 with associated minimal process duration estimates. Overall, the comparison suggests significant methodological improvements also for many-body systems in the ideal open-loop setting.
arXiv Detail & Related papers (2020-08-13T18:30:24Z)
Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to deep learning (e.g., deep AUC).
arXiv Detail & Related papers (2020-06-12T00:32:21Z)
On dissipative symplectic integration with applications to gradient-based optimization [77.34726150561087]
We propose a geometric framework in which discretizations can be realized systematically. We show that a generalization of symplectic integrators to nonconservative and in particular dissipative Hamiltonian systems is able to preserve rates of convergence up to a controlled error.
arXiv Detail & Related papers (2020-04-15T00:36:49Z)
