Semi-discrete optimization through semi-discrete optimal transport: a
framework for neural architecture search
- URL: http://arxiv.org/abs/2006.15221v2
- Date: Sun, 30 Jan 2022 21:34:32 GMT
- Title: Semi-discrete optimization through semi-discrete optimal transport: a
framework for neural architecture search
- Authors: Nicolas Garcia Trillos, Javier Morales
- Abstract summary: We introduce a theoretical framework for semi-discrete optimization using ideas from optimal transport.
Our primary motivation is in the field of deep learning, and specifically in the task of neural architecture search.
In a companion paper we show that algorithms inspired by our framework are competitive with contemporaneous methods.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we introduce a theoretical framework for semi-discrete
optimization using ideas from optimal transport. Our primary motivation is in
the field of deep learning, and specifically in the task of neural architecture
search. With this aim in mind, we discuss the geometric and theoretical
motivation for new techniques for neural architecture search (in a companion
paper we show that algorithms inspired by our framework are competitive with
contemporaneous methods). We introduce a Riemannian-like metric on the space of
probability measures over a semi-discrete space $\mathbb{R}^d \times
$\mathcal{G}$, where $\mathcal{G}$ is a finite weighted graph. With this
Riemannian structure in hand, we derive formal expressions for the gradient
flow of a relative entropy functional, as well as second order dynamics for the
optimization of said energy. Then, with the aim of providing a rigorous
motivation for the gradient flow equations derived formally, we also consider
an iterative procedure known as the minimizing movement scheme (i.e., the
implicit Euler scheme, or JKO scheme) and apply it to the relative entropy with respect
to a suitable cost function. For some specific choices of metric and cost, we
rigorously show that the minimizing movement scheme of the relative entropy
functional converges to the gradient flow process provided by the formal
Riemannian structure. This flow coincides with a system of reaction-diffusion
equations on $\mathbb{R}^d$.
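For orientation, one step of the minimizing movement (JKO) scheme for the relative entropy has the following generic form (a standard sketch; the paper's semi-discrete metric and cost are as defined in the text): given the current iterate $\rho_k$ and a time step $\tau > 0$,
$\rho_{k+1} \in \operatorname{argmin}_{\rho} \{ \frac{1}{2\tau} W(\rho_k, \rho)^2 + H(\rho \mid \pi) \}$,
where $W$ is the transport metric and $H(\cdot \mid \pi)$ is the relative entropy with respect to a reference measure $\pi$. As $\tau \to 0$, the interpolated iterates converge (formally) to the gradient flow of $H(\cdot \mid \pi)$ in the geometry induced by $W$.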
Related papers
- Simultaneous and Meshfree Topology Optimization with Physics-informed Gaussian Processes [0.0]
Topology optimization (TO) provides a principled mathematical approach for optimizing the performance of a structure by designing its spatial material distribution in a pre-defined domain, subject to a set of constraints.
We develop a new class of TO methods based on the framework of Gaussian processes (GPs) whose mean functions are parameterized via deep neural networks.
To test our method against conventional TO approaches implemented in commercial software, we evaluate it on four problems involving the minimization of dissipated power in Stokes flow.
arXiv Detail & Related papers (2024-08-07T01:01:35Z)
- Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold [59.73080197971106]
This paper presents a first-order conjugate optimization method that converges faster than the steepest descent method.
It aims to achieve global convergence over the Stiefel manifold.
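For context, a generic (centralized) Riemannian conjugate gradient iteration takes the textbook form
$x_{k+1} = R_{x_k}(\alpha_k \eta_k)$, $\eta_{k+1} = -\operatorname{grad} f(x_{k+1}) + \beta_k \mathcal{T}_{x_k \to x_{k+1}}(\eta_k)$,
where $R$ is a retraction on the Stiefel manifold, $\mathcal{T}$ is a vector transport, and $\beta_k$ is a conjugacy parameter; the decentralized variant in the cited paper additionally couples these iterates across agents.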
arXiv Detail & Related papers (2023-08-21T08:02:16Z)
- Isotropic Gaussian Processes on Finite Spaces of Graphs [71.26737403006778]
We propose a principled way to define Gaussian process priors on various sets of unweighted graphs.
We go further to consider sets of equivalence classes of unweighted graphs and define the appropriate versions of priors thereon.
Inspired by applications in chemistry, we illustrate the proposed techniques on a real molecular property prediction task in the small data regime.
arXiv Detail & Related papers (2022-11-03T10:18:17Z)
- Manifold Free Riemannian Optimization [4.484251538832438]
A principled framework for solving optimization problems with a smooth manifold $\mathcal{M}$ is proposed.
We use a noiseless sample set of the cost function $(x_i, y_i) \in \mathcal{M} \times \mathbb{R}$ and the intrinsic dimension of the manifold $\mathcal{M}$.
arXiv Detail & Related papers (2022-09-07T16:19:06Z)
- The Dynamics of Riemannian Robbins-Monro Algorithms [101.29301565229265]
We propose a family of Riemannian algorithms generalizing and extending the seminal stochastic approximation framework of Robbins and Monro.
Compared to their Euclidean counterparts, Riemannian algorithms are much less understood due to the lack of a global linear structure on the manifold.
We provide a general template of almost sure convergence results that mirrors and extends the existing theory for Euclidean Robbins-Monro schemes.
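For orientation, a generic Riemannian Robbins-Monro iterate replaces the Euclidean update $x_{k+1} = x_k - \gamma_k \hat{g}_k$ with an exponential-map (or retraction) step (a standard sketch, not necessarily the paper's exact template):
$x_{k+1} = \exp_{x_k}(-\gamma_k \hat{g}_k)$, with $\sum_k \gamma_k = \infty$ and $\sum_k \gamma_k^2 < \infty$,
where $\hat{g}_k$ is a noisy estimate of the Riemannian gradient at $x_k$.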
arXiv Detail & Related papers (2022-06-14T12:30:11Z)
- Optimization on manifolds: A symplectic approach [127.54402681305629]
We propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems.
Our class of (accelerated) algorithms is not only simple and efficient but also applicable to a broad range of contexts.
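As a minimal illustration of dissipative Hamiltonian dynamics for optimization (a standard example, not the paper's constrained Dirac formulation): taking $H(q, p) = f(q) + \frac{1}{2}\|p\|^2$ with friction $\gamma > 0$ gives
$\dot{q} = p$, $\dot{p} = -\nabla f(q) - \gamma p$,
so that $\frac{d}{dt} H(q, p) = -\gamma \|p\|^2 \le 0$ and trajectories dissipate energy toward critical points of $f$.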
arXiv Detail & Related papers (2021-07-23T13:43:34Z)
- A Contraction Theory Approach to Optimization Algorithms from Acceleration Flows [1.90365714903665]
We use contraction theory to provide a principled methodology to design and discretize appropriate ODEs.
We propose a novel system of ODEs, namely the Accelerated-Contracting-Nesterov flow.
Remarkably, a simple explicit Euler discretization of this flow corresponds to the Nesterov acceleration method.
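As a hedged illustration of the ODE-to-algorithm step (the Accelerated-Contracting-Nesterov flow itself is defined in the cited paper), the classical continuous-time model of Nesterov acceleration is $\ddot{X} + \frac{3}{t}\dot{X} + \nabla f(X) = 0$, and a plain explicit Euler discretization of its first-order form already produces momentum-style updates:

import numpy as np

def euler_nesterov_flow(grad_f, x0, t0=1.0, h=1e-2, steps=2000):
    # Explicit Euler discretization of the first-order system
    #   x' = v,  v' = -(3/t) v - grad_f(x),
    # a classical ODE model of Nesterov acceleration. Illustrative
    # sketch only; not the Accelerated-Contracting-Nesterov flow.
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    t = t0
    for _ in range(steps):
        # Both updates use the state at the start of the step.
        x, v = x + h * v, v - h * ((3.0 / t) * v + grad_f(x))
        t += h
    return x

# Usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x.
x_min = euler_nesterov_flow(lambda x: x, x0=[1.0, -2.0])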
arXiv Detail & Related papers (2021-05-18T21:11:37Z)
- Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
- A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, the Wasserstein gradient flow is a smoother and near-optimal numerical scheme for approximating real data densities.
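As background for this connection (a standard identity rather than a result specific to the cited paper), the Fokker-Planck equation is the Wasserstein gradient flow of the relative entropy $H(\rho \mid \pi)$ with respect to $\pi \propto e^{-V}$:
$\partial_t \rho = \nabla \cdot ( \rho \nabla \log \frac{\rho}{\pi} ) = \Delta \rho + \nabla \cdot (\rho \nabla V)$.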
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.