Semi-discrete optimization through semi-discrete optimal transport: a
framework for neural architecture search
- URL: http://arxiv.org/abs/2006.15221v2
- Date: Sun, 30 Jan 2022 21:34:32 GMT
- Title: Semi-discrete optimization through semi-discrete optimal transport: a
framework for neural architecture search
- Authors: Nicolas Garcia Trillos, Javier Morales
- Abstract summary: We introduce a theoretical framework for semi-discrete optimization using ideas from optimal transport.
Our primary motivation is in the field of deep learning, and specifically in the task of neural architecture search.
In a companion paper we show that algorithms inspired by our framework are competitive with contemporaneous methods.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we introduce a theoretical framework for semi-discrete
optimization using ideas from optimal transport. Our primary motivation is in
the field of deep learning, and specifically in the task of neural architecture
search. With this aim in mind, we discuss the geometric and theoretical
motivation for new techniques for neural architecture search (in a companion
paper we show that algorithms inspired by our framework are competitive with
contemporaneous methods). We introduce a Riemannian-like metric on the space of
probability measures over a semi-discrete space $\mathbb{R}^d \times
$\mathcal{G}$, where $\mathcal{G}$ is a finite weighted graph. With this
Riemannian structure in hand, we derive formal expressions for the gradient
flow of a relative entropy functional, as well as second order dynamics for the
optimization of said energy. Then, with the aim of providing a rigorous
motivation for the gradient flow equations derived formally, we also consider
an iterative procedure known as the minimizing movement scheme (i.e., the
implicit Euler scheme, or JKO scheme) and apply it to the relative entropy with respect
to a suitable cost function. For some specific choices of metric and cost, we
rigorously show that the minimizing movement scheme of the relative entropy
functional converges to the gradient flow process provided by the formal
Riemannian structure. This flow coincides with a system of reaction-diffusion
equations on $\mathbb{R}^d$.
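For orientation, one step of the minimizing movement (JKO) scheme for the relative entropy has the following generic form (a standard sketch; the paper's semi-discrete metric and cost are as defined in the text): given the current iterate $\rho_k$ and a time step $\tau > 0$,
$\rho_{k+1} \in \operatorname{argmin}_{\rho} \{ \frac{1}{2\tau} W(\rho_k, \rho)^2 + H(\rho \mid \pi) \}$,
where $W$ is the transport metric and $H(\cdot \mid \pi)$ is the relative entropy with respect to a reference measure $\pi$. As $\tau \to 0$, the interpolated iterates converge (formally) to the gradient flow of $H(\cdot \mid \pi)$ in the geometry induced by $W$.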
Related papers
- Simultaneous and Meshfree Topology Optimization with Physics-informed Gaussian Processes [0.0]
Topology optimization (TO) provides a principled mathematical approach for optimizing the performance of a structure by designing its spatial material distribution in a pre-defined domain, subject to a set of constraints.
We develop a new class of TO methods based on the framework of Gaussian processes (GPs) whose mean functions are parameterized via deep neural networks.
To test our method against conventional TO approaches implemented in commercial software, we evaluate it on four problems involving the minimization of dissipated power in Stokes flow.
arXiv Detail & Related papers (2024-08-07T01:01:35Z)
- Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold [59.73080197971106]
This paper presents a first-order conjugate optimization method that converges faster than the steepest descent method.
It aims to achieve global convergence over the Stiefel manifold.
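For context, a generic (centralized) Riemannian conjugate gradient iteration takes the textbook form
$x_{k+1} = R_{x_k}(\alpha_k \eta_k)$, $\eta_{k+1} = -\operatorname{grad} f(x_{k+1}) + \beta_k \mathcal{T}_{x_k \to x_{k+1}}(\eta_k)$,
where $R$ is a retraction on the Stiefel manifold, $\mathcal{T}$ is a vector transport, and $\beta_k$ is a conjugacy parameter; the decentralized variant in the cited paper additionally couples these iterates across agents.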
arXiv Detail & Related papers (2023-08-21T08:02:16Z)
- Isotropic Gaussian Processes on Finite Spaces of Graphs [71.26737403006778]
We propose a principled way to define Gaussian process priors on various sets of unweighted graphs.
We go further to consider sets of equivalence classes of unweighted graphs and define the appropriate versions of priors thereon.
Inspired by applications in chemistry, we illustrate the proposed techniques on a real molecular property prediction task in the small data regime.
arXiv Detail & Related papers (2022-11-03T10:18:17Z)
- Manifold Free Riemannian Optimization [4.484251538832438]
A principled framework for solving optimization problems with a smooth manifold $\mathcal{M}$ is proposed.
We use a noiseless sample set of the cost function $(x_i, y_i) \in \mathcal{M} \times \mathbb{R}$ and the intrinsic dimension of the manifold $\mathcal{M}$.
arXiv Detail & Related papers (2022-09-07T16:19:06Z)
- The Dynamics of Riemannian Robbins-Monro Algorithms [101.29301565229265]
We propose a family of Riemannian algorithms generalizing and extending the seminal stochastic approximation framework of Robbins and Monro.
Compared to their Euclidean counterparts, Riemannian algorithms are much less understood due to the lack of a global linear structure on the manifold.
We provide a general template of almost sure convergence results that mirrors and extends the existing theory for Euclidean Robbins-Monro schemes.
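For orientation, a generic Riemannian Robbins-Monro iterate replaces the Euclidean update $x_{k+1} = x_k - \gamma_k \hat{g}_k$ with an exponential-map (or retraction) step (a standard sketch, not necessarily the paper's exact template):
$x_{k+1} = \exp_{x_k}(-\gamma_k \hat{g}_k)$, with $\sum_k \gamma_k = \infty$ and $\sum_k \gamma_k^2 < \infty$,
where $\hat{g}_k$ is a noisy estimate of the Riemannian gradient at $x_k$.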
arXiv Detail & Related papers (2022-06-14T12:30:11Z)
- Optimization on manifolds: A symplectic approach [127.54402681305629]
We propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems.
Our class of (accelerated) algorithms is not only simple and efficient but also applicable to a broad range of contexts.
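As a minimal illustration of dissipative Hamiltonian dynamics for optimization (a standard example, not the paper's constrained Dirac formulation): taking $H(q, p) = f(q) + \frac{1}{2}\|p\|^2$ with friction $\gamma > 0$ gives
$\dot{q} = p$, $\dot{p} = -\nabla f(q) - \gamma p$,
so that $\frac{d}{dt} H(q, p) = -\gamma \|p\|^2 \le 0$ and trajectories dissipate energy toward critical points of $f$.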
arXiv Detail & Related papers (2021-07-23T13:43:34Z)
- A Contraction Theory Approach to Optimization Algorithms from Acceleration Flows [1.90365714903665]
We use contraction theory to provide a principled methodology to design and discretize appropriate ODEs.
We propose a novel system of ODEs, namely the Accelerated-Contracting-Nesterov flow.
Remarkably, a simple explicit Euler discretization of this flow corresponds to the Nesterov acceleration method.
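As a hedged illustration of the ODE-to-algorithm step (the Accelerated-Contracting-Nesterov flow itself is defined in the cited paper), the classical continuous-time model of Nesterov acceleration is $\ddot{X} + \frac{3}{t}\dot{X} + \nabla f(X) = 0$, and a plain explicit Euler discretization of its first-order form already produces momentum-style updates:

import numpy as np

def euler_nesterov_flow(grad_f, x0, t0=1.0, h=1e-2, steps=2000):
    # Explicit Euler discretization of the first-order system
    #   x' = v,  v' = -(3/t) v - grad_f(x),
    # a classical ODE model of Nesterov acceleration. Illustrative
    # sketch only; not the Accelerated-Contracting-Nesterov flow.
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    t = t0
    for _ in range(steps):
        # Both updates use the state at the start of the step.
        x, v = x + h * v, v - h * ((3.0 / t) * v + grad_f(x))
        t += h
    return x

# Usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x.
x_min = euler_nesterov_flow(lambda x: x, x0=[1.0, -2.0])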
arXiv Detail & Related papers (2021-05-18T21:11:37Z)
- Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
- A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, the Wasserstein gradient flow is a smoother and near-optimal numerical scheme for approximating real data densities.
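As background for this connection (a standard identity rather than a result specific to the cited paper), the Fokker-Planck equation is the Wasserstein gradient flow of the relative entropy $H(\rho \mid \pi)$ with respect to $\pi \propto e^{-V}$:
$\partial_t \rho = \nabla \cdot ( \rho \nabla \log \frac{\rho}{\pi} ) = \Delta \rho + \nabla \cdot (\rho \nabla V)$.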
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.