Convex Optimization with an Interpolation-based Projection and its
Application to Deep Learning
- URL: http://arxiv.org/abs/2011.07016v1
- Date: Fri, 13 Nov 2020 16:52:50 GMT
- Title: Convex Optimization with an Interpolation-based Projection and its
Application to Deep Learning
- Authors: Riad Akrour, Asma Atamna, Jan Peters
- Abstract summary: We investigate whether an inexact but cheaper projection can drive a descent algorithm to an optimum.
Specifically, we propose an interpolation-based projection that is computationally cheap and easy to compute given a convex, domain-defining function.
- Score: 36.19092177858517
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convex optimizers have found many applications as differentiable layers
within deep neural architectures. One application of these convex layers is to
project points into a convex set. However, both forward and backward passes of
these convex layers are significantly more expensive to compute than those of a
typical neural network. We investigate in this paper whether an inexact but
cheaper projection can drive a descent algorithm to an optimum. Specifically,
we propose an interpolation-based projection that is computationally cheap and
easy to compute given a convex, domain-defining function. We then propose an
optimization algorithm that follows the gradient of the composition of the
objective and the projection and prove its convergence for linear objectives
and arbitrary convex and Lipschitz domain defining inequality constraints. In
addition to the theoretical contributions, we demonstrate empirically the
practical interest of the interpolation projection when used in conjunction
with neural networks in a reinforcement learning and a supervised learning
setting.
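For intuition, here is a minimal sketch of how such an interpolation-based projection can be realized when a strictly feasible interior point of the feasible set {x : g(x) <= 0} is known: bisect along the segment between that point and the query point to locate the boundary. The function name, the interior-point argument, and the bisection root finder are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def interpolation_projection(y, g, x_interior, tol=1e-8, max_iter=100):
    """Pull an infeasible point y back into {x : g(x) <= 0} by bisection
    along the segment joining a strictly feasible point and y.

    g          -- convex domain-defining function, with g(x_interior) < 0
    x_interior -- strictly feasible interior point (assumed known)
    """
    if g(y) <= 0.0:              # already feasible: nothing to do
        return y
    lo, hi = 0.0, 1.0            # interpolation coefficient toward y
    for _ in range(max_iter):
        t = 0.5 * (lo + hi)
        x = (1.0 - t) * x_interior + t * y
        if g(x) <= 0.0:
            lo = t               # feasible: move closer to y
        else:
            hi = t               # infeasible: back off toward the interior
        if hi - lo < tol:
            break
    return (1.0 - lo) * x_interior + lo * y

# Example: the unit ball via g(x) = ||x||^2 - 1, with the origin as interior point.
y = np.array([3.0, 4.0])
print(interpolation_projection(y, lambda v: v @ v - 1.0, np.zeros(2)))
# -> approximately [0.6, 0.8], i.e. y / ||y||
```

Each call costs only a handful of evaluations of g, whereas an exact Euclidean projection would require solving a constrained quadratic program.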
Related papers
- The Convex Landscape of Neural Networks: Characterizing Global Optima
and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) training objectives are non-convex.
In this paper we examine the use of convex neural network recovery models.
We show that all stationary points of the non-convex objective can be characterized as global optima of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z) - DFWLayer: Differentiable Frank-Wolfe Optimization Layer [33.20550212188619]
Differentiable optimization has received significant attention due to its foundational role in neural-network-based machine learning.
This paper proposes a differentiable layer, named Differentiable Frank-Wolfe Layer (DFWLayer), by rolling out the Frank-Wolfe method.
Experimental results demonstrate that the DFWLayer not only attains competitive accuracy in solutions and gradients but also consistently adheres to constraints.
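As a rough illustration of the rolling-out idea, the sketch below unrolls a fixed number of Frank-Wolfe iterations as ordinary tensor operations so that autograd can differentiate through the trajectory. The names `dfw_layer`, `grad_fn`, and `lmo` are ours, and the l2-ball example is chosen because its linear minimization oracle is itself differentiable; the paper's actual layer may differ.

```python
import torch

def dfw_layer(grad_fn, lmo, x0, steps=50):
    """Unrolled Frank-Wolfe: every iteration is a differentiable tensor op,
    so gradients can be backpropagated through the optimization trajectory."""
    x = x0
    for k in range(steps):
        g = grad_fn(x)
        s = lmo(g)                    # linear minimization oracle, no projection
        gamma = 2.0 / (k + 2.0)       # classic Frank-Wolfe step size
        x = (1.0 - gamma) * x + gamma * s
    return x

# Example: differentiable projection of y onto the unit l2 ball,
# i.e. argmin_x ||x - y||^2 subject to ||x|| <= 1.
y = 3.0 * torch.randn(5)
y.requires_grad_(True)
lmo = lambda g: -g / g.norm()         # LMO of the unit l2 ball
x_star = dfw_layer(lambda x: 2.0 * (x - y), lmo, torch.zeros(5))
x_star.sum().backward()               # gradients w.r.t. y flow through the unrolling
```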
arXiv Detail & Related papers (2023-08-21T15:53:38Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable parts separately, linearizing only the smooth components.
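Written in generic notation, with f smooth, g convex but possibly non-smooth, and C a convex compact set (this is one standard instance of the idea, not necessarily the paper's exact scheme), the linearization step reads

$$ s_k \in \arg\min_{s \in C} \; \langle \nabla f(x_k), s \rangle + g(s), \qquad x_{k+1} = (1 - \gamma_k)\, x_k + \gamma_k\, s_k, $$

so only the smooth part f is replaced by its linear model, while the non-smooth part g enters the subproblem exactly.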
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z) - Delayed Projection Techniques for Linearly Constrained Problems:
Convergence Rates, Acceleration, and Applications [24.763531954075656]
We study a novel class of projection-based algorithms for linearly constrained problems (LCPs).
We propose the delayed projection technique, which calls a projection only once in a while, lowering the projection frequency and improving projection efficiency.
We show that it is feasible to improve projection efficiency in both strongly convex and generally convex cases.
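A caricature of the technique, under our own naming and schedule (not the paper's algorithm): run cheap unconstrained gradient steps and pay for the expensive projection only every few iterations.

```python
import numpy as np

def delayed_projected_gd(grad, project, x0, step=0.05, proj_every=10, iters=1000):
    """Projected gradient descent that invokes the (expensive) projection
    only every `proj_every` iterations rather than after every step."""
    x = np.asarray(x0, dtype=float)
    for t in range(1, iters + 1):
        x = x - step * grad(x)
        if t % proj_every == 0:   # infrequent projection
            x = project(x)
    return project(x)             # ensure the returned point is feasible
```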
arXiv Detail & Related papers (2021-01-05T13:42:41Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a careful design of coordinate importance sampling, the proposed ZO optimization method is efficient both in iteration complexity and in function query cost.
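A generic sketch of a coordinate-wise zeroth-order gradient estimator with importance sampling; the two-point finite-difference scheme and the `probs` weights are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def zo_grad_estimate(f, x, probs, mu=1e-4, samples=8, seed=0):
    """Unbiased estimate of the finite-difference gradient of f at x,
    drawing coordinates according to the importance weights `probs`."""
    rng = np.random.default_rng(seed)
    d = x.size
    g = np.zeros(d)
    for i in rng.choice(d, size=samples, p=probs):
        e = np.zeros(d)
        e[i] = 1.0
        deriv = (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)  # two queries
        g[i] += deriv / (samples * probs[i])  # reweight for unbiasedness
    return g
```

Important coordinates can be given larger entries of `probs`, reducing the variance of the estimate for a fixed query budget.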
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Efficient Methods for Structured Nonconvex-Nonconcave Min-Max
Optimization [98.0595480384208]
We propose a generalization of the extragradient method which converges to a stationary point.
The algorithm applies not only to Euclidean spaces but also to general $\ell_p$-normed finite-dimensional vector spaces.
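For intuition, here is the classical Euclidean extragradient update on which such generalizations build; the paper's $\ell_p$-normed variant replaces the plain gradient step, so this textbook sketch is not the paper's method.

```python
import numpy as np

def extragradient(F, z0, step=0.1, iters=2000):
    """Extragradient for min-max problems, where F(z) stacks
    (grad_x f, -grad_y f): a look-ahead step, then the actual update."""
    z = np.asarray(z0, dtype=float)
    for _ in range(iters):
        z_half = z - step * F(z)   # extrapolation (look-ahead) step
        z = z - step * F(z_half)   # update using the look-ahead operator
    return z

# Example: the bilinear saddle point min_x max_y x*y, whose solution is (0, 0).
F = lambda z: np.array([z[1], -z[0]])
print(extragradient(F, np.array([1.0, 1.0])))  # -> close to [0, 0]
```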
arXiv Detail & Related papers (2020-10-31T21:35:42Z) - Projection-Free Adaptive Gradients for Large-Scale Optimization [22.0439695290991]
Frank-Wolfe algorithms occupy a unique position as they alleviate both computational burdens, querying only approximate first-order information from the objective and replacing projections with a cheaper linear minimization oracle.
We show that our method can improve the performance of adaptive algorithms for constrained optimization.
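One simple way to combine the two ingredients, sketched under our own assumptions (an AdaGrad-style diagonal accumulator feeding the linear minimization oracle; not necessarily the paper's algorithm):

```python
import numpy as np

def adagrad_frank_wolfe(grad, lmo, x0, iters=500, eps=1e-8):
    """Frank-Wolfe whose search direction uses AdaGrad-rescaled gradients;
    feasibility is maintained by the LMO alone, with no projection."""
    x = np.asarray(x0, dtype=float)
    acc = np.zeros_like(x)
    for k in range(iters):
        g = grad(x)
        acc += g * g                        # AdaGrad second-moment accumulator
        s = lmo(g / (np.sqrt(acc) + eps))   # LMO on the preconditioned gradient
        gamma = 2.0 / (k + 2.0)
        x = (1.0 - gamma) * x + gamma * s
    return x
```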
arXiv Detail & Related papers (2020-09-29T15:56:12Z) - The Hidden Convex Optimization Landscape of Two-Layer ReLU Neural
Networks: an Exact Characterization of the Optimal Solutions [51.60996023961886]
We prove that finding all globally optimal two-layer ReLU neural networks can be performed by solving a convex optimization program with cone constraints.
Our analysis is novel, characterizes all optimal solutions, and does not leverage duality-based analysis which was recently used to lift neural network training into convex spaces.
arXiv Detail & Related papers (2020-06-10T15:38:30Z)