Learning Gradients of Convex Functions with Monotone Gradient Networks
- URL: http://arxiv.org/abs/2301.10862v1
- Date: Wed, 25 Jan 2023 23:04:50 GMT
- Title: Learning Gradients of Convex Functions with Monotone Gradient Networks
- Authors: Shreyas Chaudhari, Srinivasa Pranav, José M. F. Moura
- Abstract summary: Gradients of convex functions have critical applications ranging from gradient-based optimization to optimal transport.
Recent works have explored data-driven methods for learning convex objectives, but learning their monotone gradients is seldom studied.
We show that our networks are simpler to train, learn monotone gradient fields more accurately, and use significantly fewer parameters than state-of-the-art methods.
- Score: 5.220940151628734
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: While much effort has been devoted to deriving and studying effective convex
formulations of signal processing problems, the gradients of convex functions
also have critical applications ranging from gradient-based optimization to
optimal transport. Recent works have explored data-driven methods for learning
convex objectives, but learning their monotone gradients is seldom studied. In
this work, we propose Cascaded and Modular Monotone Gradient Networks (C-MGN
and M-MGN respectively), two monotone gradient neural network architectures for
directly learning the gradients of convex functions. We show that our networks
are simpler to train, learn monotone gradient fields more accurately, and use
significantly fewer parameters than state-of-the-art methods. We further
demonstrate their ability to learn optimal transport mappings to augment
driving image data.
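The core idea of directly learning a monotone gradient can be illustrated with a minimal sketch. This is not the paper's exact C-MGN or M-MGN architecture; it is a simplified module whose Jacobian, V Vᵀ + Σₖ Wₖᵀ diag(σ′(Wₖx + bₖ)) Wₖ, is symmetric positive semidefinite by construction (σ = tanh is monotone nondecreasing), so the map is monotone and is the gradient of some convex function:

```python
import numpy as np

def monotone_gradient_module(x, V, Ws, bs):
    """Sketch of a monotone gradient map (illustrative, not the exact M-MGN).

    Jacobian = V @ V.T + sum_k W_k.T @ diag(tanh'(W_k x + b_k)) @ W_k,
    which is symmetric PSD, so the module is the gradient of a convex function.
    """
    out = V @ (V.T @ x)  # PSD linear term
    for W, b in zip(Ws, bs):
        out = out + W.T @ np.tanh(W @ x + b)  # tanh is monotone nondecreasing
    return out
```

Monotonicity can be checked numerically: for any pair of inputs x, y, the map f satisfies (f(x) − f(y))ᵀ(x − y) ≥ 0 regardless of the (random) weights.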
Related papers
- Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks [4.313136216120379]
In this paper, we provide a convergence analysis of implicit gradient descent (IGD) for training over-parametrized two-layer PINNs.
We show that randomly initialized IGD converges to a globally optimal solution at a linear convergence rate.
arXiv Detail & Related papers (2024-07-03T06:10:41Z)
- Gradient Networks [11.930694410868435]
This paper introduces gradient networks (GradNets), neural architectures that parameterize gradients of various function classes.
We provide a GradNet design framework that includes methods for transforming GradNets into gradients of convex functions.
We show that these networks offer efficient parameterizations and outperform popular methods in gradient field learning tasks.
arXiv Detail & Related papers (2024-04-10T21:36:59Z)
- Continuous-Time Meta-Learning with Forward Mode Differentiation [65.26189016950343]
We introduce Continuous Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field.
Treating the learning process as an ODE offers the notable advantage that the length of the trajectory is now continuous.
We show empirically its efficiency in terms of runtime and memory usage, and we illustrate its effectiveness on a range of few-shot image classification problems.
arXiv Detail & Related papers (2022-03-02T22:35:58Z)
- Can we learn gradients by Hamiltonian Neural Networks? [68.8204255655161]
We propose a meta-learner based on ODE neural networks that learns gradients.
We demonstrate that our method outperforms a meta-learner based on LSTM for an artificial task and the MNIST dataset with ReLU activations in the optimizee.
arXiv Detail & Related papers (2021-10-31T18:35:10Z)
- Efficient Differentiable Simulation of Articulated Bodies [89.64118042429287]
We present a method for efficient differentiable simulation of articulated bodies.
This enables integration of articulated body dynamics into deep learning frameworks.
We show that reinforcement learning with articulated systems can be accelerated using gradients provided by our method.
arXiv Detail & Related papers (2021-09-16T04:48:13Z)
- Cogradient Descent for Dependable Learning [64.02052988844301]
We propose a dependable learning framework based on the Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem.
CoGD is introduced to solve bilinear problems when one variable is subject to a sparsity constraint.
It can also be used to decompose the association of features and weights, which further generalizes our method to better train convolutional neural networks (CNNs).
arXiv Detail & Related papers (2021-06-20T04:28:20Z)
- Channel-Directed Gradients for Optimization of Convolutional Neural Networks [50.34913837546743]
We introduce optimization methods for convolutional neural networks that can be used to improve existing gradient-based optimization in terms of generalization error.
We show that defining the gradients along the output channel direction leads to a performance boost, while other directions can be detrimental.
arXiv Detail & Related papers (2020-08-25T00:44:09Z)
- Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems in which one variable is under a sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
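The coupled bilinear update described in the CoGD entries above can be sketched as follows. This is an illustrative synchronous gradient scheme on a hypothetical rank-1 factorization objective, min over (u, v) of ½‖M − uvᵀ‖² + λ‖u‖₁, with a proximal soft-threshold step for the sparse variable; it is not the authors' exact algorithm:

```python
import numpy as np

def cogd_sketch(M, lr=0.01, lam=0.1, iters=500, seed=0):
    """Illustrative coupled gradient descent for a bilinear problem
    (hypothetical setup, not the exact CoGD algorithm):
        min_{u,v} 0.5 * ||M - u v^T||_F^2 + lam * ||u||_1
    Both factors are updated synchronously, each gradient accounting
    for its coupling with the other variable; the sparse factor u
    then takes a proximal soft-threshold step."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    u, v = rng.standard_normal(m), rng.standard_normal(n)
    for _ in range(iters):
        R = M - np.outer(u, v)             # shared residual
        gu, gv = -R @ v, -R.T @ u          # coupled gradients
        u, v = u - lr * gu, v - lr * gv    # synchronous update
        u = np.sign(u) * np.maximum(np.abs(u) - lr * lam, 0.0)  # soft-threshold
    return u, v
```

Running the sketch on a matrix built from a sparse factor drives the reconstruction residual well below the initial error, which is the behavior the synchronous coupled updates are meant to achieve.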
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.