Gradient Networks
- URL: http://arxiv.org/abs/2404.07361v3
- Date: Sat, 25 Jan 2025 02:05:28 GMT
- Title: Gradient Networks
- Authors: Shreyas Chaudhari, Srinivasa Pranav, José M. F. Moura
- Abstract summary: We provide a comprehensive GradNet design framework for representing gradients of convex functions. We show that GradNets universally approximate gradients of (convex) functions. We also show that monotone GradNets provide efficient parameterizations and outperform existing methods.
- Score: 11.930694410868435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Directly parameterizing and learning gradients of functions has widespread significance, with specific applications in inverse problems, generative modeling, and optimal transport. This paper introduces gradient networks (GradNets): novel neural network architectures that parameterize gradients of various function classes. GradNets exhibit specialized architectural constraints that ensure correspondence to gradient functions. We provide a comprehensive GradNet design framework that includes methods for transforming GradNets into monotone gradient networks (mGradNets), which are guaranteed to represent gradients of convex functions. Our results establish that our proposed GradNet (and mGradNet) universally approximate the gradients of (convex) functions. Furthermore, these networks can be customized to correspond to specific spaces of potential functions, including transformed sums of (convex) ridge functions. Our analysis leads to two distinct GradNet architectures, GradNet-C and GradNet-M, and we describe the corresponding monotone versions, mGradNet-C and mGradNet-M. Our empirical results demonstrate that these architectures provide efficient parameterizations and outperform existing methods by up to 15 dB in gradient field tasks and by up to 11 dB in Hamiltonian dynamics learning tasks.
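For intuition only, here is a minimal PyTorch sketch of one way to parameterize a monotone gradient field as the gradient of a sum of convex ridge functions; it is an illustrative assumption on my part, not the paper's exact GradNet-C/M or mGradNet-C/M architecture. With a nondecreasing activation (here the sigmoid, the derivative of the convex softplus), the map x ↦ Wᵀσ(Wx + b) is the gradient of Σᵢ softplus(wᵢᵀx + bᵢ), and its Jacobian Wᵀ diag(σ′(Wx + b)) W is positive semidefinite, so the learned field is monotone by construction.

```python
import torch
import torch.nn as nn

class MonotoneGradientLayer(nn.Module):
    """Sketch: the gradient of f(x) = sum_i softplus(w_i^T x + b_i),
    i.e. grad f(x) = W^T sigmoid(W x + b). Because sigmoid is
    nondecreasing, the Jacobian W^T diag(sigmoid'(Wx+b)) W is PSD,
    so the field is the gradient of a convex potential."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.W = nn.Parameter(torch.randn(hidden, dim) / dim ** 0.5)
        self.b = nn.Parameter(torch.zeros(hidden))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) -> monotone gradient estimate of shape (batch, dim)
        return torch.sigmoid(x @ self.W.T + self.b) @ self.W


# Fit the field to samples of a known convex gradient, purely for illustration:
# the gradient of 0.5*||x||^2 + sum_i softplus(x_i) is x + sigmoid(x).
torch.manual_seed(0)
net = MonotoneGradientLayer(dim=2, hidden=64)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
x = torch.randn(1024, 2)
target = x + torch.sigmoid(x)
for _ in range(500):
    opt.zero_grad()
    loss = ((net(x) - target) ** 2).mean()
    loss.backward()
    opt.step()
```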
Related papers
- GradMetaNet: An Equivariant Architecture for Learning on Gradients [18.350495600116712]
We introduce GradMetaNet, a novel architecture for learning on gradients. We prove universality results for GradMetaNet and show that previous approaches cannot approximate natural gradient-based functions. We then demonstrate GradMetaNet's effectiveness on a diverse set of gradient-based tasks.
arXiv Detail & Related papers (2025-07-02T12:22:39Z) - How to guess a gradient [68.98681202222664]
We show that gradients are more structured than previously thought.
Exploiting this structure can significantly improve gradient-free optimization schemes.
We highlight new challenges in overcoming the large gap between optimizing with exact gradients and guessing the gradients.
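For context, the sketch below implements the standard isotropic baseline that such schemes improve upon: guess the gradient from directional finite differences along random unit directions and take descent steps using the guess. This baseline is my own illustration and is not the structured estimator studied in the paper.

```python
import torch

def guessed_gradient(f, x, num_dirs: int = 8, eps: float = 1e-3) -> torch.Tensor:
    """Isotropic baseline: average directional finite differences along
    random unit directions, scaled to be unbiased for the true gradient."""
    g = torch.zeros_like(x)
    for _ in range(num_dirs):
        v = torch.randn_like(x)
        v = v / v.norm()
        deriv = (f(x + eps * v) - f(x - eps * v)) / (2 * eps)  # directional derivative
        g += deriv * v
    return g * x.numel() / num_dirs  # E[v v^T] = I/d for unit directions


# Minimize a toy quadratic using only guessed gradients (no backprop).
f = lambda z: ((z - 3.0) ** 2).sum()
x = torch.zeros(10)
for _ in range(2000):
    x = x - 0.05 * guessed_gradient(f, x)
print(f(x))  # close to 0
```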
arXiv Detail & Related papers (2023-12-07T21:40:44Z) - Learning Gradients of Convex Functions with Monotone Gradient Networks [5.220940151628734]
Gradients of convex functions have critical applications ranging from gradient-based optimization to optimal transport.
Recent works have explored data-driven methods for learning convex objectives, but learning their monotone gradients is seldom studied.
We show that our networks are simpler to train, learn monotone gradient fields more accurately, and use significantly fewer parameters than state-of-the-art methods.
arXiv Detail & Related papers (2023-01-25T23:04:50Z) - Gradient Gating for Deep Multi-Rate Learning on Graphs [62.25886489571097]
We present Gradient Gating (G$^2$), a novel framework for improving the performance of Graph Neural Networks (GNNs).
Our framework is based on gating the output of GNN layers with a mechanism for multi-rate flow of message passing information across nodes of the underlying graph.
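As a rough illustration of gating a GNN layer's output at node- and channel-dependent rates, the sketch below uses a learned gate driven by how much a node's candidate update differs from its neighbors'; this is a simplified stand-in of my own, not the exact G$^2$ rate mechanism.

```python
import torch
import torch.nn as nn

class GatedGraphLayer(nn.Module):
    """Simplified gated GNN layer: a rate tau in [0, 1] per node and channel
    decides how much of the message-passing update is applied."""

    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Linear(dim, dim)   # message transform
        self.gate = nn.Linear(dim, dim)  # produces per-node, per-channel rates

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (nodes, dim); adj: row-normalized (nodes, nodes) adjacency
        update = torch.relu(adj @ self.msg(x))   # candidate GNN update
        diff = update - adj @ update             # disagreement with neighbors
        tau = torch.sigmoid(self.gate(diff.abs()))
        return (1.0 - tau) * x + tau * update    # multi-rate gated update


# Tiny usage example on a 4-node path graph.
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 1., 0.],
                    [0., 1., 0., 1.],
                    [0., 0., 1., 0.]])
adj = adj / adj.sum(dim=1, keepdim=True)
layer = GatedGraphLayer(dim=8)
print(layer(torch.randn(4, 8), adj).shape)  # torch.Size([4, 8])
```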
arXiv Detail & Related papers (2022-10-02T13:19:48Z) - Gradient Correction beyond Gradient Descent [63.33439072360198]
Gradient correction is arguably the most crucial aspect of training a neural network.
We introduce a framework (GCGD) to perform gradient correction.
Experiment results show that our gradient correction framework can effectively improve gradient quality, reducing training epochs by roughly 20% while also improving network performance.
arXiv Detail & Related papers (2022-03-16T01:42:25Z) - Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent [95.94432031144716]
We propose a unified non-convex optimization framework for the analysis of neural network training.
We show that existing guarantees for networks trained by gradient descent can be unified through this framework.
arXiv Detail & Related papers (2021-06-25T17:45:00Z) - Exploiting Adam-like Optimization Algorithms to Improve the Performance of Convolutional Neural Networks [82.61182037130405]
Stochastic gradient descent (SGD) is the main approach for training deep networks.
In this work, we compare Adam-based variants that use the difference between the present and past gradients.
We have tested ensembles of networks and their fusion with a ResNet50 trained with stochastic gradient descent.
arXiv Detail & Related papers (2021-03-26T18:55:08Z) - Channel-Directed Gradients for Optimization of Convolutional Neural Networks [50.34913837546743]
We introduce optimization methods for convolutional neural networks that can be used to improve existing gradient-based optimization in terms of generalization error.
We show that defining the gradients along the output channel direction leads to a performance boost, while other directions can be detrimental.
arXiv Detail & Related papers (2020-08-25T00:44:09Z) - Gradients as Features for Deep Representation Learning [26.996104074384263]
We address the problem of deep representation learning: the efficient adaptation of a pre-trained deep network to different tasks.
Our key innovation is the design of a linear model that incorporates both the gradients and the activations of the pre-trained network.
We present an efficient algorithm for the training and inference of our model without computing the actual gradient.
arXiv Detail & Related papers (2020-04-12T02:57:28Z) - Gradient Boosting Neural Networks: GrowNet [9.0491536808974]
A novel gradient boosting framework is proposed in which shallow neural networks are employed as "weak learners".
A fully corrective step is incorporated to remedy the pitfall of greedy function approximation in classic gradient boosting decision trees.
The proposed model outperforms state-of-the-art boosting methods on all three tasks across multiple datasets (a generic sketch of this setup follows the list).
arXiv Detail & Related papers (2020-02-19T03:02:52Z)
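As referenced in the GrowNet entry above, the following sketch shows plain gradient boosting with one-hidden-layer networks as weak learners on a toy regression task. It omits GrowNet's fully corrective step and feature augmentation, so treat it as an illustrative assumption about the general setup rather than the proposed method.

```python
import torch
import torch.nn as nn

def shallow_learner(dim: int, hidden: int = 16) -> nn.Module:
    # One-hidden-layer MLP used as a weak learner.
    return nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

def boost(x, y, num_stages: int = 10, shrinkage: float = 0.3):
    """Additive ensemble F(x) = sum_k shrinkage * f_k(x): each weak learner
    regresses the residual (the negative gradient of squared error) left by
    the current ensemble."""
    learners, pred = [], torch.zeros_like(y)
    for _ in range(num_stages):
        residual = y - pred
        f = shallow_learner(x.shape[1])
        opt = torch.optim.Adam(f.parameters(), lr=1e-2)
        for _ in range(300):  # fit the weak learner to the residual
            opt.zero_grad()
            loss = ((f(x).squeeze(-1) - residual) ** 2).mean()
            loss.backward()
            opt.step()
        with torch.no_grad():
            pred = pred + shrinkage * f(x).squeeze(-1)
        learners.append(f)
    return learners, pred

torch.manual_seed(0)
x = torch.rand(512, 2) * 4 - 2
y = torch.sin(x[:, 0]) * torch.cos(x[:, 1])  # toy regression target
learners, fitted = boost(x, y)
print(((fitted - y) ** 2).mean())  # training MSE shrinks as stages are added
```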