RicciNets: Curvature-guided Pruning of High-performance Neural Networks
Using Ricci Flow
- URL: http://arxiv.org/abs/2007.04216v1
- Date: Wed, 8 Jul 2020 15:56:02 GMT
- Title: RicciNets: Curvature-guided Pruning of High-performance Neural Networks
Using Ricci Flow
- Authors: Samuel Glass, Simeon Spasov, Pietro Liò
- Abstract summary: We use the definition of Ricci curvature to remove edges of low importance before mapping the computational graph to a neural network.
We show a reduction of almost $35\%$ in the number of floating-point operations (FLOPs) per pass, with no degradation in performance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A novel method to identify salient computational paths within randomly wired
neural networks before training is proposed. The computational graph is pruned
based on a node mass probability function defined by local graph measures and
weighted by hyperparameters produced by a reinforcement learning-based
controller neural network. We use the definition of Ricci curvature to remove
edges of low importance before mapping the computational graph to a neural
network. We show a reduction of almost $35\%$ in the number of floating-point
operations (FLOPs) per pass, with no degradation in performance. Further, our
method can successfully regularize randomly wired neural networks based on
purely structural properties, and we also find that the favourable
characteristics identified in one network generalise to other networks. Under
similar compression, the method produces networks with better performance than
those pruned by lowest-magnitude weights. To the best of our knowledge, this is
the first work on pruning randomly wired neural networks, as well as the first
to utilize the topological measure of Ricci curvature in the pruning mechanism.
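The pruning criterion can be made concrete with a small sketch. The Python snippet below is an illustration, not the authors' released code: the networkx/scipy tooling, the lazy-random-walk node mass, the ranking direction, and the keep fraction are all assumptions. It computes the Ollivier-Ricci curvature kappa(x, y) = 1 - W1(m_x, m_y) / d(x, y) for every edge of a small randomly wired graph and drops the edges ranked as least important; RicciNets instead derives node mass from local graph measures weighted by controller-produced hyperparameters and evolves the graph under Ricci flow.

```python
# A minimal sketch of curvature-guided edge pruning (not the authors' code).
# For every edge (x, y) it computes Ollivier-Ricci curvature
#     kappa(x, y) = 1 - W1(m_x, m_y) / d(x, y)
# and keeps only the edges ranked as most important.  The lazy-random-walk
# node mass below is an assumed stand-in: RicciNets defines node mass from
# local graph measures weighted by controller-produced hyperparameters.

import networkx as nx
import numpy as np
from scipy.optimize import linprog


def node_mass(G, x, alpha=0.5):
    """Assumed node mass: keep `alpha` at x, spread the rest uniformly
    over the neighbours of x."""
    nbrs = list(G.neighbors(x))
    support = [x] + nbrs
    mass = np.array([alpha] + [(1.0 - alpha) / len(nbrs)] * len(nbrs))
    return support, mass


def w1_distance(G, supp_a, a, supp_b, b):
    """Exact 1-Wasserstein distance between two node distributions,
    with shortest-path lengths as ground cost (solved as a small LP)."""
    cost = np.array([[nx.shortest_path_length(G, u, v) for v in supp_b]
                     for u in supp_a], dtype=float)
    n, m = cost.shape
    A_eq = np.zeros((n + m, n * m))       # transport-plan marginal constraints
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0  # row sums equal a
    for j in range(m):
        A_eq[n + j, j::m] = 1.0           # column sums equal b
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=np.concatenate([a, b]),
                  bounds=(0, None), method="highs")
    return res.fun


def ollivier_ricci_curvature(G, x, y):
    supp_x, m_x = node_mass(G, x)
    supp_y, m_y = node_mass(G, y)
    d_xy = nx.shortest_path_length(G, x, y)
    return 1.0 - w1_distance(G, supp_x, m_x, supp_y, m_y) / d_xy


def prune_low_importance_edges(G, keep_fraction=0.65):
    """Keep a fraction of edges ranked by curvature.  Both the ranking
    direction (high-curvature edges kept) and the keep fraction are
    illustrative assumptions, not the paper's schedule."""
    curvature = {e: ollivier_ricci_curvature(G, *e) for e in G.edges()}
    ranked = sorted(curvature, key=curvature.get, reverse=True)
    kept = ranked[:int(round(keep_fraction * len(ranked)))]
    H = nx.Graph()
    H.add_nodes_from(G.nodes())
    H.add_edges_from(kept)
    return H


if __name__ == "__main__":
    # A small randomly wired graph standing in for a computational graph.
    G = nx.connected_watts_strogatz_graph(n=16, k=4, p=0.5, seed=0)
    H = prune_low_importance_edges(G)
    print(f"edges before: {G.number_of_edges()}, after: {H.number_of_edges()}")
```

Exact W1 is solved here as a small linear program per edge, which is fine at this scale; an entropic (Sinkhorn) approximation is the usual choice when the neighbourhood supports grow.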
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Pruning a neural network using Bayesian inference [1.776746672434207]
Neural network pruning is a highly effective technique aimed at reducing the computational and memory demands of large neural networks.
We present a novel approach to pruning neural networks utilizing Bayesian inference, which can seamlessly integrate into the training procedure.
arXiv Detail & Related papers (2023-08-04T16:34:06Z) - Neural Network Pruning as Spectrum Preserving Process [7.386663473785839]
We identify the close connection between matrix spectrum learning and neural network training for dense and convolutional layers.
We propose a matrix sparsification algorithm tailored for neural network pruning that yields better pruning results.
arXiv Detail & Related papers (2023-07-18T05:39:32Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - A Derivation of Feedforward Neural Network Gradients Using Fréchet
Calculus [0.0]
We present a derivation of the gradients of feedforward neural networks using Fréchet calculus.
We show how our analysis generalizes to more general neural network architectures including, but not limited to, convolutional networks.
arXiv Detail & Related papers (2022-09-27T08:14:00Z) - Consistency of Neural Networks with Regularization [0.0]
This paper proposes a general framework for neural networks with regularization and proves its consistency.
Two types of activation functions are considered: the hyperbolic tangent (Tanh) and the rectified linear unit (ReLU).
arXiv Detail & Related papers (2022-06-22T23:33:39Z) - On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z) - Neuron-based Pruning of Deep Neural Networks with Better Generalization
using Kronecker Factored Curvature Approximation [18.224344440110862]
The proposed algorithm directs the parameters of the compressed model toward a flatter solution by exploring the spectral radius of the Hessian.
Our result shows that it improves the state-of-the-art results on neuron compression.
The method is able to achieve very small networks with only a small loss of accuracy across different neural network models.
arXiv Detail & Related papers (2021-11-16T15:55:59Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)