RicciNets: Curvature-guided Pruning of High-performance Neural Networks
Using Ricci Flow
- URL: http://arxiv.org/abs/2007.04216v1
- Date: Wed, 8 Jul 2020 15:56:02 GMT
- Title: RicciNets: Curvature-guided Pruning of High-performance Neural Networks
Using Ricci Flow
- Authors: Samuel Glass, Simeon Spasov, Pietro Liò
- Abstract summary: We use the definition of Ricci curvature to remove edges of low importance before mapping the computational graph to a neural network.
We show a reduction of almost $35\%$ in the number of floating-point operations (FLOPs) per pass, with no degradation in performance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A novel method to identify salient computational paths within randomly wired
neural networks before training is proposed. The computational graph is pruned
based on a node mass probability function defined by local graph measures and
weighted by hyperparameters produced by a reinforcement learning-based
controller neural network. We use the definition of Ricci curvature to remove
edges of low importance before mapping the computational graph to a neural
network. We show a reduction of almost $35\%$ in the number of floating-point
operations (FLOPs) per pass, with no degradation in performance. Further, our
method can successfully regularize randomly wired neural networks based on
purely structural properties, and we also find that the favourable
characteristics identified in one network generalise to other networks. Under
similar compression, the method produces networks with better performance than
those pruned by lowest-magnitude weights. To the best of our knowledge, this is
the first work on pruning randomly wired neural networks, as well as the first
to utilize the topological measure of Ricci curvature in the pruning mechanism.
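The pruning criterion can be made concrete with a small sketch. The Python snippet below is an illustration, not the authors' released code: the networkx/scipy tooling, the lazy-random-walk node mass, the ranking direction, and the keep fraction are all assumptions. It computes the Ollivier-Ricci curvature kappa(x, y) = 1 - W1(m_x, m_y) / d(x, y) for every edge of a small randomly wired graph and drops the edges ranked as least important; RicciNets instead derives node mass from local graph measures weighted by controller-produced hyperparameters and evolves the graph under Ricci flow.

```python
# A minimal sketch of curvature-guided edge pruning (not the authors' code).
# For every edge (x, y) it computes Ollivier-Ricci curvature
#     kappa(x, y) = 1 - W1(m_x, m_y) / d(x, y)
# and keeps only the edges ranked as most important.  The lazy-random-walk
# node mass below is an assumed stand-in: RicciNets defines node mass from
# local graph measures weighted by controller-produced hyperparameters.

import networkx as nx
import numpy as np
from scipy.optimize import linprog


def node_mass(G, x, alpha=0.5):
    """Assumed node mass: keep `alpha` at x, spread the rest uniformly
    over the neighbours of x."""
    nbrs = list(G.neighbors(x))
    support = [x] + nbrs
    mass = np.array([alpha] + [(1.0 - alpha) / len(nbrs)] * len(nbrs))
    return support, mass


def w1_distance(G, supp_a, a, supp_b, b):
    """Exact 1-Wasserstein distance between two node distributions,
    with shortest-path lengths as ground cost (solved as a small LP)."""
    cost = np.array([[nx.shortest_path_length(G, u, v) for v in supp_b]
                     for u in supp_a], dtype=float)
    n, m = cost.shape
    A_eq = np.zeros((n + m, n * m))       # transport-plan marginal constraints
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0  # row sums equal a
    for j in range(m):
        A_eq[n + j, j::m] = 1.0           # column sums equal b
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=np.concatenate([a, b]),
                  bounds=(0, None), method="highs")
    return res.fun


def ollivier_ricci_curvature(G, x, y):
    supp_x, m_x = node_mass(G, x)
    supp_y, m_y = node_mass(G, y)
    d_xy = nx.shortest_path_length(G, x, y)
    return 1.0 - w1_distance(G, supp_x, m_x, supp_y, m_y) / d_xy


def prune_low_importance_edges(G, keep_fraction=0.65):
    """Keep a fraction of edges ranked by curvature.  Both the ranking
    direction (high-curvature edges kept) and the keep fraction are
    illustrative assumptions, not the paper's schedule."""
    curvature = {e: ollivier_ricci_curvature(G, *e) for e in G.edges()}
    ranked = sorted(curvature, key=curvature.get, reverse=True)
    kept = ranked[:int(round(keep_fraction * len(ranked)))]
    H = nx.Graph()
    H.add_nodes_from(G.nodes())
    H.add_edges_from(kept)
    return H


if __name__ == "__main__":
    # A small randomly wired graph standing in for a computational graph.
    G = nx.connected_watts_strogatz_graph(n=16, k=4, p=0.5, seed=0)
    H = prune_low_importance_edges(G)
    print(f"edges before: {G.number_of_edges()}, after: {H.number_of_edges()}")
```

Exact W1 is solved here as a small linear program per edge, which is fine at this scale; an entropic (Sinkhorn) approximation is the usual choice when the neighbourhood supports grow.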
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Pruning a neural network using Bayesian inference [1.776746672434207]
Neural network pruning is a highly effective technique aimed at reducing the computational and memory demands of large neural networks.
We present a novel approach to pruning neural networks utilizing Bayesian inference, which can seamlessly integrate into the training procedure.
arXiv Detail & Related papers (2023-08-04T16:34:06Z) - Neural Network Pruning as Spectrum Preserving Process [7.386663473785839]
We identify the close connection between matrix spectrum learning and neural network training for dense and convolutional layers.
We propose a matrix sparsification algorithm tailored for neural network pruning that yields better pruning results.
arXiv Detail & Related papers (2023-07-18T05:39:32Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - A Derivation of Feedforward Neural Network Gradients Using Fréchet
Calculus [0.0]
We present a derivation of the gradients of feedforward neural networks using Fréchet calculus.
We show how our analysis generalizes to more general neural network architectures including, but not limited to, convolutional networks.
arXiv Detail & Related papers (2022-09-27T08:14:00Z) - Consistency of Neural Networks with Regularization [0.0]
This paper proposes a general framework for neural networks with regularization and proves its consistency.
Two types of activation functions are considered: the hyperbolic tangent (Tanh) and the rectified linear unit (ReLU).
arXiv Detail & Related papers (2022-06-22T23:33:39Z) - On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z) - Neuron-based Pruning of Deep Neural Networks with Better Generalization
using Kronecker Factored Curvature Approximation [18.224344440110862]
The proposed algorithm directs the parameters of the compressed model toward a flatter solution by exploring the spectral radius of the Hessian.
Our result shows that it improves the state-of-the-art results on neuron compression.
The method is able to achieve very small networks with only a small loss of accuracy across different neural network models.
arXiv Detail & Related papers (2021-11-16T15:55:59Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)