Neuron Merging: Compensating for Pruned Neurons
- URL: http://arxiv.org/abs/2010.13160v1
- Date: Sun, 25 Oct 2020 16:50:26 GMT
- Title: Neuron Merging: Compensating for Pruned Neurons
- Authors: Woojeong Kim, Suhyun Kim, Mincheol Park, Geonseok Jeon
- Abstract summary: Structured network pruning discards whole neurons or filters, leading to accuracy loss.
We propose a novel concept of neuron merging applicable to both fully connected layers and convolution layers.
We achieve an accuracy of 93.16% for VGG-16 on CIFAR-10 while reducing total parameters by 64%, without any fine-tuning.
- Score: 3.441021278275805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Network pruning is widely used to lighten and accelerate neural network
models. Structured network pruning discards whole neurons or filters, leading
to accuracy loss. In this work, we propose a novel concept of neuron merging
applicable to both fully connected layers and convolution layers, which
compensates for the information loss due to the pruned neurons/filters. Neuron
merging starts with decomposing the original weights into two matrices/tensors.
One of them becomes the new weights for the current layer, and the other is
what we name a scaling matrix, guiding the combination of neurons. If the
activation function is ReLU, the scaling matrix can be absorbed into the next
layer under certain conditions, compensating for the removed neurons. We also
propose a data-free and inexpensive method to decompose the weights by
utilizing the cosine similarity between neurons. Compared to the pruned model
with the same topology, our merged model better preserves the output feature
map of the original model; thus, it maintains the accuracy after pruning
without fine-tuning. We demonstrate the effectiveness of our approach over
network pruning for various model architectures and datasets. As an example,
for VGG-16 on CIFAR-10, we achieve an accuracy of 93.16% while reducing total
parameters by 64%, without any fine-tuning. The code can be found here:
https://github.com/friendshipkim/neuron-merging
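
To make the mechanism concrete, the following is a minimal NumPy sketch of the merging step for two consecutive fully connected layers with a ReLU in between. It is an illustration under simplifying assumptions (no biases; hypothetical inputs W1, W2, and keep_idx), not the authors' implementation, which is available in the repository above.

# Minimal sketch of neuron merging for two consecutive fully connected layers
# with a ReLU in between. Illustration only; biases are omitted.
import numpy as np

def merge_fc_layers(W1, W2, keep_idx):
    """W1: (n1, n0) weights of the layer being pruned (one row per neuron).
    W2: (n2, n1) weights of the next layer.
    keep_idx: list of indices of the neurons kept after pruning.
    Returns (Y, W2_merged): the new current-layer weights and the next-layer
    weights with the scaling matrix absorbed."""
    n1 = W1.shape[0]
    Y = W1[keep_idx]                     # new weights for the current layer
    Y_unit = Y / np.linalg.norm(Y, axis=1, keepdims=True)

    # Scaling matrix Z (n1 x p): kept neurons map to themselves; each pruned
    # neuron is routed to its most cosine-similar kept neuron, scaled by the
    # ratio of their weight norms.
    Z = np.zeros((n1, len(keep_idx)))
    for i in range(n1):
        if i in keep_idx:
            Z[i, keep_idx.index(i)] = 1.0
        else:
            w = W1[i]
            cos = Y_unit @ (w / np.linalg.norm(w))
            j = int(np.argmax(cos))      # most similar kept neuron
            Z[i, j] = np.linalg.norm(w) / np.linalg.norm(Y[j])

    # Each row of Z has a single nonnegative entry, so with ReLU
    # relu(Z @ Y @ x) == Z @ relu(Y @ x), and Z can be folded into W2.
    W2_merged = W2 @ Z
    return Y, W2_merged

For an input x, W2_merged @ np.maximum(Y @ x, 0.0) then approximates the original W2 @ np.maximum(W1 @ x, 0.0), which is how the merged model preserves the output feature map of the original model without fine-tuning.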
Related papers
- RelChaNet: Neural Network Feature Selection using Relative Change Scores [0.0]
We introduce RelChaNet, a novel and lightweight feature selection algorithm that uses neuron pruning and regrowth in the input layer of a dense neural network.
Our approach generally outperforms the current state-of-the-art methods, and in particular improves the average accuracy by 2% on the MNIST dataset.
arXiv Detail & Related papers (2024-10-03T09:56:39Z)
- Magnificent Minified Models [0.360953887026184]
This paper concerns itself with the task of taking a large trained neural network and 'compressing' it to be smaller by deleting parameters or entire neurons.
We compare various methods of parameter and neuron selection: dropout-based neuron damage estimation, neuron merging, absolute-value based selection, random selection.
For neuron-level pruning, retraining from scratch did much better in our experiments.
arXiv Detail & Related papers (2023-06-16T21:00:44Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Synaptic Stripping: How Pruning Can Bring Dead Neurons Back To Life [0.0]
We introduce Synaptic Stripping as a means to combat the dead neuron problem.
By automatically removing problematic connections during training, we can regenerate dead neurons.
We conduct several ablation studies to investigate these dynamics as a function of network width and depth.
arXiv Detail & Related papers (2023-02-11T23:55:50Z)
- Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language Understanding [82.46024259137823]
We propose a cross-model comparative loss for a broad range of tasks.
We demonstrate the universal effectiveness of comparative loss through extensive experiments on 14 datasets from 3 distinct NLU tasks.
arXiv Detail & Related papers (2023-01-10T03:04:27Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with quadratic loss function, fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
Our results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the "NTK regime".
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Receding Neuron Importances for Structured Pruning [11.375436522599133]
Structured pruning efficiently compresses networks by identifying and removing unimportant neurons.
We introduce a simple BatchNorm variation with bounded scaling parameters, based on which we design a novel regularisation term that suppresses only neurons with low importance.
We show that neural networks trained this way can be pruned to a larger extent and with less deterioration.
arXiv Detail & Related papers (2022-04-13T14:08:27Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z)
- Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including all information) and is not responsible for any consequences of its use.