Magnificent Minified Models
- URL: http://arxiv.org/abs/2306.10177v1
- Date: Fri, 16 Jun 2023 21:00:44 GMT
- Title: Magnificent Minified Models
- Authors: Rich Harang and Hillary Sanders
- Abstract summary: This paper concerns itself with the task of taking a large trained neural network and 'compressing' it to be smaller by deleting parameters or entire neurons.
We compare various methods of parameter and neuron selection: dropout-based neuron damage estimation, neuron merging, absolute-value based selection, random selection, and OBD (Optimal Brain Damage).
For neuron-level pruning, retraining from scratch did much better in our experiments.
- Score: 0.360953887026184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper concerns itself with the task of taking a large trained neural
network and 'compressing' it to be smaller by deleting parameters or entire
neurons, with minimal decreases in the resulting model accuracy. We compare
various methods of parameter and neuron selection: dropout-based neuron damage
estimation, neuron merging, absolute-value based selection, random selection,
OBD (Optimal Brain Damage). We also compare a variation on the classic OBD
method that slightly outperformed all other parameter and neuron selection
methods in our tests with substantial pruning, which we call OBD-SD. We compare
these methods against quantization of parameters. We also compare these
techniques (all applied to a trained neural network), with neural networks
trained from scratch (random weight initialization) on various pruned
architectures. Our results are only barely consistent with the Lottery Ticket
Hypothesis, in that fine-tuning a parameter-pruned model does slightly better
than retraining a similarly pruned model from scratch with randomly initialized
weights. For neuron-level pruning, retraining from scratch did much better in
our experiments.
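As a rough illustration of two of the selection criteria compared above, the sketch below prunes a small PyTorch model by absolute-value selection and ranks parameters with an OBD-style saliency of the form 0.5 * H_ii * w_i^2. The toy model, the data, and the squared-gradient stand-in for the Hessian diagonal are assumptions made for this sketch; it is not the authors' implementation and does not reproduce their OBD-SD variant.

```python
import torch
import torch.nn as nn

# Toy stand-in for the paper's "large trained neural network" (illustrative only).
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
x, y = torch.randn(256, 20), torch.randint(0, 2, (256,))
loss_fn = nn.CrossEntropyLoss()


def magnitude_prune_(weight: torch.Tensor, frac: float) -> None:
    """Absolute-value based selection: zero the smallest-|w| fraction of weights."""
    k = max(1, int(frac * weight.numel()))
    threshold = weight.abs().flatten().kthvalue(k).values
    with torch.no_grad():
        weight.mul_((weight.abs() > threshold).float())


def obd_saliency(weight: nn.Parameter) -> torch.Tensor:
    """OBD-style saliency s_i = 0.5 * H_ii * w_i^2. The Hessian diagonal is
    approximated with squared gradients here (an assumption for this sketch,
    not the paper's exact estimator or its OBD-SD variant)."""
    loss = loss_fn(model(x), y)
    (grad,) = torch.autograd.grad(loss, weight)
    return 0.5 * grad.pow(2) * weight.detach().pow(2)


# Rank layer-2 parameters by OBD saliency: the lowest-saliency ones go first.
saliency = obd_saliency(model[2].weight)
prune_first = saliency.flatten().argsort()[: saliency.numel() // 2]
print(f"layer-2 parameters OBD would delete first: {prune_first.numel()}")

# Prune 50% of layer-1 weights by absolute value and see how the loss moves.
before = loss_fn(model(x), y).item()
magnitude_prune_(model[0].weight, frac=0.5)
after = loss_fn(model(x), y).item()
print(f"loss before: {before:.4f}  after 50% magnitude pruning: {after:.4f}")
```

The detail that matters most here is how the Hessian diagonal is estimated: OBD computes it with a dedicated second-derivative pass, whereas the squared-gradient proxy above is only a cheap stand-in.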
Related papers
- Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model [43.107778640669544]
Large Language Models (LLMs) are composed of neurons that exhibit various behaviors and roles.
Recent studies have revealed that not all neurons are active across different datasets.
We introduce Neuron-Level Fine-Tuning (NeFT), a novel approach that refines the granularity of parameter training down to the individual neuron.
arXiv Detail & Related papers (2024-03-18T09:55:01Z)
- Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language Understanding [82.46024259137823]
We propose a cross-model comparative loss for a broad range of tasks.
We demonstrate the universal effectiveness of comparative loss through extensive experiments on 14 datasets from 3 distinct NLU tasks.
arXiv Detail & Related papers (2023-01-10T03:04:27Z)
- Supervised Parameter Estimation of Neuron Populations from Multiple Firing Events [3.2826301276626273]
We study an automatic approach to learning the parameters of neuron populations from a training set consisting of pairs of spiking series and parameter labels via supervised learning.
We simulate many neuronal populations at different parameter settings using a neuron model.
We then compare their performance against classical approaches including a genetic search, Bayesian sequential estimation, and a random walk approximate model.
arXiv Detail & Related papers (2022-10-02T03:17:05Z)
- Neuron-based Pruning of Deep Neural Networks with Better Generalization using Kronecker Factored Curvature Approximation [18.224344440110862]
The proposed algorithm directs the parameters of the compressed model toward a flatter solution by exploring the spectral radius of the Hessian.
Our result shows that it improves the state-of-the-art results on neuron compression.
The method is able to achieve very small networks with only a small drop in accuracy across different neural network models.
arXiv Detail & Related papers (2021-11-16T15:55:59Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how the variability in solvers' space can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect neural ODE models in terms of robustness to adversarial attacks.
arXiv Detail & Related papers (2021-03-15T17:26:34Z)
- The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
arXiv Detail & Related papers (2020-12-07T01:20:38Z)
- Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
- Investigation and Analysis of Hyper and Hypo neuron pruning to selectively update neurons during Unsupervised Adaptation [8.845660219190298]
Pruning approaches look for low-salient neurons that contribute less to a model's decision.
This work investigates whether pruning approaches are successful in detecting neurons that are either high-salient (mostly active, or hyper) or low-salient (barely active, or hypo).
It shows that it may be possible to selectively adapt certain neurons (the hyper and the hypo neurons) first, followed by a full-network fine-tuning (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2020-01-06T19:46:57Z)
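Referenced from the hyper/hypo-neuron entry above: the sketch below scores hidden neurons by their mean activation on adaptation data, flags the extremes as hyper or hypo, and uses gradient masking to restrict updates to just those neurons before any later full fine-tune. The layer sizes, quantile thresholds, and masking trick are illustrative assumptions, not the cited paper's saliency measure or adaptation procedure.

```python
import torch
import torch.nn as nn

# Illustrative hidden layer and unlabeled adaptation data (assumptions of this sketch).
torch.manual_seed(0)
hidden = nn.Linear(20, 32)
x = torch.randn(512, 20)

# Salience proxy: mean post-ReLU activation of each hidden neuron on the data.
with torch.no_grad():
    mean_act = torch.relu(hidden(x)).mean(dim=0)  # one score per neuron

# Flag "hyper" (mostly active) and "hypo" (barely active) neurons via quantiles.
hyper = (mean_act >= mean_act.quantile(0.9)).nonzero().flatten()
hypo = (mean_act <= mean_act.quantile(0.1)).nonzero().flatten()
print("hyper neurons:", hyper.tolist())
print("hypo neurons:", hypo.tolist())

# Selective adaptation: mask gradients so only the rows (output neurons) of the
# hidden layer that were flagged receive updates during backpropagation.
selected = torch.cat([hyper, hypo])
mask = torch.zeros_like(hidden.weight)
mask[selected] = 1.0
hidden.weight.register_hook(lambda g: g * mask)
```

With the hook registered, any subsequent backward pass only updates the flagged rows of the hidden layer; removing the hook restores ordinary full-network fine-tuning.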