Investigation and Analysis of Hyper and Hypo neuron pruning to selectively update neurons during Unsupervised Adaptation
- URL: http://arxiv.org/abs/2001.01755v1
- Date: Mon, 6 Jan 2020 19:46:57 GMT
- Title: Investigation and Analysis of Hyper and Hypo neuron pruning to selectively update neurons during Unsupervised Adaptation
- Authors: Vikramjit Mitra and Horacio Franco
- Abstract summary: Pruning approaches look for low-salient neurons that contribute less to a model's decision.
This work investigates whether pruning approaches can detect neurons that are either high-salient (mostly active, or hyper) or low-salient (barely active, or hypo).
It shows that it may be possible to selectively adapt certain neurons (the hyper and hypo neurons) first, followed by full-network fine-tuning.
- Score: 8.845660219190298
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unseen or out-of-domain data can seriously degrade the performance of a
neural network model, indicating the model's failure to generalize to unseen
data. Neural net pruning can not only help to reduce a model's size but can
improve the model's generalization capacity as well. Pruning approaches look
for low-salient neurons that contribute less to a model's decision and hence
can be removed from the model. This work investigates whether pruning
approaches are successful in detecting neurons that are either high-salient
(mostly active or hyper) or low-salient (barely active or hypo), and whether
removal of such neurons can help to improve the model's generalization
capacity. Traditional blind adaptation techniques update either the whole
network or a subset of its layers, but have not explored selectively updating individual
neurons across one or more layers. Focusing on the fully connected layers of a
convolutional neural network (CNN), this work shows that it may be possible to
selectively adapt certain neurons (consisting of the hyper and the hypo
neurons) first, followed by full-network fine-tuning. Using the task of
automatic speech recognition, this work demonstrates how the removal of hyper
and hypo neurons from a model can improve the model's performance on
out-of-domain speech data and how selective neuron adaptation can ensure
improved performance when compared to traditional blind model adaptation.
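To make the two-stage procedure above concrete, here is a minimal sketch of how the hyper and hypo neurons of a fully connected layer could be identified from activation statistics and then selectively adapted before full-network fine-tuning. It assumes a PyTorch-style model; the percentile thresholds and helper names (`mean_activation`, `hyper_hypo_mask`, `selective_adaptation_step`) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the two-stage idea described above:
# 1) rank the neurons of a fully connected layer by their mean activation,
# 2) adapt only the "hyper" (mostly active) and "hypo" (barely active) neurons,
# 3) then fine-tune the whole network as usual.
# Assumes a PyTorch model; thresholds and names are illustrative assumptions.
import torch
import torch.nn as nn


def mean_activation(feature_extractor, fc_layer: nn.Linear, loader, device="cpu"):
    """Average post-ReLU activation of each neuron in fc_layer over a data loader."""
    total, count = 0.0, 0
    with torch.no_grad():
        for x, _ in loader:
            h = torch.relu(fc_layer(feature_extractor(x.to(device))))  # (batch, n_neurons)
            total = total + h.sum(dim=0)
            count += h.shape[0]
    return total / count


def hyper_hypo_mask(mean_act, hypo_q=0.10, hyper_q=0.90):
    """True for neurons whose mean activation falls in the bottom or top tail."""
    lo = torch.quantile(mean_act, hypo_q)
    hi = torch.quantile(mean_act, hyper_q)
    return (mean_act <= lo) | (mean_act >= hi)


def selective_adaptation_step(fc_layer: nn.Linear, mask, loss, lr=1e-4):
    """One gradient step that updates only the selected neurons (rows) of fc_layer.

    In a real setup the remaining parameters would be frozen; here their
    gradients are simply ignored.
    """
    loss.backward()
    with torch.no_grad():
        fc_layer.weight.grad.mul_(mask.unsqueeze(1).float())  # zero grads of unselected rows
        fc_layer.bias.grad.mul_(mask.float())
        fc_layer.weight.add_(fc_layer.weight.grad, alpha=-lr)
        fc_layer.bias.add_(fc_layer.bias.grad, alpha=-lr)
    fc_layer.zero_grad()
```

After a number of such selective steps on the out-of-domain adaptation data, the full network would be unfrozen and fine-tuned, mirroring the two-stage procedure described in the abstract.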
Related papers
- Fast gradient-free activation maximization for neurons in spiking neural networks [5.805438104063613]
We present a framework with an efficient design for such an activation-maximization loop.
We track changes in the optimal stimuli for artificial neurons during training.
This formation of refined optimal stimuli is associated with an increase in classification accuracy.
arXiv Detail & Related papers (2023-12-28T18:30:13Z) - Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data [3.46029409929709]
State-of-the-art systems neuroscience experiments yield large-scale multimodal data, and these data sets require new tools for analysis.
Inspired by the success of large pretrained models in vision and language domains, we reframe the analysis of large-scale, cellular-resolution neuronal spiking data into an autoregressive generation problem.
We first trained Neuroformer on simulated datasets, and found that it both accurately predicted simulated neuronal circuit activity and intrinsically inferred the underlying neural circuit connectivity, including direction.
arXiv Detail & Related papers (2023-10-31T20:17:32Z) - WaLiN-GUI: a graphical and auditory tool for neuron-based encoding [73.88751967207419]
Neuromorphic computing relies on spike-based, energy-efficient communication.
We develop a tool to identify suitable configurations for neuron-based encoding of sample-based data into spike trains.
The WaLiN-GUI is provided open source and with documentation.
arXiv Detail & Related papers (2023-10-25T20:34:08Z) - Neuron to Graph: Interpreting Language Model Neurons at Scale [8.32093320910416]
This paper introduces a novel automated approach designed to scale interpretability techniques across a vast array of neurons within Large Language Models.
We propose Neuron to Graph (N2G), an innovative tool that automatically extracts a neuron's behaviour from the dataset it was trained on and translates it into an interpretable graph.
arXiv Detail & Related papers (2023-05-31T14:44:33Z) - Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language Understanding [82.46024259137823]
We propose a cross-model comparative loss for a broad range of tasks.
We demonstrate the universal effectiveness of comparative loss through extensive experiments on 14 datasets from 3 distinct NLU tasks.
arXiv Detail & Related papers (2023-01-10T03:04:27Z) - Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z) - Improving Spiking Neural Network Accuracy Using Time-based Neurons [0.24366811507669117]
Research on neuromorphic computing systems based on low-power spiking neural networks using analog neurons is in the spotlight.
As technology scales down, analog neurons are difficult to scale, and they suffer from reduced voltage headroom/dynamic range and circuit nonlinearities.
This paper first models the nonlinear behavior of existing current-mirror-based voltage-domain neurons designed in a 28nm process, and shows that SNN inference accuracy can be severely degraded by the neurons' nonlinearity.
We propose a novel neuron that processes incoming spikes in the time domain and greatly improves linearity, thereby improving inference accuracy.
arXiv Detail & Related papers (2022-01-05T00:24:45Z) - Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z) - The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
arXiv Detail & Related papers (2020-12-07T01:20:38Z) - Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
This list is automatically generated from the titles and abstracts of the papers indexed on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.