To update or not to update? Neurons at equilibrium in deep models
- URL: http://arxiv.org/abs/2207.09455v1
- Date: Tue, 19 Jul 2022 08:07:53 GMT
- Title: To update or not to update? Neurons at equilibrium in deep models
- Authors: Andrea Bragagnolo and Enzo Tartaglione and Marco Grangetto
- Abstract summary: Recent advances in deep learning showed that, with some a-posteriori information on fully-trained models, it is possible to match the same performance by simply training a subset of their parameters.
In this work we shift our focus from single parameters to the behavior of the whole neuron, exploiting the concept of neuronal equilibrium (NEq).
The proposed approach has been tested on different state-of-the-art learning strategies and tasks, validating NEq and observing that the neuronal equilibrium depends on the specific learning setup.
- Score: 8.72305226979945
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in deep learning optimization showed that, with some
a-posteriori information on fully-trained models, it is possible to match the
same performance by simply training a subset of their parameters. Such a
discovery has a broad impact from theory to applications, driving the research
towards methods to identify the minimum subset of parameters to train without
look-ahead information exploitation. However, the methods proposed do not match
the state-of-the-art performance, and rely on unstructured sparsely connected
models. In this work we shift our focus from the single parameters to the
behavior of the whole neuron, exploiting the concept of neuronal equilibrium
(NEq). When a neuron is in a configuration at equilibrium (meaning that it has
learned a specific input-output relationship), we can halt its update; on the
contrary, when a neuron is at non-equilibrium, we let its state evolve towards
an equilibrium state, updating its parameters. The proposed approach has been
tested on different state-of-the-art learning strategies and tasks, validating
NEq and observing that the neuronal equilibrium depends on the specific
learning setup.
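The update rule described above (halt a neuron's update once it has settled into a stable input-output relationship, otherwise keep training it) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the equilibrium signal here is a simple cosine similarity between a neuron's outputs on a fixed batch at consecutive epochs, and the `threshold` value is an arbitrary assumption.

```python
import numpy as np

def neuron_similarity(prev_out, curr_out, eps=1e-8):
    """Cosine similarity between a neuron's outputs on a fixed
    validation batch at two consecutive epochs."""
    num = float(np.dot(prev_out, curr_out))
    den = float(np.linalg.norm(prev_out) * np.linalg.norm(curr_out)) + eps
    return num / den

def update_mask(prev_outs, curr_outs, threshold=1e-3):
    """One boolean per neuron: True -> still at non-equilibrium,
    keep updating its parameters; False -> (near) equilibrium, halt it."""
    return np.array([
        abs(1.0 - neuron_similarity(p, c)) > threshold
        for p, c in zip(prev_outs, curr_outs)
    ])

# Two neurons: one has converged (identical outputs across epochs),
# one is still changing its input-output mapping.
prev = [np.array([1.0, 2.0, 3.0]), np.array([1.0, 0.0, -1.0])]
curr = [np.array([1.0, 2.0, 3.0]), np.array([0.5, 0.5, -1.5])]
mask = update_mask(prev, curr)
print(mask.tolist())  # [False, True]: freeze the first, train the second
```

In the actual training loop such a mask would gate the optimizer step for each neuron's parameters, so compute is spent only on neurons whose behavior is still evolving.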
Related papers
- Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model [43.107778640669544]
Large Language Models (LLMs) are composed of neurons that exhibit various behaviors and roles.
Recent studies have revealed that not all neurons are active across different datasets.
We introduce Neuron-Level Fine-Tuning (NeFT), a novel approach that refines the granularity of parameter training down to the individual neuron.
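Neuron-level training granularity of this kind can be illustrated by masking gradient updates per neuron rather than per parameter. The sketch below is a hypothetical NumPy illustration (not NeFT's actual procedure): each row of a weight matrix belongs to one output neuron, and a boolean mask decides which neurons receive the SGD step.

```python
import numpy as np

def masked_grad_step(W, grad_W, neuron_mask, lr=0.1):
    """Apply an SGD step only to the rows of W belonging to selected
    neurons (one row per output neuron); unselected rows stay frozen.
    `neuron_mask` is a boolean vector with one entry per neuron."""
    step = lr * grad_W * neuron_mask[:, None]  # zero the frozen neurons' rows
    return W - step

W = np.ones((3, 2))                    # 3 neurons, 2 inputs each
g = np.full((3, 2), 0.5)               # a dummy gradient
mask = np.array([True, False, True])   # fine-tune neurons 0 and 2 only
W_new = masked_grad_step(W, g, mask)
print(W_new[1].tolist())  # frozen neuron unchanged: [1.0, 1.0]
```

Masking whole rows keeps each neuron's parameter vector intact, which is what distinguishes neuron-level selection from unstructured per-parameter sparsity.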
arXiv Detail & Related papers (2024-03-18T09:55:01Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Supervised Parameter Estimation of Neuron Populations from Multiple Firing Events [3.2826301276626273]
We study an automatic approach to learning the parameters of neuron populations from a training set consisting of pairs of spiking series and parameter labels via supervised learning.
We simulate many neuronal populations at different parameter settings using a neuron model.
We then compare their performance against classical approaches including a genetic search, Bayesian sequential estimation, and a random walk approximate model.
arXiv Detail & Related papers (2022-10-02T03:17:05Z)
- Modeling Implicit Bias with Fuzzy Cognitive Maps [0.0]
This paper presents a Fuzzy Cognitive Map model to quantify implicit bias in structured datasets.
We introduce a new reasoning mechanism equipped with a normalization-like transfer function that prevents neurons from saturating.
arXiv Detail & Related papers (2021-12-23T17:04:12Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- On the Evolution of Neuron Communities in a Deep Learning Architecture [0.7106986689736827]
This paper examines the neuron activation patterns of deep learning-based classification models.
We show that both the community quality (modularity) and entropy are closely related to the deep learning models' performances.
arXiv Detail & Related papers (2021-06-08T21:09:55Z)
- The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
arXiv Detail & Related papers (2020-12-07T01:20:38Z)
- Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
- Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD [105.99301967452334]
We characterize the learning dynamics of neural networks trained with normalization, weight decay (WD), and SGD with momentum, which we name Spherical Motion Dynamics (SMD).
We verify our assumptions and theoretical results on various computer vision tasks including ImageNet and MSCOCO with standard settings.
arXiv Detail & Related papers (2020-06-15T14:16:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.