Learning credit assignment
- URL: http://arxiv.org/abs/2001.03354v2
- Date: Sat, 3 Oct 2020 09:35:22 GMT
- Title: Learning credit assignment
- Authors: Chan Li and Haiping Huang
- Abstract summary: It is unknown how learning coordinates a huge number of parameters to achieve decision making.
We propose a mean-field learning model by assuming that an ensemble of sub-networks is trained for a classification task.
Our model learns the credit assignment leading to the decision, and predicts an ensemble of sub-networks that can accomplish the same task.
- Score: 2.0711789781518752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has achieved impressive prediction accuracies in a
variety of scientific and industrial domains. However, the nested non-linear
structure of deep learning makes the learning highly non-transparent, i.e., it
is still unknown how learning coordinates a huge number of parameters to
achieve decision making. To explain this hierarchical credit assignment, we
propose a mean-field learning model by assuming that an ensemble of
sub-networks, rather than a single network, is trained for a classification
task. Surprisingly, our model reveals that apart from some deterministic
synaptic weights connecting two neurons at neighboring layers, a large number
of connections can be absent entirely, while others allow for a broad
distribution of their weight values. Synaptic connections can therefore be
classified into three categories: very important ones, unimportant ones, and
variable ones that may partially encode nuisance factors. Our model thus
learns the credit assignment leading to the decision, and predicts an ensemble
of sub-networks that can accomplish the same task, thereby providing insight
into the macroscopic behavior of deep learning through the lens of the
distinct roles of synaptic weights.
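The abstract's three-way classification of synaptic weights lends itself to a toy illustration. The sketch below is not the authors' exact formulation: it assumes each weight follows an independent Gaussian N(mu, sigma^2) trained via the reparameterization trick (so each forward pass samples one sub-network), and the signal-to-noise cutoffs used to sort weights into the three categories are illustrative.

```python
# Minimal sketch (not the paper's exact algorithm): train a network whose
# weights are Gaussian distributions, i.e. a mean-field ensemble of
# sub-networks, then sort each weight into one of three categories by its
# learned statistics. All thresholds below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MeanFieldLinear(nn.Module):
    """Linear layer whose weights are independent Gaussians (mean-field)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.mu = nn.Parameter(0.1 * torch.randn(out_features, in_features))
        self.log_sigma = nn.Parameter(torch.full((out_features, in_features), -2.0))

    def forward(self, x):
        # Sample one sub-network per forward pass (reparameterization trick).
        w = self.mu + self.log_sigma.exp() * torch.randn_like(self.mu)
        return F.linear(x, w)

torch.manual_seed(0)
layer1, layer2 = MeanFieldLinear(20, 32), MeanFieldLinear(32, 2)
opt = torch.optim.Adam(list(layer1.parameters()) + list(layer2.parameters()), lr=1e-2)

# Toy separable classification task.
X = torch.randn(256, 20)
y = (X[:, 0] + X[:, 1] > 0).long()

for step in range(500):
    logits = layer2(torch.relu(layer1(X)))
    loss = F.cross_entropy(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()

# Classify weights by signal-to-noise ratio |mu| / sigma (heuristic cutoffs):
# high SNR ~ "very important", tiny |mu| ~ "can be absent", the rest keep a
# broad weight distribution ("variable").
snr = layer1.mu.abs() / layer1.log_sigma.exp()
important = snr > 3.0
unimportant = layer1.mu.abs() < 0.05
variable = ~important & ~unimportant
print(f"important {important.float().mean().item():.2f}, "
      f"unimportant {unimportant.float().mean().item():.2f}, "
      f"variable {variable.float().mean().item():.2f}")
```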
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Memorization with neural nets: going beyond the worst case [5.662924503089369]
In practice, deep neural networks are often able to easily interpolate their training data.
For real-world data, however, one intuitively expects the presence of a benign structure, so that interpolation already occurs at a smaller network size than worst-case memorization capacity suggests.
We introduce a simple randomized algorithm that, given a fixed finite dataset with two classes, with high probability constructs an interpolating three-layer neural network in polynomial time.
arXiv Detail & Related papers (2023-09-30T10:06:05Z)
- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z)
- The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that the learnability of a target function with deep networks depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges that reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks (a minimal sketch of the idea follows this entry).
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
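A minimal sketch of the differentiable-connectivity idea: a complete DAG over a few computation nodes, with one sigmoid-gated learnable scalar per edge so that connectivity trains by gradient descent alongside the node weights. The node modules, gating function, and shapes are illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch of differentiable connectivity (not the paper's exact model):
# each edge of a complete DAG carries a learnable gate that scales how
# strongly one node's output feeds a later node.
import torch
import torch.nn as nn

class DifferentiableGraphNet(nn.Module):
    def __init__(self, num_nodes=4, dim=16):
        super().__init__()
        self.nodes = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_nodes))
        # One learnable gate per directed edge i -> j with i <= j (complete DAG).
        self.edge_logits = nn.Parameter(torch.zeros(num_nodes, num_nodes))

    def forward(self, x):
        outputs = [x]
        for j, node in enumerate(self.nodes):
            # Aggregate all earlier outputs, weighted by sigmoid edge gates.
            gates = torch.sigmoid(self.edge_logits[: j + 1, j])
            agg = sum(g * h for g, h in zip(gates, outputs))
            outputs.append(torch.relu(node(agg)))
        return outputs[-1]

net = DifferentiableGraphNet()
y = net(torch.randn(8, 16))   # edge gates receive gradients like any weight
print(y.shape)                # torch.Size([8, 16])
```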
- ReMarNet: Conjoint Relation and Margin Learning for Small-Sample Image Classification [49.87503122462432]
We introduce a novel neural network termed Relation-and-Margin learning Network (ReMarNet).
Our method assembles two networks with different backbones so as to learn features that perform well under both of the classification mechanisms it combines (a generic dual-head sketch follows this entry).
Experiments on four image datasets demonstrate that our approach is effective in learning discriminative features from a small set of labeled samples.
arXiv Detail & Related papers (2020-06-27T13:50:20Z)
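A generic illustration of assembling two backbones under two classification mechanisms at once, here a cross-entropy head plus a pairwise relation head that both shape the shared features. The fusion scheme, heads, and shapes are assumptions for illustration, not ReMarNet's actual design.

```python
# Generic dual-mechanism sketch (not ReMarNet's architecture): two backbones,
# one set of fused features, two losses trained jointly.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualMechanismNet(nn.Module):
    def __init__(self, dim=32, num_classes=4):
        super().__init__()
        self.backbone_a = nn.Sequential(nn.Linear(64, dim), nn.ReLU())
        self.backbone_b = nn.Sequential(nn.Linear(64, dim), nn.Tanh())
        self.cls_head = nn.Linear(2 * dim, num_classes)  # mechanism 1: class scores
        self.rel_head = nn.Linear(4 * dim, 1)            # mechanism 2: same-class score

    def features(self, x):
        return torch.cat([self.backbone_a(x), self.backbone_b(x)], dim=-1)

    def forward(self, x1, x2):
        f1, f2 = self.features(x1), self.features(x2)
        return self.cls_head(f1), self.rel_head(torch.cat([f1, f2], dim=-1))

net = DualMechanismNet()
x1, x2 = torch.randn(8, 64), torch.randn(8, 64)
y, same = torch.randint(0, 4, (8,)), torch.randint(0, 2, (8,)).float()
logits, relation = net(x1, x2)
loss = (F.cross_entropy(logits, y)
        + F.binary_cross_entropy_with_logits(relation.squeeze(-1), same))
loss.backward()  # both mechanisms shape the shared features
```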
- An analytic theory of shallow networks dynamics for hinge loss classification [14.323962459195771]
We study the training dynamics of a simple type of neural network: a single hidden layer trained to perform a classification task.
We specialize our theory to the prototypical case of a linearly separable dataset and a linear hinge loss.
This allows us to address in a simple setting several phenomena appearing in modern networks, such as the slowing down of training dynamics, the crossover between rich and lazy learning, and overfitting (a toy instance of this setup is sketched below).
arXiv Detail & Related papers (2020-06-19T16:25:29Z)
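The setting this entry describes is concrete enough for a toy instance: a single-hidden-layer network trained with a linear hinge loss on a linearly separable dataset. Widths, learning rate, and data below are illustrative assumptions.

```python
# Toy instance of the analyzed setting: one hidden layer, linear hinge loss,
# linearly separable data with labels in {-1, +1}.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 2)
y = torch.sign(X[:, 0] + 0.5 * X[:, 1])  # linearly separable labels

net = nn.Sequential(nn.Linear(2, 100), nn.ReLU(), nn.Linear(100, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.05)

for step in range(1000):
    margins = y * net(X).squeeze(-1)
    loss = torch.clamp(1.0 - margins, min=0.0).mean()  # linear hinge loss
    opt.zero_grad(); loss.backward(); opt.step()

print(f"final hinge loss: {loss.item():.4f}")  # approaches 0 on separable data
```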
- The large learning rate phase of deep learning: the catapult mechanism [50.23041928811575]
We present a class of neural networks with solvable training dynamics.
We find good agreement between our model's predictions and training dynamics in realistic deep learning settings.
We believe our results shed light on characteristics of models trained at different learning rates.
arXiv Detail & Related papers (2020-03-04T17:52:48Z)