SoftHebb: Bayesian inference in unsupervised Hebbian soft winner-take-all networks
- URL: http://arxiv.org/abs/2107.05747v1
- Date: Mon, 12 Jul 2021 21:34:45 GMT
- Title: SoftHebb: Bayesian inference in unsupervised Hebbian soft winner-take-all networks
- Authors: Timoleon Moraitis, Dmitry Toichkin, Yansong Chua, Qinghai Guo
- Abstract summary: Hebbian learning in winner-take-all (WTA) networks is unsupervised, feed-forward, and biologically plausible.
We derive formally such a theory, based on biologically plausible but generic ANN elements.
We confirm our theory in practice, where, in handwritten digit (MNIST) recognition, our Hebbian algorithm, SoftHebb, minimizes cross-entropy without having access to it.
- Score: 1.125159338610819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-of-the-art artificial neural networks (ANNs) require labelled data or
feedback between layers, are often biologically implausible, and are vulnerable
to adversarial attacks that humans are not susceptible to. On the other hand,
Hebbian learning in winner-take-all (WTA) networks is unsupervised,
feed-forward, and biologically plausible. However, an objective optimization
theory for WTA networks has been missing, except under very limiting
assumptions. Here we derive formally such a theory, based on biologically
plausible but generic ANN elements. Through Hebbian learning, network
parameters maintain a Bayesian generative model of the data. There is no
supervisory loss function, but the network does minimize cross-entropy between
its activations and the input distribution. The key is a "soft" WTA where there
is no absolute "hard" winner neuron, and a specific type of Hebbian-like
plasticity of weights and biases. We confirm our theory in practice, where, in
handwritten digit (MNIST) recognition, our Hebbian algorithm, SoftHebb,
minimizes cross-entropy without having access to it, and outperforms the more
frequently used, hard-WTA-based method. Strikingly, it even outperforms
supervised end-to-end backpropagation, under certain conditions. Specifically,
in a two-layered network, SoftHebb outperforms backpropagation when the
training dataset is only presented once, when the testing data is noisy, and
under gradient-based adversarial attacks. Adversarial attacks that confuse
SoftHebb are also confusing to the human eye. Finally, the model can generate
interpolations of objects from its input distribution.
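The abstract's two key ingredients, a "soft" WTA activation with no absolute winner neuron and a Hebbian-like plasticity of weights and biases, can be sketched as follows. This is a minimal illustrative approximation assembled from the abstract alone; the exact SoftHebb plasticity rule, bias dynamics, and all numeric choices below (layer sizes, learning rate, temperature, and the precise update form) are assumptions, not the paper's definitive algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 784, 10                     # e.g. flattened MNIST digits, 10 neurons
W = rng.normal(0.0, 0.1, (n_hidden, n_in))   # synaptic weights
b = np.zeros(n_hidden)                       # per-neuron biases (log-prior-like terms)

def soft_wta(x, W, b, temperature=1.0):
    """Soft winner-take-all: a softmax over pre-activations, so there is
    no single "hard" winner -- every neuron gets a graded share."""
    u = W @ x + b
    e = np.exp((u - u.max()) / temperature)  # shift by max for numerical stability
    return e / e.sum()

def hebbian_update(x, y, W, b, lr=0.01):
    """Hebbian-like plasticity (illustrative form, not the paper's exact
    rule): each neuron pulls its weight vector toward the input in
    proportion to its own soft activation, and biases track activity."""
    W = W + lr * y[:, None] * (x[None, :] - W)
    b = b + lr * (y - np.exp(b))
    return W, b

# one unsupervised step on a random "image" -- no label, no loss function
x = rng.random(n_in)
y = soft_wta(x, W, b)
W, b = hebbian_update(x, y, W, b)
```

The softmax is what makes the competition "soft": unlike a hard argmax winner, every neuron receives a non-zero share of the update, which is the property the abstract ties to the network's Bayesian generative interpretation.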
Related papers
- Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
arXiv Detail & Related papers (2024-05-12T22:18:25Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom suggests that neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Look Beyond Bias with Entropic Adversarial Data Augmentation [4.893694715581673]
Deep neural networks do not discriminate between spurious and causal patterns, and will only learn the most predictive ones while ignoring the others.
Debiasing methods were developed to make networks robust to such spurious biases but require to know in advance if a dataset is biased.
In this paper, we argue that such samples are not necessarily needed, because the "hidden" causal information is often also contained in biased images.
arXiv Detail & Related papers (2023-01-10T08:25:24Z)
- Unfolding Local Growth Rate Estimates for (Almost) Perfect Adversarial Detection [22.99930028876662]
Convolutional neural networks (CNN) define the state-of-the-art solution on many perceptual tasks.
Current CNN approaches largely remain vulnerable against adversarial perturbations of the input that have been crafted specifically to fool the system.
We propose a simple and light-weight detector, which leverages recent findings on the relation between networks' local intrinsic dimensionality (LID) and adversarial attacks.
arXiv Detail & Related papers (2022-12-13T17:51:32Z)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression [40.35734017517066]
Nested networks or slimmable networks are neural networks whose architectures can be adjusted instantly during testing time.
Recent studies have focused on a "nested dropout" layer, which is able to order the nodes of a layer by importance during training.
arXiv Detail & Related papers (2021-01-27T12:34:58Z)
- How Powerful are Shallow Neural Networks with Bandlimited Random Weights? [25.102870584507244]
We investigate the expressive power of depth-2 bandlimited random neural networks.
A random net is a neural network whose hidden-layer parameters are frozen at random bandlimited values.
arXiv Detail & Related papers (2020-08-19T13:26:12Z)
- How benign is benign overfitting? [96.07549886487526]
We investigate two causes for adversarial vulnerability in deep neural networks: bad data and (poorly) trained models.
Deep neural networks essentially achieve zero training error, even in the presence of label noise.
We identify label noise as one of the causes for adversarial vulnerability.
arXiv Detail & Related papers (2020-07-08T11:07:10Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
- Depth-2 Neural Networks Under a Data-Poisoning Attack [2.105564340986074]
We study the possibility of defending against data-poisoning attacks while training a shallow neural network in a regression setup.
In this work, we focus on doing supervised learning for a class of depth-2 finite-width neural networks.
arXiv Detail & Related papers (2020-05-04T17:56:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.