Neural auto-association with optimal Bayesian learning
- URL: http://arxiv.org/abs/2412.18349v1
- Date: Tue, 24 Dec 2024 11:22:18 GMT
- Title: Neural auto-association with optimal Bayesian learning
- Authors: Andreas Knoblauch
- Abstract summary: I study the optimal Bayesian associative network for auto-association where input and output layers are identical.
It turns out that performance can depend on subtle dependencies among input components that violate the "naive Bayes" assumptions.
- Score: 0.0
- Abstract: Neural associative memories are single-layer perceptrons with fast synaptic learning, typically storing discrete associations between pairs of neural activity patterns. Previous works have analyzed the optimal networks under naive Bayes assumptions of independent pattern components and heteroassociation, where the task is to learn associations from input to output patterns. Here I study the optimal Bayesian associative network for auto-association, where input and output layers are identical. In particular, I compare performance to different variants of approximate Bayesian learning rules, such as the BCPNN (Bayesian Confidence Propagation Neural Network), and try to explain why the suboptimal learning rules sometimes achieve higher storage capacity than the (theoretically) optimal model. It turns out that performance can depend on subtle dependencies among input components that violate the "naive Bayes" assumptions. These include patterns with a constant number of active units, iterative retrieval where patterns are repeatedly propagated through recurrent networks, and winners-take-all activation of the most probable units. The performance of all learning rules can improve significantly if they include a novel adaptive mechanism to estimate noise in iterative retrieval steps (ANE). The overall maximum storage capacity is again achieved by the Bayesian learning rule with ANE.
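To make the retrieval setting concrete (sparse auto-associative patterns with a constant number of active units, winners-take-all activation, and iterative retrieval through a recurrent network), below is a minimal sketch. It uses a simple clipped Hebbian ("Willshaw") storage rule rather than the paper's optimal Bayesian rule, BCPNN, or the ANE mechanism, and all network sizes and the cue setup are illustrative assumptions, not values from the paper.

```python
# Minimal illustrative sketch (not the paper's optimal Bayesian rule or its ANE
# mechanism): an auto-associative memory over sparse binary patterns with a
# constant number of active units, a clipped Hebbian ("Willshaw") weight rule,
# winners-take-all (WTA) activation, and iterative retrieval through the
# recurrent weight matrix. All sizes below are arbitrary choices for the demo.
import numpy as np

rng = np.random.default_rng(0)

n_units = 100     # network size
n_patterns = 20   # number of stored patterns
k_active = 10     # constant number of active units per pattern

# Generate sparse binary patterns with exactly k_active ones each.
patterns = np.zeros((n_patterns, n_units))
for p in patterns:
    p[rng.choice(n_units, size=k_active, replace=False)] = 1.0

# Clipped Hebbian storage: w_ij = 1 if units i and j are co-active in at
# least one stored pattern; no self-connections.
W = (patterns.T @ patterns > 0).astype(float)
np.fill_diagonal(W, 0.0)

def retrieve(cue, steps=5):
    """Iterative retrieval: propagate the state through W and keep the
    k_active units with the largest dendritic potentials (WTA activation)."""
    state = cue.copy()
    for _ in range(steps):
        potentials = W @ state
        winners = np.argsort(potentials)[-k_active:]
        new_state = np.zeros(n_units)
        new_state[winners] = 1.0
        if np.array_equal(new_state, state):  # fixed point reached
            return new_state
        state = new_state
    return state

# Query with an incomplete cue: half of the active units of pattern 0 removed.
cue = patterns[0].copy()
cue[np.flatnonzero(cue)[: k_active // 2]] = 0.0
recalled = retrieve(cue)
print("overlap with stored pattern:", recalled @ patterns[0] / k_active)
```

In this sketch the first WTA step may admit a few spurious units, and the subsequent iterative retrieval steps typically clean them up; the paper's ANE mechanism additionally adapts the noise estimate across such retrieval steps, which is not modeled here.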
Related papers
- Learning Discretized Bayesian Networks with GOMEA [0.0]
We extend an existing state-of-the-art structure learning approach to jointly learn variable discretizations.
We show how this enables incorporating expert knowledge in a uniquely insightful fashion, finding multiple DBNs that trade-off complexity, accuracy, and the difference with a pre-determined expert network.
arXiv Detail & Related papers (2024-02-19T14:29:35Z)
- Benchmarking Hebbian learning rules for associative memory [0.0]
Associative memory is a key concept in cognitive and computational brain science.
We benchmark six different learning rules on storage capacity and prototype extraction.
arXiv Detail & Related papers (2023-12-30T21:49:47Z)
- Deep Negative Correlation Classification [82.45045814842595]
Existing deep ensemble methods naively train many different models and then aggregate their predictions.
We propose deep negative correlation classification (DNCC).
DNCC yields a deep classification ensemble where the individual estimator is both accurate and negatively correlated.
arXiv Detail & Related papers (2022-12-14T07:35:20Z)
- Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks [62.48782506095565]
We show that due to the greedy nature of learning in deep neural networks, models tend to rely on just one modality while under-fitting the other modalities.
We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning.
arXiv Detail & Related papers (2022-02-10T20:11:21Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Multi-Sample Online Learning for Spiking Neural Networks based on Generalized Expectation Maximization [42.125394498649015]
Spiking Neural Networks (SNNs) capture some of the efficiency of biological brains by processing through binary neural dynamic activations.
This paper proposes to leverage multiple compartments that sample independent spiking signals while sharing synaptic weights.
The key idea is to use these signals to obtain more accurate statistical estimates of the log-likelihood training criterion, as well as of its gradient.
arXiv Detail & Related papers (2021-02-05T16:39:42Z)
- Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
- Separation of Memory and Processing in Dual Recurrent Neural Networks [0.0]
We explore a neural network architecture that stacks a recurrent layer and a feedforward layer that is also connected to the input.
When noise is introduced into the activation function of the recurrent units, these neurons are forced into a binary activation regime that makes the networks behave much as finite automata.
arXiv Detail & Related papers (2020-05-17T11:38:42Z)
- Neural Additive Models: Interpretable Machine Learning with Neural Nets [77.66871378302774]
Deep neural networks (DNNs) are powerful black-box predictors that have achieved impressive performance on a wide variety of tasks.
We propose Neural Additive Models (NAMs) which combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models.
NAMs learn a linear combination of neural networks that each attend to a single input feature.
arXiv Detail & Related papers (2020-04-29T01:28:32Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.