The emergence of a concept in shallow neural networks
- URL: http://arxiv.org/abs/2109.00454v1
- Date: Wed, 1 Sep 2021 15:56:38 GMT
- Title: The emergence of a concept in shallow neural networks
- Authors: Elena Agliari, Francesco Alemanno, Adriano Barra, Giordano De Marzo
- Abstract summary: We consider restricted Boltzmann machines (RBMs) trained on an unstructured dataset made of blurred copies of definite but unavailable "archetypes".
We show that there exists a critical sample size beyond which the RBM can learn archetypes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider restricted Boltzmann machines (RBMs) trained on an unstructured dataset made of blurred copies of definite but unavailable "archetypes", and we show that there exists a critical sample size beyond which the RBM can learn the archetypes, namely the machine can successfully act as a generative model or as a classifier, depending on the operational routine. In general, assessing a critical sample size (possibly in relation to the quality of the dataset) is still an open problem in machine learning. Here, restricting to the random theory, where shallow networks suffice and the grandmother-cell scenario is correct, we leverage the formal equivalence between RBMs and Hopfield networks to obtain a phase diagram for both neural architectures that highlights the regions, in the space of the control parameters (i.e., number of archetypes, number of neurons, size and quality of the training set), where learning can be accomplished. Our investigations are guided by analytical methods based on the statistical mechanics of disordered systems, and the results are further corroborated by extensive Monte Carlo simulations.
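As a minimal, hedged sketch of this setting (our own NumPy toy, not the authors' code), the snippet below draws blurred copies of random archetypes, builds the couplings of the equivalent Hopfield network with an unsupervised Hebbian rule over the examples, and measures the retrieval overlap with one archetype; the sizes N, K, M and the dataset quality r are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M, r = 200, 3, 50, 0.8   # neurons, archetypes, examples per archetype, dataset quality

# K random binary archetypes: the "definite but unavailable" ground truths
xi = rng.choice([-1, 1], size=(K, N))

# Blurred copies: each archetype bit is kept with probability (1 + r) / 2
flips = rng.choice([-1, 1], size=(K, M, N), p=[(1 - r) / 2, (1 + r) / 2])
eta = xi[:, None, :] * flips

# Unsupervised Hebbian couplings built from the noisy examples only
J = np.einsum('kmi,kmj->ij', eta, eta) / (M * N)
np.fill_diagonal(J, 0.0)

# Zero-temperature dynamics started from a corrupted cue of archetype 0
sigma = xi[0] * rng.choice([-1, 1], size=N, p=[0.2, 0.8])
for _ in range(20):
    sigma = np.sign(J @ sigma + 1e-12)   # synchronous updates; epsilon breaks ties

print("overlap with archetype 0:", sigma @ xi[0] / N)
```

An overlap close to 1 signals that the archetype has been learned from the blurred examples alone; sweeping M at fixed r is a crude numerical probe of the critical sample size the paper characterises analytically.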
Related papers
- Fundamental limits of overparametrized shallow neural networks for supervised learning [11.136777922498355]
We study a two-layer neural network trained from input-output pairs generated by a teacher network with matching architecture.
Our results come in the form of bounds on (i) the mutual information between training data and network weights, and (ii) the Bayes-optimal generalization error.
arXiv Detail & Related papers (2023-07-11T08:30:50Z)
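A hedged toy sketch of the teacher-student setup summarised above: a fixed two-layer "teacher" network labels random Gaussian inputs, producing the pairs on which a matching "student" would be trained. The activation, sizes, and weight scaling are illustrative assumptions, not the paper's exact setting.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, n = 100, 5, 1000   # input dimension, hidden units, number of training pairs

# Teacher: a fixed two-layer network that labels random Gaussian inputs
W = rng.normal(size=(k, d)) / np.sqrt(d)   # teacher first-layer weights
a = rng.normal(size=k)                     # teacher readout weights

X = rng.normal(size=(n, d))                # i.i.d. Gaussian inputs
y = np.tanh(X @ W.T) @ a                   # teacher outputs used as labels

# A student with matching architecture would now be trained on (X, y);
# the paper bounds the Bayes-optimal generalization error of this setup.
print(X.shape, y.shape)
```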
- Dense Hebbian neural networks: a replica symmetric picture of supervised learning [4.133728123207142]
We consider dense associative neural networks trained with supervision by a teacher.
We investigate their computational capabilities analytically, via the statistical mechanics of spin glasses, and numerically, via Monte Carlo simulations.
arXiv Detail & Related papers (2022-11-25T13:37:47Z)
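A hedged Monte Carlo sketch loosely inspired by the supervised dense Hebbian setting above: class-labelled noisy examples are averaged to estimate each archetype, the dense (p = 4) Hebbian energy is written through the overlaps, and a Metropolis chain samples the network. The p = 4 interaction, the averaging rule, and all parameter values are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, M, r = 100, 2, 40, 0.7   # neurons, archetypes, labelled examples per class, quality
p, beta = 4, 2.0               # dense p-body interactions, inverse temperature

xi = rng.choice([-1, 1], size=(K, N))                                     # hidden archetypes
flips = rng.choice([-1, 1], size=(K, M, N), p=[(1 - r) / 2, (1 + r) / 2])
eta = xi[:, None, :] * flips                                              # labelled noisy examples

# Supervised Hebbian estimate: average the examples within each class
xi_hat = np.sign(eta.sum(axis=1) + 0.5)   # +0.5 breaks the (rare) ties

def energy(sigma):
    m = xi_hat @ sigma / N        # overlaps with the estimated archetypes
    return -N * np.sum(m ** p)    # dense (p-body) Hebbian energy

# Metropolis chain started from a corrupted cue of archetype 0
sigma = xi[0] * rng.choice([-1, 1], size=N, p=[0.15, 0.85])
E = energy(sigma)
for _ in range(20000):
    i = rng.integers(N)
    sigma[i] *= -1
    E_new = energy(sigma)
    if rng.random() < np.exp(min(0.0, -beta * (E_new - E))):
        E = E_new          # accept the flip
    else:
        sigma[i] *= -1     # reject: undo the flip

print("overlaps with the true archetypes:", xi @ sigma / N)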
- Dense Hebbian neural networks: a replica symmetric picture of unsupervised learning [4.133728123207142]
We consider dense associative neural networks trained without supervision.
We investigate their computational capabilities analytically, via a statistical-mechanics approach, and numerically, via Monte Carlo simulations.
arXiv Detail & Related papers (2022-11-25T12:40:06Z)
- A didactic approach to quantum machine learning with a single qubit [68.8204255655161]
We focus on the case of learning with a single qubit, using data re-uploading techniques.
We implement the different proposed formulations on toy and real-world datasets using the Qiskit quantum computing SDK.
arXiv Detail & Related papers (2022-11-23T18:25:32Z)
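A minimal sketch of single-qubit data re-uploading with Qiskit, assuming a simple alternation of data-encoding RY rotations and trainable RZ rotations; this is an illustrative ansatz, not necessarily the paper's exact formulation.

```python
import numpy as np
from qiskit import QuantumCircuit

def reuploading_circuit(x, thetas):
    """Single-qubit data re-uploading: alternate data encodings and trainable rotations."""
    qc = QuantumCircuit(1)
    for theta in thetas:
        qc.ry(x, 0)        # re-upload the (pre-scaled) data point
        qc.rz(theta, 0)    # trainable parameter for this layer
    qc.measure_all()
    return qc

# Three re-uploading layers with illustrative parameter values
circuit = reuploading_circuit(x=0.7, thetas=np.array([0.1, 1.2, -0.4]))
print(circuit.draw())
```

In a training loop, the thetas would be tuned so that the measurement statistics separate the classes of the dataset.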
- Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation [55.80128181112308]
We show that the dimensionality and quasi-orthogonality of a neural network's feature space may jointly serve as discriminants of the network's performance.
Our findings suggest important relationships between the networks' final performance and properties of their randomly initialised feature spaces.
arXiv Detail & Related papers (2022-03-30T21:47:32Z)
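A small NumPy sketch of the quasi-orthogonality notion invoked above: pairwise cosine similarities of random (e.g., freshly initialised) feature vectors concentrate near zero as the dimension grows. The sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n_features, dim = 500, 2048   # illustrative sizes

# Random feature vectors, normalised to the unit sphere
F = rng.normal(size=(n_features, dim))
F /= np.linalg.norm(F, axis=1, keepdims=True)

# Off-diagonal cosine similarities concentrate near 0 in high dimension,
# i.e., the features are quasi-orthogonal
cos = F @ F.T
off_diag = cos[~np.eye(n_features, dtype=bool)]
print(f"mean |cos| = {np.abs(off_diag).mean():.4f} (order 1/sqrt(dim) = {dim ** -0.5:.4f})")
```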
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
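A hedged toy example of "informed" training-set design for a dynamical system: trajectories are integrated and a burn-in is discarded so that training samples lie on the attractor rather than in transient regions. The Lorenz system, the crude Euler integrator, and the burn-in heuristic are our illustrative choices, not the paper's method.

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # One explicit-Euler step of the Lorenz system (illustrative, low-accuracy integrator)
    x, y, z = state
    return state + dt * np.array([sigma * (y - x), x * (rho - y) - z, x * y - beta * z])

def trajectory(x0, n_steps):
    out = np.empty((n_steps, 3))
    state = np.asarray(x0, dtype=float)
    for i in range(n_steps):
        state = lorenz_step(state)
        out[i] = state
    return out

# Informed training data: drop a burn-in so the samples lie on the attractor,
# instead of in the transient region a random initial condition passes through
burn_in = 2000
traj = trajectory([1.0, 1.0, 1.0], 10000)
train = traj[burn_in:]
print(train.shape)
```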
- The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called a structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
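A toy SCM sketch to make the notion concrete: two mechanisms with independent exogenous noises, sampled both observationally and under an intervention do(X = 1). The paper's neural causal models (NCMs) replace such fixed mechanisms with neural networks; the linear mechanisms here are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def sample_scm(n, do_x=None):
    """Toy linear SCM: X <- U_X, Y <- 2X + U_Y, with independent exogenous noise."""
    u_x = rng.normal(size=n)
    u_y = rng.normal(size=n)
    x = u_x if do_x is None else np.full(n, float(do_x))   # do(X = x0) overrides the mechanism
    y = 2.0 * x + u_y
    return x, y

x_obs, y_obs = sample_scm(100_000)            # observational distribution
x_int, y_int = sample_scm(100_000, do_x=1.0)  # interventional distribution
print(y_obs.mean(), y_int.mean())             # E[Y] ~ 0 vs E[Y | do(X=1)] ~ 2
```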
- MLDS: A Dataset for Weight-Space Analysis of Neural Networks [0.0]
We present MLDS, a new dataset consisting of thousands of trained neural networks with carefully controlled parameters.
This dataset enables new insights into both model-to-model and model-to-training-data relationships.
arXiv Detail & Related papers (2021-04-21T14:24:26Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture that allows us to infuse a deep neural network with human-powered abstraction at the level of the data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
- Rare-Event Simulation for Neural Network and Random Forest Predictors [16.701364984106092]
We study rare-event simulation for a class of problems where the target hitting sets of interest are defined via modern machine learning tools.
This problem is motivated by fast-emerging studies on the safety evaluation of intelligent systems.
arXiv Detail & Related papers (2020-10-10T03:27:09Z)
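A hedged NumPy sketch of the rare-event estimation problem described above: crude Monte Carlo essentially never hits the rare set {predictor(x) > threshold}, while mean-shifted Gaussian importance sampling reweights samples by the exact likelihood ratio. The stand-in predictor, threshold, and shift are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(5)

def predictor(x):
    # Stand-in for a trained model; a fixed nonlinear score for illustration
    return np.tanh(x[:, 0]) + 0.1 * x[:, 1] ** 2

threshold, n, d = 4.0, 200_000, 2

# Crude Monte Carlo: almost no samples land in the rare set
x = rng.normal(size=(n, d))
print("crude MC estimate:", np.mean(predictor(x) > threshold))

# Importance sampling: shift the sampling mean toward the rare region and
# reweight each sample by the N(0, I) / N(mu, I) likelihood ratio
mu = np.array([0.0, 6.0])
x_is = rng.normal(size=(n, d)) + mu
log_w = -x_is @ mu + 0.5 * mu @ mu
est = np.mean((predictor(x_is) > threshold) * np.exp(log_w))
print("importance-sampling estimate:", est)
```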
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.