Generalizing Outside the Training Set: When Can Neural Networks Learn
Identity Effects?
- URL: http://arxiv.org/abs/2005.04330v1
- Date: Sat, 9 May 2020 01:08:07 GMT
- Title: Generalizing Outside the Training Set: When Can Neural Networks Learn
Identity Effects?
- Authors: Simone Brugiapaglia, Matthew Liu, Paul Tupper
- Abstract summary: We show that a broad class of algorithms, including deep neural networks with standard architecture trained with backpropagation, satisfies criteria under which it cannot generalize identity effects to novel inputs, depending on the encoding of inputs.
We demonstrate our theory with computational experiments in which we explore the effect of different input encodings on the ability of algorithms to generalize to novel inputs.
- Score: 1.2891210250935143
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Often in language and other areas of cognition, whether two components of an
object are identical or not determines whether it is well formed. We call such
constraints identity effects. When developing a system to learn well-formedness
from examples, it is easy enough to build in an identity effect. But can
identity effects be learned from the data without explicit guidance? We provide
a simple framework in which we can rigorously prove that algorithms satisfying
simple criteria cannot make the correct inference. We then show that a broad
class of algorithms including deep neural networks with standard architecture
and training with backpropagation satisfy our criteria, dependent on the
encoding of inputs. Finally, we demonstrate our theory with computational
experiments in which we explore the effect of different input encodings on the
ability of algorithms to generalize to novel inputs.
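To make the setup concrete, below is a minimal sketch (not the authors' code) of an identity-effect experiment in the spirit of the abstract: two-letter words are labeled well formed exactly when their two letters are identical, a small network is trained on words built from a subset of the alphabet, and it is then evaluated on words built from held-out letters. The alphabet split, network size, training procedure, and the two encodings (orthogonal one-hot versus a fixed random ±1 code) are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of an identity-effect experiment (illustrative assumptions only).
import numpy as np

rng = np.random.default_rng(0)

ALPHABET = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
TRAIN_LETTERS = ALPHABET[:24]   # letters seen during training
NOVEL_LETTERS = ALPHABET[24:]   # held-out letters used only at test time

def one_hot(letter):
    """Orthogonal encoding: every letter gets its own dimension."""
    v = np.zeros(len(ALPHABET))
    v[ALPHABET.index(letter)] = 1.0
    return v

RANDOM_CODES = {c: rng.choice([-1.0, 1.0], size=len(ALPHABET)) for c in ALPHABET}

def random_binary(letter):
    """Distributed encoding: a fixed random +/-1 code per letter."""
    return RANDOM_CODES[letter]

def make_dataset(letters, encode, n=2000):
    """Two-letter words; label 1 iff the two letters are identical."""
    X, y = [], []
    for _ in range(n):
        a = rng.choice(letters)
        b = a if rng.random() < 0.5 else rng.choice(letters)
        X.append(np.concatenate([encode(a), encode(b)]))
        y.append(1.0 if a == b else 0.0)
    return np.array(X), np.array(y)

def train_mlp(X, y, hidden=64, epochs=300, lr=0.05):
    """One-hidden-layer network trained by full-batch gradient descent."""
    d = X.shape[1]
    W1 = rng.normal(0.0, 1.0 / np.sqrt(d), (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), hidden); b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                    # hidden activations
        p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))    # sigmoid output
        g = (p - y) / len(y)                        # d(cross-entropy)/d(logit)
        gW2 = H.T @ g; gb2 = g.sum()
        gH = np.outer(g, W2) * (1.0 - H ** 2)       # backprop through tanh
        gW1 = X.T @ gH; gb1 = gH.sum(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    return lambda Xt: 1.0 / (1.0 + np.exp(-(np.tanh(Xt @ W1 + b1) @ W2 + b2)))

for name, encode in [("one-hot", one_hot), ("random +/-1", random_binary)]:
    Xtr, ytr = make_dataset(TRAIN_LETTERS, encode)
    Xte, yte = make_dataset(NOVEL_LETTERS, encode, n=400)
    predict = train_mlp(Xtr, ytr)
    print(f"{name:12s} train acc {((predict(Xtr) > 0.5) == ytr).mean():.2f}"
          f"  novel-letter acc {((predict(Xte) > 0.5) == yte).mean():.2f}")
```

Under this setup one would expect the one-hot encoding to give roughly chance-level accuracy on the held-out letters, since their input dimensions receive no gradient signal during training; this is the kind of encoding-dependent failure to generalize that the paper analyzes rigorously.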
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; and (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z) - Architecture of a Cortex Inspired Hierarchical Event Recaller [0.0]
This paper proposes a new approach to Machine Learning (ML) that focuses on unsupervised continuous context-dependent learning of complex patterns.
A synthetic structure capable of identifying and predicting complex temporal series will be defined and experimentally tested.
As a proof of concept, the proposed system is shown to be able to learn, identify and predict a remarkably complex temporal series such as human speech, with no prior knowledge.
arXiv Detail & Related papers (2024-05-03T09:36:16Z) - Training Neural Networks with Internal State, Unconstrained
Connectivity, and Discrete Activations [66.53734987585244]
True intelligence may require the ability of a machine learning model to manage internal state.
We show that we have not yet discovered the most effective algorithms for training such models.
We present one attempt to design such a training algorithm, applied to an architecture with binary activations and only a single matrix of weights.
arXiv Detail & Related papers (2023-12-22T01:19:08Z) - The Clock and the Pizza: Two Stories in Mechanistic Explanation of
Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z) - The brain as a probabilistic transducer: an evolutionarily plausible
network architecture for knowledge representation, computation, and behavior [14.505867475659274]
We offer a general theoretical framework for brain and behavior that is evolutionarily and computationally plausible.
The brain in our abstract model is a network of nodes and edges. Both nodes and edges in our network have weights and activation levels.
By specifying the innate (genetic) components of the network, we show how evolution could endow the network with initial adaptive rules and goals that are then enriched through learning.
arXiv Detail & Related papers (2021-12-26T14:37:47Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Compositional Processing Emerges in Neural Networks Solving Math
Problems [100.80518350845668]
Recent progress in artificial neural networks has shown that when large models are trained on enough linguistic data, grammatical structure emerges in their representations.
We extend this work to the domain of mathematical reasoning, where it is possible to formulate precise hypotheses about how meanings should be composed.
Our work shows that neural networks are not only able to infer something about the structured relationships implicit in their training data, but can also deploy this knowledge to guide the composition of individual meanings into composite wholes.
arXiv Detail & Related papers (2021-05-19T07:24:42Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Invariance, encodings, and generalization: learning identity effects
with neural networks [0.0]
We provide a framework in which we can rigorously prove that algorithms satisfying simple criteria cannot make the correct inference.
We then show that a broad class of learning algorithms including deep feedforward neural networks trained via gradient-based algorithms satisfy our criteria.
In some broader circumstances we are able to provide adversarial examples that the network necessarily classifies incorrectly.
arXiv Detail & Related papers (2021-01-21T01:28:15Z) - Embedding and Extraction of Knowledge in Tree Ensemble Classifiers [11.762762974386684]
This paper studies the embedding and extraction of knowledge in tree ensemble classifiers.
We propose two novel, and effective, embedding algorithms, one of which is for black-box settings and the other for white-box settings.
We develop an algorithm to extract the embedded knowledge by reducing the problem to one solvable with an SMT (satisfiability modulo theories) solver.
arXiv Detail & Related papers (2020-10-16T10:09:01Z) - Exploiting Contextual Information with Deep Neural Networks [5.787117733071416]
In this thesis, we show that contextual information can be exploited in 2 fundamentally different ways: implicitly and explicitly.
arXiv Detail & Related papers (2020-06-21T03:40:30Z)