Generalizing Outside the Training Set: When Can Neural Networks Learn
Identity Effects?
- URL: http://arxiv.org/abs/2005.04330v1
- Date: Sat, 9 May 2020 01:08:07 GMT
- Title: Generalizing Outside the Training Set: When Can Neural Networks Learn
Identity Effects?
- Authors: Simone Brugiapaglia, Matthew Liu, Paul Tupper
- Abstract summary: We show that a broad class of algorithms, including deep neural networks with standard architecture trained with backpropagation, satisfies criteria under which it cannot generalize identity effects to novel inputs, depending on the encoding of inputs.
We demonstrate our theory with computational experiments in which we explore the effect of different input encodings on the ability of algorithms to generalize to novel inputs.
- Score: 1.2891210250935143
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Often in language and other areas of cognition, whether two components of an
object are identical or not determines whether it is well formed. We call such
constraints identity effects. When developing a system to learn well-formedness
from examples, it is easy enough to build in an identity effect. But can
identity effects be learned from the data without explicit guidance? We provide
a simple framework in which we can rigorously prove that algorithms satisfying
simple criteria cannot make the correct inference. We then show that a broad
class of algorithms including deep neural networks with standard architecture
and training with backpropagation satisfy our criteria, dependent on the
encoding of inputs. Finally, we demonstrate our theory with computational
experiments in which we explore the effect of different input encodings on the
ability of algorithms to generalize to novel inputs.
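To make the setup concrete, below is a minimal sketch (not the authors' code) of an identity-effect experiment in the spirit of the abstract: two-letter words are labeled well formed exactly when their two letters are identical, a small network is trained on words built from a subset of the alphabet, and it is then evaluated on words built from held-out letters. The alphabet split, network size, training procedure, and the two encodings (orthogonal one-hot versus a fixed random ±1 code) are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of an identity-effect experiment (illustrative assumptions only).
import numpy as np

rng = np.random.default_rng(0)

ALPHABET = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
TRAIN_LETTERS = ALPHABET[:24]   # letters seen during training
NOVEL_LETTERS = ALPHABET[24:]   # held-out letters used only at test time

def one_hot(letter):
    """Orthogonal encoding: every letter gets its own dimension."""
    v = np.zeros(len(ALPHABET))
    v[ALPHABET.index(letter)] = 1.0
    return v

RANDOM_CODES = {c: rng.choice([-1.0, 1.0], size=len(ALPHABET)) for c in ALPHABET}

def random_binary(letter):
    """Distributed encoding: a fixed random +/-1 code per letter."""
    return RANDOM_CODES[letter]

def make_dataset(letters, encode, n=2000):
    """Two-letter words; label 1 iff the two letters are identical."""
    X, y = [], []
    for _ in range(n):
        a = rng.choice(letters)
        b = a if rng.random() < 0.5 else rng.choice(letters)
        X.append(np.concatenate([encode(a), encode(b)]))
        y.append(1.0 if a == b else 0.0)
    return np.array(X), np.array(y)

def train_mlp(X, y, hidden=64, epochs=300, lr=0.05):
    """One-hidden-layer network trained by full-batch gradient descent."""
    d = X.shape[1]
    W1 = rng.normal(0.0, 1.0 / np.sqrt(d), (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), hidden); b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                    # hidden activations
        p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))    # sigmoid output
        g = (p - y) / len(y)                        # d(cross-entropy)/d(logit)
        gW2 = H.T @ g; gb2 = g.sum()
        gH = np.outer(g, W2) * (1.0 - H ** 2)       # backprop through tanh
        gW1 = X.T @ gH; gb1 = gH.sum(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    return lambda Xt: 1.0 / (1.0 + np.exp(-(np.tanh(Xt @ W1 + b1) @ W2 + b2)))

for name, encode in [("one-hot", one_hot), ("random +/-1", random_binary)]:
    Xtr, ytr = make_dataset(TRAIN_LETTERS, encode)
    Xte, yte = make_dataset(NOVEL_LETTERS, encode, n=400)
    predict = train_mlp(Xtr, ytr)
    print(f"{name:12s} train acc {((predict(Xtr) > 0.5) == ytr).mean():.2f}"
          f"  novel-letter acc {((predict(Xte) > 0.5) == yte).mean():.2f}")
```

Under this setup one would expect the one-hot encoding to give roughly chance-level accuracy on the held-out letters, since their input dimensions receive no gradient signal during training; this is the kind of encoding-dependent failure to generalize that the paper analyzes rigorously.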
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; and (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z) - Architecture of a Cortex Inspired Hierarchical Event Recaller [0.0]
This paper proposes a new approach to Machine Learning (ML) that focuses on unsupervised continuous context-dependent learning of complex patterns.
A synthetic structure capable of identifying and predicting complex temporal series will be defined and experimentally tested.
As a proof of concept, the proposed system is shown to be able to learn, identify and predict a remarkably complex temporal series such as human speech, with no prior knowledge.
arXiv Detail & Related papers (2024-05-03T09:36:16Z) - Training Neural Networks with Internal State, Unconstrained
Connectivity, and Discrete Activations [66.53734987585244]
True intelligence may require the ability of a machine learning model to manage internal state.
We show that we have not yet discovered the most effective algorithms for training such models.
We present one attempt to design such a training algorithm, applied to an architecture with binary activations and only a single matrix of weights.
arXiv Detail & Related papers (2023-12-22T01:19:08Z) - The Clock and the Pizza: Two Stories in Mechanistic Explanation of
Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z) - The brain as a probabilistic transducer: an evolutionarily plausible
network architecture for knowledge representation, computation, and behavior [14.505867475659274]
We offer a general theoretical framework for brain and behavior that is evolutionarily and computationally plausible.
The brain in our abstract model is a network of nodes and edges. Both nodes and edges in our network have weights and activation levels.
By specifying the innate (genetic) components of the network, we show how evolution could endow the network with initial adaptive rules and goals that are then enriched through learning.
arXiv Detail & Related papers (2021-12-26T14:37:47Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Compositional Processing Emerges in Neural Networks Solving Math
Problems [100.80518350845668]
Recent progress in artificial neural networks has shown that when large models are trained on enough linguistic data, grammatical structure emerges in their representations.
We extend this work to the domain of mathematical reasoning, where it is possible to formulate precise hypotheses about how meanings should be composed.
Our work shows that neural networks are not only able to infer something about the structured relationships implicit in their training data, but can also deploy this knowledge to guide the composition of individual meanings into composite wholes.
arXiv Detail & Related papers (2021-05-19T07:24:42Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Invariance, encodings, and generalization: learning identity effects
with neural networks [0.0]
We provide a framework in which we can rigorously prove that algorithms satisfying simple criteria cannot make the correct inference.
We then show that a broad class of learning algorithms including deep feedforward neural networks trained via gradient-based algorithms satisfy our criteria.
In some broader circumstances we are able to provide adversarial examples that the network necessarily classifies incorrectly.
arXiv Detail & Related papers (2021-01-21T01:28:15Z) - Embedding and Extraction of Knowledge in Tree Ensemble Classifiers [11.762762974386684]
This paper studies the embedding and extraction of knowledge in tree ensemble classifiers.
We propose two novel, and effective, embedding algorithms, one of which is for black-box settings and the other for white-box settings.
We develop an algorithm to extract the embedded knowledge by reducing the problem to one solvable with an SMT (satisfiability modulo theories) solver.
arXiv Detail & Related papers (2020-10-16T10:09:01Z) - Exploiting Contextual Information with Deep Neural Networks [5.787117733071416]
In this thesis, we show that contextual information can be exploited in 2 fundamentally different ways: implicitly and explicitly.
arXiv Detail & Related papers (2020-06-21T03:40:30Z)