Invariance, encodings, and generalization: learning identity effects
with neural networks
- URL: http://arxiv.org/abs/2101.08386v2
- Date: Fri, 26 Feb 2021 19:55:16 GMT
- Title: Invariance, encodings, and generalization: learning identity effects
with neural networks
- Authors: S. Brugiapaglia, M. Liu, P. Tupper
- Abstract summary: We provide a framework in which we can rigorously prove that algorithms satisfying simple criteria cannot make the correct inference.
We then show that a broad class of learning algorithms including deep feedforward neural networks trained via gradient-based algorithms satisfy our criteria.
In some broader circumstances we are able to provide adversarial examples that the network necessarily classifies incorrectly.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Often in language and other areas of cognition, whether two components of an
object are identical or not determines if it is well formed. We call such
constraints identity effects. When developing a system to learn well-formedness
from examples, it is easy enough to build in an identity effect. But can
identity effects be learned from the data without explicit guidance? We provide
a framework in which we can rigorously prove that algorithms satisfying simple
criteria cannot make the correct inference. We then show that a broad class of
learning algorithms including deep feedforward neural networks trained via
gradient-based algorithms (such as stochastic gradient descent or the Adam
method) satisfy our criteria, dependent on the encoding of inputs. In some
broader circumstances we are able to provide adversarial examples that the
network necessarily classifies incorrectly. Finally, we demonstrate our theory
with computational experiments in which we explore the effect of different
input encodings on the ability of algorithms to generalize to novel inputs.
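As a concrete illustration of the setup (a minimal sketch of our own, not the paper's exact experiments; the alphabet split, network sizes, and training budget are all chosen for exposition): a small feedforward network is trained to label two-letter words as well formed exactly when the two letters are identical, and is then queried on letters never seen in training. With one-hot (orthogonal) encodings, the theory predicts the network has no basis to prefer, say, 'yy' over 'yz'.

```python
# Minimal sketch (not the paper's exact setup): does an MLP trained on an
# identity-effect task generalize to letters never seen during training?
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
n_letters = 26
train_letters = list(range(24))          # 'a'..'x' appear in training
novel_letters = [24, 25]                 # 'y', 'z' held out entirely

def encode(pair, dim=n_letters):
    """One-hot encode a two-letter word as a concatenated vector."""
    x = np.zeros(2 * dim, dtype=np.float32)
    x[pair[0]] = 1.0
    x[dim + pair[1]] = 1.0
    return x

def make_dataset(letters, n=2000):
    X, y = [], []
    for _ in range(n):
        if rng.random() < 0.5:                        # well formed: AA
            a = rng.choice(letters); pair = (a, a)
        else:                                         # ill formed: AB, a != b
            a, b = rng.choice(letters, size=2, replace=False); pair = (a, b)
        X.append(encode(pair)); y.append(float(pair[0] == pair[1]))
    return torch.tensor(np.array(X)), torch.tensor(y)

X_tr, y_tr = make_dataset(train_letters)
model = nn.Sequential(nn.Linear(2 * n_letters, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(500):
    opt.zero_grad()
    loss = loss_fn(model(X_tr).squeeze(-1), y_tr)
    loss.backward(); opt.step()

# Query pairs built from the two novel letters: 'yy', 'zz', 'yz', 'zy'.
# With orthogonal encodings, the theory predicts no preference for 'yy' over 'yz'.
test_pairs = [(24, 24), (25, 25), (24, 25), (25, 24)]
X_te = torch.tensor(np.array([encode(p) for p in test_pairs]))
print(torch.sigmoid(model(X_te)).squeeze(-1))
```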
Related papers
- Unsupervised Learning of Invariance Transformations [105.54048699217668]
We develop an algorithmic framework for finding approximate graph automorphisms.
We discuss how this framework can be used to find approximate automorphisms in weighted graphs in general.
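One hedged sketch of what "approximate automorphism" can mean in the weighted setting (our illustration, not the paper's algorithm): score a candidate node permutation P by the mismatch ||A - P A P^T||_F, which is zero exactly for true automorphisms.

```python
# Minimal sketch (our illustration, not the paper's method): score how close a
# candidate permutation is to an automorphism of a weighted graph by the
# Frobenius mismatch ||A - P A P^T||_F; exact automorphisms score 0.
import numpy as np

def automorphism_mismatch(A: np.ndarray, perm: np.ndarray) -> float:
    """A: symmetric weighted adjacency matrix; perm: candidate node relabeling."""
    P = np.eye(len(A))[perm]                 # permutation matrix from index array
    return float(np.linalg.norm(A - P @ A @ P.T))

# A 4-cycle with uniform weights: rotating the cycle is an exact automorphism.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
print(automorphism_mismatch(A, np.array([1, 2, 3, 0])))  # 0.0 (exact)
print(automorphism_mismatch(A, np.array([1, 0, 2, 3])))  # > 0 (approximate at best)
```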
arXiv Detail & Related papers (2023-07-24T17:03:28Z)
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on backpropagation (BP) for optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block and does not require the generation of additional negative samples.
In our framework, each block can be trained independently, so it can be easily deployed on parallel acceleration systems.
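A hedged sketch of the blockwise-local training idea (our reading of the summary, not the authors' implementation; module sizes are made up): each block carries its own prediction head and loss, and inputs are detached between blocks so no gradient crosses block boundaries.

```python
# Hedged sketch of blockwise-local training in the spirit of CaFo (not the
# authors' code): each block has its own label-distribution head and loss,
# and no gradient flows between blocks.
import torch
import torch.nn as nn

class LocalBlock(nn.Module):
    def __init__(self, d_in, d_hidden, n_classes):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.head = nn.Linear(d_hidden, n_classes)  # per-block label distribution
    def forward(self, x):
        h = self.body(x)
        return h, self.head(h)

blocks = [LocalBlock(32, 64, 10), LocalBlock(64, 64, 10), LocalBlock(64, 64, 10)]
opts = [torch.optim.Adam(b.parameters(), lr=1e-3) for b in blocks]
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(128, 32)            # stand-in batch
y = torch.randint(0, 10, (128,))
for block, opt in zip(blocks, opts):      # one illustrative pass
    h, logits = block(x)
    loss = loss_fn(logits, y)             # local objective; no end-to-end backprop
    opt.zero_grad(); loss.backward(); opt.step()
    x = h.detach()                        # next block sees features, not gradients
```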
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
- Neurosymbolic hybrid approach to driver collision warning [64.02492460600905]
There are two main algorithmic approaches to autonomous driving systems.
Deep learning alone has achieved state-of-the-art results in many areas.
However, deep learning models can be very difficult to debug when they do not work as expected.
arXiv Detail & Related papers (2022-03-28T20:29:50Z)
- The brain as a probabilistic transducer: an evolutionarily plausible network architecture for knowledge representation, computation, and behavior [14.505867475659274]
We offer a general theoretical framework for brain and behavior that is evolutionarily and computationally plausible.
The brain in our abstract model is a network of nodes and edges. Both nodes and edges in our network have weights and activation levels.
By specifying the innate (genetic) components of the network, we show how evolution could endow the network with initial adaptive rules and goals that are then enriched through learning.
arXiv Detail & Related papers (2021-12-26T14:37:47Z)
- A Bayesian Approach to (Online) Transfer Learning: Theory and Algorithms [6.193838300896449]
We study transfer learning from a Bayesian perspective, where a parametric statistical model is used.
Specifically, we study three variants of the transfer learning problem: instantaneous, online, and time-variant transfer learning.
For each problem, we define an appropriate objective function, and provide either exact expressions or upper bounds on the learning performance.
Examples show that the derived bounds are accurate even for small sample sizes.
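As a toy illustration of the Bayesian viewpoint (our own example, not the paper's formulation): for Gaussian mean estimation with known noise, the posterior fitted on plentiful source data can serve as the prior for a scarce target task via a conjugate update.

```python
# Toy sketch (our illustration, not the paper's setting): Bayesian transfer for
# a Gaussian mean, where the source posterior becomes the target prior.
import numpy as np

rng = np.random.default_rng(1)
sigma2 = 1.0                              # known observation noise
source = rng.normal(0.9, 1.0, size=500)  # plentiful source samples
target = rng.normal(1.0, 1.0, size=5)    # scarce target samples

# Flat prior -> source posterior N(mu_s, tau2_s)
mu_s, tau2_s = source.mean(), sigma2 / len(source)

# Use the source posterior as the prior for the target task (conjugate update)
n = len(target)
tau2_t = 1.0 / (1.0 / tau2_s + n / sigma2)
mu_t = tau2_t * (mu_s / tau2_s + target.sum() / sigma2)

print(f"target-only MLE: {target.mean():.3f}")
print(f"transfer posterior mean: {mu_t:.3f} (variance {tau2_t:.4f})")
```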
arXiv Detail & Related papers (2021-09-03T08:43:29Z)
- Some thoughts on catastrophic forgetting and how to learn an algorithm [0.0]
We propose to use a neural network with a different architecture that can be trained to recover the correct algorithm for the addition of binary numbers.
The neural network not only avoids catastrophic forgetting but also improves its predictive power on unseen pairs of numbers as training progresses.
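A hedged toy in the same spirit (our own construction; the paper's architecture is not specified in this summary): a tiny recurrent network that scans bit pairs from least to most significant can learn the carry logic of addition on short numbers and then be tested on much longer ones.

```python
# Hedged sketch (our toy, not the paper's architecture): an RNN scanning bit
# pairs LSB-first can represent binary addition's carry logic, so a model
# trained on 8-bit sums can be evaluated on much longer numbers.
import torch
import torch.nn as nn

def batch(n_bits, n=256):
    a = torch.randint(0, 2 ** n_bits, (n,))
    b = torch.randint(0, 2 ** n_bits, (n,))
    s = a + b
    def bits(x, width):
        return torch.stack([(x >> i) & 1 for i in range(width)], dim=1).float()
    x = torch.stack([bits(a, n_bits + 1), bits(b, n_bits + 1)], dim=2)  # (n, T, 2)
    y = bits(s, n_bits + 1)                                             # (n, T)
    return x, y

rnn = nn.RNN(2, 8, batch_first=True)
head = nn.Linear(8, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(head.parameters()), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()
for step in range(2000):
    x, y = batch(8)                      # train on 8-bit additions only
    out, _ = rnn(x)
    loss = loss_fn(head(out).squeeze(-1), y)
    opt.zero_grad(); loss.backward(); opt.step()

x, y = batch(32)                         # test on much longer numbers
out, _ = rnn(x)
acc = ((head(out).squeeze(-1) > 0) == y.bool()).float().mean()
print(f"bit accuracy on 32-bit sums: {acc:.3f}")
```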
arXiv Detail & Related papers (2021-08-09T11:12:43Z)
- Category-Learning with Context-Augmented Autoencoder [63.05016513788047]
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way that the outcome of a transformation is predictable by an auxiliary network.
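A minimal sketch of that idea (our reading of the summary; the encoder, augmentation, and loss below are stand-ins): encode both an input and an augmented view, and train an auxiliary network to predict the augmented view's latent code from the original latent code plus the transformation parameter.

```python
# Hedged sketch (our reading, not the authors' code): make the outcome of a
# transformation predictable in latent space via an auxiliary network.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Flatten(), nn.Linear(784, 64))        # stand-in encoder
aux = nn.Sequential(nn.Linear(64 + 1, 64), nn.ReLU(), nn.Linear(64, 64))

x = torch.rand(32, 1, 28, 28)
k = 3                                       # augmentation parameter (pixel shift)
x_t = torch.roll(x, shifts=k, dims=3)       # stand-in augmentation t(x)
param = torch.full((32, 1), float(k))

z, z_t = enc(x), enc(x_t)
z_pred = aux(torch.cat([z, param], dim=1))  # predict latent of t(x) from z and k
aux_loss = ((z_pred - z_t.detach()) ** 2).mean()   # add to the usual VAE objective
aux_loss.backward()
```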
arXiv Detail & Related papers (2020-10-10T14:04:44Z)
- On the Robustness of Active Learning [0.7340017786387767]
Active Learning is concerned with how to identify the most useful samples for a Machine Learning algorithm to be trained with.
We find that it is often applied without sufficient care and domain knowledge.
We propose the new "Sum of Squared Logits" method based on the Simpson diversity index and investigate the effect of using the confusion matrix for balancing in sample selection.
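One plausible reading of such a score (hedged; the paper's exact definition may differ) follows the Simpson diversity index sum_i p_i^2 over the predicted class probabilities, selecting the pool samples where this sum is lowest, i.e. where the prediction is most spread out.

```python
# Hedged sketch of an acquisition score in the spirit of "Sum of Squared
# Logits" (our reading via the Simpson diversity index sum_i p_i^2; the
# paper's exact formula may differ): pick the unlabeled samples whose
# predictive distribution is most spread out, i.e. lowest sum of squares.
import torch

def sum_of_squares_score(logits: torch.Tensor) -> torch.Tensor:
    """Simpson-index-style confidence score per sample; low = uncertain."""
    p = torch.softmax(logits, dim=-1)
    return (p ** 2).sum(dim=-1)

def select_batch(logits: torch.Tensor, k: int) -> torch.Tensor:
    """Indices of the k most uncertain pool samples under the score above."""
    return torch.topk(-sum_of_squares_score(logits), k).indices

logits = torch.randn(1000, 10)          # stand-in model outputs on the pool
print(select_batch(logits, k=16))
```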
arXiv Detail & Related papers (2020-06-18T09:07:23Z)
- Generalizing Outside the Training Set: When Can Neural Networks Learn Identity Effects? [1.2891210250935143]
We show that whether a class of algorithms, including deep neural networks with standard architecture trained with backpropagation, can generalize to novel inputs depends on the encoding of the inputs.
We demonstrate our theory with computational experiments in which we explore the effect of different input encodings on the ability of algorithms to generalize to novel inputs.
arXiv Detail & Related papers (2020-05-09T01:08:07Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
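A hedged sketch of a gradient-supervision term in this spirit (our illustration, not the paper's exact loss): for a counterfactual pair (x, x') with different labels, penalize misalignment between the input gradient of the model's score at x and the direction x' - x.

```python
# Hedged sketch of a gradient-supervision auxiliary loss (in the spirit of the
# paper, not its exact implementation): encourage the input gradient of the
# model's score to align with the counterfactual pair difference x' - x.
import torch
import torch.nn.functional as F

def gradient_supervision_loss(model, x, x_cf):
    """1 - cosine similarity between d(score)/dx and the pair difference."""
    x = x.clone().requires_grad_(True)
    score = model(x).sum()                      # scalar for autograd
    (grad,) = torch.autograd.grad(score, x, create_graph=True)
    direction = (x_cf - x).detach()
    cos = F.cosine_similarity(grad.flatten(1), direction.flatten(1), dim=1)
    return (1.0 - cos).mean()

model = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 1))
x, x_cf = torch.randn(8, 16), torch.randn(8, 16)   # stand-in counterfactual pairs
aux = gradient_supervision_loss(model, x, x_cf)    # add to the main task loss
aux.backward()
```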
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.