Some thoughts on catastrophic forgetting and how to learn an algorithm
- URL: http://arxiv.org/abs/2108.03940v1
- Date: Mon, 9 Aug 2021 11:12:43 GMT
- Title: Some thoughts on catastrophic forgetting and how to learn an algorithm
- Authors: Miguel Ruiz-Garcia
- Abstract summary: We propose to use a neural network with a different architecture that can be trained to recover the correct algorithm for the addition of binary numbers.
Not only does the neural network avoid catastrophic forgetting, but its predictive power on unseen pairs of numbers improves as training progresses.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The work of McCloskey and Cohen popularized the concept of catastrophic
interference. They used a neural network that tried to learn addition using two
groups of examples as two different tasks. In their case, learning the second
task rapidly deteriorated the acquired knowledge about the previous one. This
could be a symptom of a fundamental problem: addition is an algorithmic task
that should not be learned through pattern recognition. We propose to use a
neural network with a different architecture that can be trained to recover the
correct algorithm for the addition of binary numbers. We test it in the setting
proposed by McCloskey and Cohen, training on random examples one by one. The
neural network not only avoids catastrophic forgetting but also improves its
predictive power on unseen pairs of numbers as training progresses. This work
emphasizes the importance of neural network architecture for the emergence of
catastrophic forgetting and introduces a neural network that is able to learn
an algorithm.
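For reference, the algorithmic target here is ordinary binary addition with carry propagation. The following is a minimal Python sketch of that algorithm together with the one-by-one random-example regime the abstract describes; it is the algorithm a successful architecture would have to recover, not the paper's network itself:

```python
import random

def add_binary(a_bits, b_bits):
    """Ripple-carry addition of two little-endian bit lists."""
    n = max(len(a_bits), len(b_bits))
    a = a_bits + [0] * (n - len(a_bits))   # pad both inputs to equal width
    b = b_bits + [0] * (n - len(b_bits))
    out, carry = [], 0
    for i in range(n):
        s = a[i] + b[i] + carry
        out.append(s % 2)                  # sum bit at this position
        carry = s // 2                     # carry into the next position
    out.append(carry)                      # final carry bit
    return out

def to_bits(x, width):
    return [(x >> i) & 1 for i in range(width)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

# Random training pairs presented one at a time, as in the tested setting.
for _ in range(1000):
    a, b = random.randrange(256), random.randrange(256)
    assert from_bits(add_binary(to_bits(a, 8), to_bits(b, 8))) == a + b
```

A network that has genuinely learned this algorithm, rather than memorized input-output patterns, generalizes to unseen pairs regardless of the order in which examples arrive.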
Related papers
- LinSATNet: The Positive Linear Satisfiability Neural Networks [116.65291739666303]
This paper studies how to introduce popular positive linear satisfiability constraints into neural networks.
We propose the first differentiable satisfiability layer, based on an extension of the classic Sinkhorn algorithm for jointly encoding multiple sets of marginal distributions (a minimal sketch of the classic Sinkhorn iteration appears after this list).
arXiv Detail & Related papers (2024-07-18T22:05:21Z)
- Verified Neural Compressed Sensing
We develop the first (to the best of our knowledge) provably correct neural networks for a precise computational task.
We show that for modest problem dimensions (up to 50), we can train neural networks that provably recover a sparse vector from linear and binarized linear measurements.
We show that the complexity of the network can be adapted to the problem difficulty and solve problems where traditional compressed sensing methods are not known to provably work.
arXiv Detail & Related papers (2024-05-07T12:20:12Z)
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex than expected.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Refining neural network predictions using background knowledge [68.35246878394702]
We show that we can use logical background knowledge in learning systems to compensate for a lack of labeled training data.
We introduce differentiable refinement functions that find a corrected prediction close to the original prediction.
This algorithm finds optimal refinements on complex SAT formulas in significantly fewer iterations and frequently finds solutions where gradient descent cannot.
arXiv Detail & Related papers (2022-06-10T10:17:59Z)
- Explain to Not Forget: Defending Against Catastrophic Forgetting with XAI [10.374979214803805]
Catastrophic forgetting describes the phenomenon in which a neural network completely forgets previous knowledge when given new information.
We propose a novel training algorithm called training by explaining, in which we leverage Layer-wise Relevance Propagation to retain the information a neural network has already learned in previous tasks when training on new data.
Our method not only successfully retains the knowledge of old tasks within the neural networks but does so more resource-efficiently than other state-of-the-art solutions.
arXiv Detail & Related papers (2022-05-04T08:00:49Z)
- Predictive Coding: Towards a Future of Deep Learning beyond Backpropagation? [41.58529335439799]
The backpropagation of error algorithm used to train deep neural networks has been fundamental to the successes of deep learning.
Recent work has developed predictive coding into a general-purpose algorithm able to train neural networks using only local computations (a minimal sketch appears after this list).
We show the substantially greater flexibility of predictive coding networks compared with equivalent deep neural networks.
arXiv Detail & Related papers (2022-02-18T22:57:03Z)
- The mathematics of adversarial attacks in AI -- Why deep learning is unstable despite the existence of stable neural networks [69.33657875725747]
We prove that any training procedure based on training neural networks for classification problems with a fixed architecture will yield neural networks that are either inaccurate or unstable (if accurate).
The key is that stable and accurate neural networks must have dimensions that vary with the input; in particular, variable dimensionality is a necessary condition for stability.
Our result points towards the paradox that accurate and stable neural networks exist, however, modern algorithms do not compute them.
arXiv Detail & Related papers (2021-09-13T16:19:25Z)
- Truly Sparse Neural Networks at Scale [2.2860412844991655]
We train the largest neural network ever trained in terms of representational power -- reaching the size of a bat brain.
Our approach has state-of-the-art performance while opening the path for an environmentally friendly artificial intelligence era.
arXiv Detail & Related papers (2021-02-02T20:06:47Z)
- A biologically plausible neural network for local supervision in cortical microcircuits [17.00937011213428]
We derive an algorithm for training a neural network that avoids both explicit error computation and backpropagation.
Our algorithm maps onto a neural network that bears a remarkable resemblance to the connectivity structure and learning rules of the cortex.
arXiv Detail & Related papers (2020-11-30T17:35:22Z)
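As referenced in the LinSATNet entry above, the following is a minimal sketch of the classic Sinkhorn iteration, the building block that LinSATNet extends; it is not LinSATNet's multi-set layer itself. Each sweep is a differentiable operation, which is what makes it usable inside a neural network:

```python
import numpy as np

def sinkhorn(scores, n_iters=50, tau=0.05):
    """Classic Sinkhorn normalization: alternately rescale rows and
    columns of exp(scores / tau) toward a doubly stochastic matrix."""
    P = np.exp(scores / tau)                  # positive matrix from raw scores
    for _ in range(n_iters):
        P = P / P.sum(axis=1, keepdims=True)  # rows sum to 1
        P = P / P.sum(axis=0, keepdims=True)  # columns sum to 1
    return P

rng = np.random.default_rng(0)
P = sinkhorn(rng.normal(size=(4, 4)))
print(P.sum(axis=0))  # approximately [1, 1, 1, 1]
print(P.sum(axis=1))  # approximately [1, 1, 1, 1]
```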
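Similarly, for the predictive coding entry: a minimal sketch of training with only local computations, in the spirit of inference-then-learning predictive coding schemes (e.g., Whittington and Bogacz, 2017). The architecture, sizes, and learning rates here are illustrative assumptions, not that paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 8, 2
W1 = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden weights
W2 = rng.normal(scale=0.1, size=(n_out, n_hid))  # hidden -> output weights
f = np.tanh
df = lambda v: 1.0 - np.tanh(v) ** 2

def train_step(x, y, lr=0.05, n_infer=30, dt=0.1):
    """One predictive coding step: relax the hidden activity to reduce
    local prediction errors, then apply local (Hebbian-like) updates."""
    global W1, W2
    h = W1 @ f(x)                        # start hidden at its prediction
    for _ in range(n_infer):             # inference phase
        e1 = h - W1 @ f(x)               # error between activity and prediction
        e2 = y - W2 @ f(h)               # error at the clamped output
        h = h + dt * (-e1 + df(h) * (W2.T @ e2))
    e1 = h - W1 @ f(x)
    e2 = y - W2 @ f(h)
    W1 += lr * np.outer(e1, f(x))        # weight updates use only pre-synaptic
    W2 += lr * np.outer(e2, f(h))        # activity and the local error signal

x, y = rng.normal(size=n_in), np.array([0.5, -0.5])
for _ in range(300):
    train_step(x, y)
print(W2 @ f(W1 @ f(x)))                 # forward prediction approaches y
```

Note that no global error signal is propagated backward: each weight matrix is updated from quantities available at its own layer, which is the "only local computations" property the entry highlights.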
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.