Neural Architecture Search of Deep Priors: Towards Continual Learning
without Catastrophic Interference
- URL: http://arxiv.org/abs/2104.06788v1
- Date: Wed, 14 Apr 2021 11:25:30 GMT
- Title: Neural Architecture Search of Deep Priors: Towards Continual Learning
without Catastrophic Interference
- Authors: Martin Mundt, Iuliia Pliushch, Visvanathan Ramesh
- Abstract summary: We show that it is possible to find random weight architectures, a deep prior, that enable a linear classifier to perform on par with fully trained deep counterparts.
In an extension to continual learning, we investigate the possibility of catastrophic interference free incremental learning.
- Score: 2.922007656878633
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we analyze the classification performance of neural network
structures without parametric inference. Making use of neural architecture
search, we empirically demonstrate that it is possible to find random weight
architectures, a deep prior, that enable a linear classifier to perform on
par with fully trained deep counterparts. Through ablation experiments, we
exclude the possibility of winning a weight initialization lottery and confirm
that suitable deep priors do not require additional inference. In an extension
to continual learning, we investigate the possibility of catastrophic
interference free incremental learning. Under the assumption of classes
originating from the same data distribution, a deep prior found on only a
subset of classes is shown to allow discrimination of further classes through
training of a simple linear classifier.
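The core idea in the abstract, a frozen random-weight feature extractor with only a linear classifier trained on top, can be illustrated with a minimal sketch. This is not the authors' method: it performs no architecture search, uses a single random layer instead of a deep network, and runs on synthetic two-ring data. All names (`random_features`, `fit_linear_classifier`, `make_rings`, `with_bias`) and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np


def random_features(x, weights, bias):
    """Fixed random layer followed by ReLU; these weights are never trained."""
    return np.maximum(x @ weights + bias, 0.0)


def fit_linear_classifier(features, labels, num_classes, ridge=1e-3):
    """Closed-form ridge regression onto one-hot targets (the only trained part)."""
    targets = np.eye(num_classes)[labels]
    gram = features.T @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ targets)


def predict(features, classifier):
    """Pick the class with the largest linear score."""
    return np.argmax(features @ classifier, axis=1)


def make_rings(rng, n_per_class):
    """Toy data (assumption): two noisy concentric rings, not linearly separable."""
    n = 2 * n_per_class
    angles = rng.uniform(0.0, 2.0 * np.pi, size=n)
    radii = np.repeat([1.0, 3.0], n_per_class) + rng.normal(0.0, 0.1, size=n)
    x = np.stack([radii * np.cos(angles), radii * np.sin(angles)], axis=1)
    y = np.repeat([0, 1], n_per_class)
    return x, y


def with_bias(x):
    """Append a constant column so the raw-input baseline has an intercept."""
    return np.hstack([x, np.ones((len(x), 1))])


rng = np.random.default_rng(0)
x_train, y_train = make_rings(rng, 200)
x_test, y_test = make_rings(rng, 200)

# Random, untrained "prior": drawn once and then frozen.
hidden = 256
weights = rng.normal(0.0, 1.0, size=(2, hidden))
bias = rng.normal(0.0, 1.0, size=hidden)

# Train only the linear classifier on top of the frozen random features.
classifier = fit_linear_classifier(
    random_features(x_train, weights, bias), y_train, num_classes=2
)
random_feature_accuracy = np.mean(
    predict(random_features(x_test, weights, bias), classifier) == y_test
)

# Baseline: the same linear classifier applied directly to the raw inputs.
raw_classifier = fit_linear_classifier(with_bias(x_train), y_train, num_classes=2)
raw_accuracy = np.mean(predict(with_bias(x_test), raw_classifier) == y_test)

print(f"linear classifier on random features: {random_feature_accuracy:.3f}")
print(f"linear classifier on raw inputs:      {raw_accuracy:.3f}")
```

In this toy setting the random layer supplies the nonlinearity that the raw-input baseline lacks, which is the role the abstract assigns to the deep prior. How well a given random architecture serves as a prior on real image data is exactly what the paper's search is meant to determine, and this sketch does not address it.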
Related papers
- Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on backpropagation (BP) optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural
Network? [118.18554882199676]
We study the potential of learning a neural network for classification with the classifier randomly initialized as a simplex equiangular tight frame (ETF) and kept fixed during training.
Our experimental results show that our method achieves similar performance on image classification for balanced datasets.
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- How does unlabeled data improve generalization in self-training? A
one-hidden-layer theoretical analysis [93.37576644429578]
This work establishes the first theoretical analysis for the known iterative self-training paradigm.
We prove the benefits of unlabeled data in both training convergence and generalization ability.
Experiments from shallow neural networks to deep neural networks are also provided to justify the correctness of our established theoretical insights on self-training.
arXiv Detail & Related papers (2022-01-21T02:16:52Z)
- Deep Learning with Nonsmooth Objectives [0.0]
We explore the potential for using a nonsmooth loss function based on the max-norm in the training of an artificial neural network.
We hypothesise that this may lead to superior classification results in some special cases where the training data is either very small or unbalanced.
arXiv Detail & Related papers (2021-07-14T02:01:53Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework called Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on the Answer Set semantics, with neural networks, in order to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- Identifying Learning Rules From Neural Network Observables [26.96375335939315]
We show that different classes of learning rules can be separated solely on the basis of aggregate statistics of the weights, activations, or instantaneous layer-wise activity changes.
Our results suggest that activation patterns, available from electrophysiological recordings of post-synaptic activities, may provide a good basis on which to identify learning rules.
arXiv Detail & Related papers (2020-10-22T14:36:54Z)
- Theoretical Analysis of Self-Training with Deep Networks on Unlabeled
Data [48.4779912667317]
Self-training algorithms have been very successful for learning with unlabeled data using neural networks.
This work provides a unified theoretical analysis of self-training with deep networks for semi-supervised learning, unsupervised domain adaptation, and unsupervised learning.
arXiv Detail & Related papers (2020-10-07T19:43:55Z)
- An analytic theory of shallow networks dynamics for hinge loss
classification [14.323962459195771]
We study the training dynamics of a simple type of neural network: a single hidden layer trained to perform a classification task.
We specialize our theory to the prototypical case of a linearly separable dataset and a linear hinge loss.
This allows us to address, in a simple setting, several phenomena appearing in modern networks, such as slowing down of training dynamics, crossover between rich and lazy learning, and overfitting.
arXiv Detail & Related papers (2020-06-19T16:25:29Z)
- Uncovering Coresets for Classification With Multi-Objective Evolutionary
Algorithms [0.8057006406834467]
A coreset is a subset of the training set with which a machine learning algorithm achieves performance similar to what it would deliver if trained on the whole original data.
A novel approach is presented: candidate coresets are iteratively optimized by adding and removing samples.
A multi-objective evolutionary algorithm is used to minimize simultaneously the number of points in the set and the classification error.
arXiv Detail & Related papers (2020-02-20T09:59:56Z)
- Distance-Based Regularisation of Deep Networks for Fine-Tuning [116.71288796019809]
We develop an algorithm that constrains a hypothesis class to a small sphere centred on the initial pre-trained weights.
Empirical evaluation shows that our algorithm works well, corroborating our theoretical results.
arXiv Detail & Related papers (2020-02-19T16:00:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.