Neuroevolutionary Transfer Learning of Deep Recurrent Neural Networks
through Network-Aware Adaptation
- URL: http://arxiv.org/abs/2006.02655v1
- Date: Thu, 4 Jun 2020 06:07:30 GMT
- Title: Neuroevolutionary Transfer Learning of Deep Recurrent Neural Networks
through Network-Aware Adaptation
- Authors: AbdElRahman ElSaid, Joshua Karns, Alexander Ororbia II, Daniel Krutz,
Zimeng Lyu, Travis Desell
- Abstract summary: This work introduces network-aware adaptive structure transfer learning (N-ASTL).
N-ASTL utilizes statistical information related to the source network's topology and weight distribution to inform how new input and output neurons are to be integrated into the existing structure.
Results show improvements over prior state-of-the-art, including the ability to transfer on challenging real-world datasets where this was not previously possible.
- Score: 57.46377517266827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning entails taking an artificial neural network (ANN) that is
trained on a source dataset and adapting it to a new target dataset. While this
has been shown to be quite powerful, its use has generally been restricted by
architectural constraints. Previously, in order to reuse and adapt an ANN's
internal weights and structure, the underlying topology of the ANN being
transferred across tasks had to remain mostly the same while a new output layer
is attached, discarding the old output layer's weights. This work introduces
network-aware adaptive structure transfer learning (N-ASTL), an advancement
over prior efforts to remove this restriction. N-ASTL utilizes statistical
information related to the source network's topology and weight distribution in
order to inform how new input and output neurons are to be integrated into the
existing structure. Results show improvements over prior state-of-the-art,
including the ability to transfer on challenging real-world datasets where this
was not previously possible, and improved generalization over RNNs trained
without transfer.
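The mechanism described in the abstract lends itself to a compact illustration. Below is a minimal sketch, assuming a normal fit to the source weights and placeholder names (not the authors' exact N-ASTL procedure): connections for newly added input and output neurons are initialized from statistics of the trained source network's weight distribution rather than from a generic random initializer.

```python
# Minimal sketch of network-aware initialization (assumed details: a normal fit
# to the source weights; N-ASTL's actual procedure may differ).
import numpy as np

def network_aware_init(source_weights: np.ndarray, n_new_edges: int,
                       rng: np.random.Generator) -> np.ndarray:
    """Sample weights for the edges of newly added input/output neurons from
    the empirical distribution of the trained source network's weights."""
    mu, sigma = source_weights.mean(), source_weights.std()
    return rng.normal(mu, sigma, size=n_new_edges)

rng = np.random.default_rng(0)
source_weights = rng.normal(0.0, 0.5, size=10_000)  # stand-in for a trained RNN's weights
new_edge_weights = network_aware_init(source_weights, n_new_edges=32, rng=rng)
print(new_edge_weights.mean(), new_edge_weights.std())  # roughly matches the source statistics
```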
Related papers
- Deep Neural Network Models Trained With A Fixed Random Classifier
Transfer Better Across Domains [23.10912424714101]
The recently discovered neural collapse (NC) phenomenon states that the last-layer weights of deep neural networks converge to the so-called Equiangular Tight Frame (ETF) simplex at the terminal phase of their training.
Inspired by NC properties, the paper explores the transferability of DNN models trained with their last-layer weights fixed according to an ETF.
arXiv Detail & Related papers (2024-02-28T15:52:30Z)
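As a rough illustration of the fixed-classifier idea in the entry above, the sketch below builds a simplex ETF and uses it as a frozen last-layer classifier (the dimensions, the QR-based construction, and all names are assumptions for illustration, not the paper's exact setup):

```python
# Build a d x C simplex Equiangular Tight Frame (ETF) to serve as a fixed,
# non-trainable last-layer classifier over C classes on d-dimensional features.
import numpy as np

def simplex_etf(d: int, C: int) -> np.ndarray:
    """Return a d x C matrix whose columns form a simplex ETF (requires d >= C)."""
    U, _ = np.linalg.qr(np.random.randn(d, C))                    # orthonormal d x C basis
    M = np.sqrt(C / (C - 1)) * (np.eye(C) - np.ones((C, C)) / C)  # simplex ETF in C dims
    return U @ M

W = simplex_etf(d=512, C=10)   # would be kept frozen during training
G = W.T @ W                    # Gram matrix of the class vectors
print(np.round(G, 3))          # ~1 on the diagonal, ~ -1/(C-1) off the diagonal
```

The Gram matrix shows the maximal, equal pairwise separation between class vectors that neural collapse predicts a trained classifier converges to.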
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
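The small-window training and large-signal evaluation described above relies on the network being fully convolutional, so no layer fixes the input length. A minimal sketch, assuming a 1-D signal task and a placeholder architecture rather than the paper's model:

```python
# A fully convolutional 1-D network: weights fitted on short windows can be
# applied to arbitrarily long signals because no layer depends on input length.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=9, padding=4),
    nn.ReLU(),
    nn.Conv1d(16, 16, kernel_size=9, padding=4),
    nn.ReLU(),
    nn.Conv1d(16, 1, kernel_size=1),   # per-position prediction head
)

windows = torch.randn(8, 1, 64)       # short training windows (length 64)
signal = torch.randn(1, 1, 10_000)    # much longer signal at evaluation time

print(model(windows).shape)  # torch.Size([8, 1, 64])
print(model(signal).shape)   # torch.Size([1, 1, 10000])
```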
- Desire Backpropagation: A Lightweight Training Algorithm for Multi-Layer Spiking Neural Networks based on Spike-Timing-Dependent Plasticity [13.384228628766236]
Spiking neural networks (SNNs) are a viable alternative to conventional artificial neural networks.
We present desire backpropagation, a method to derive the desired spike activity of all neurons, including the hidden ones.
We trained three-layer networks to classify MNIST and Fashion-MNIST images and reached an accuracy of 98.41% and 87.56%, respectively.
arXiv Detail & Related papers (2022-11-10T08:32:13Z)
- Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency of a neural network to "forget" previously learned knowledge upon learning new tasks.
We propose a novel scheme dedicated to overcoming this problem and hence strengthening continual learning in graph neural networks (GNNs).
At the heart of our approach is a generic module termed topology-aware weight preserving (TWP).
arXiv Detail & Related papers (2020-12-10T22:30:25Z)
- Statistical Mechanics of Deep Linear Neural Networks: The Back-Propagating Renormalization Group [4.56877715768796]
We study the statistical mechanics of learning in Deep Linear Neural Networks (DLNNs) in which the input-output function of an individual unit is linear.
We solve exactly the network properties following supervised learning using an equilibrium Gibbs distribution in the weight space.
Our numerical simulations reveal that despite the nonlinearity, the predictions of our theory are largely shared by ReLU networks with modest depth.
arXiv Detail & Related papers (2020-12-07T20:08:31Z)
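For reference, the deep linear network setting mentioned above can be written compactly; the forms below are the standard ones for this setup (an illustration, not the paper's full renormalization-group treatment):

```latex
% Input-output map of an L-layer deep linear network, and the equilibrium Gibbs
% distribution over weights used to model learning, with inverse temperature
% \beta and training error E (standard forms, assumed here for illustration).
f(x) = W_L W_{L-1} \cdots W_1 x,
\qquad
P(W_1,\dots,W_L) \propto \exp\bigl(-\beta\, E(W_1,\dots,W_L)\bigr)
```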
- Neural Networks Enhancement with Logical Knowledge [83.9217787335878]
We propose an extension of KENN for relational data.
The results show that KENN is capable of increasing the performance of the underlying neural network even in the presence of relational data.
arXiv Detail & Related papers (2020-09-13T21:12:20Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
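For intuition about the "probability measures over features" view in the entry above, the familiar two-layer mean-field limit is the simplest instance; the paper generalizes this picture to deep architectures, so the formula below is only an illustrative special case, not their construction:

```latex
% Two-layer mean-field representation: as the width grows, the network output
% becomes an integral against a probability measure \rho over parameter pairs.
f(x) = \int a\, \sigma\!\left(w^{\top} x\right) d\rho(a, w)
```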
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Deep Transfer Learning with Ridge Regression [7.843067454030999]
Deep models trained with massive amounts of data demonstrate promising generalisation ability on unseen data from relevant domains.
We address this issue by leveraging the low-rank property of learnt feature vectors produced by deep neural networks (DNNs) with the closed-form solution provided by kernel ridge regression (KRR).
Our method is successful on supervised and semi-supervised transfer learning tasks.
arXiv Detail & Related papers (2020-06-11T20:21:35Z)
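The closed-form solution referred to above is ordinary kernel ridge regression fitted on frozen DNN features. A minimal sketch with a placeholder linear kernel and random stand-ins for the features (not the paper's exact pipeline):

```python
# Kernel ridge regression head on frozen DNN features: fit alpha in closed form
# by solving (K + lambda * I) alpha = Y, then predict via the test-train kernel.
import numpy as np

def krr_fit(K: np.ndarray, Y: np.ndarray, lam: float) -> np.ndarray:
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), Y)

def linear_kernel(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    return A @ B.T

# Phi_* would come from a pre-trained DNN's penultimate layer; random here.
Phi_train = np.random.randn(100, 512)
Phi_test = np.random.randn(20, 512)
Y_train = np.eye(10)[np.random.randint(0, 10, size=100)]  # one-hot labels

alpha = krr_fit(linear_kernel(Phi_train, Phi_train), Y_train, lam=1e-2)
scores = linear_kernel(Phi_test, Phi_train) @ alpha        # (20, 10) class scores
print(scores.argmax(axis=1))                               # predicted classes
```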
This list is automatically generated from the titles and abstracts of the papers on this site.