Deep transfer learning for system identification using long short-term memory neural networks
- URL: http://arxiv.org/abs/2204.03125v1
- Date: Wed, 6 Apr 2022 23:39:06 GMT
- Title: Deep transfer learning for system identification using long short-term memory neural networks
- Authors: Kaicheng Niu, Mi Zhou, Chaouki T. Abdallah, Mohammad Hayajneh
- Abstract summary: This paper proposes using two types of deep transfer learning, namely parameter fine-tuning and freezing, to reduce the data and computation requirements for system identification.
Results show that, compared with direct learning, the method accelerates learning by 10% to 50% while also saving data and computing resources.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent neural networks (RNNs) have many advantages over more traditional system identification techniques. They may be applied to linear and nonlinear systems, and they require fewer modeling assumptions. However, these neural network models may also need larger amounts of data to learn and generalize. Furthermore, neural network training is a time-consuming process. Hence, building upon long short-term memory (LSTM) neural networks, this paper proposes using two types of deep transfer learning, namely parameter fine-tuning and freezing, to reduce the data and computation requirements for system identification. We apply these techniques to identify two dynamical systems: a second-order linear system and a Wiener-Hammerstein nonlinear system. Results show that, compared with direct learning, our method accelerates learning by 10% to 50% while also saving data and computing resources.
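The paper does not ship code; the following is a minimal PyTorch sketch of the two transfer strategies named in the abstract, parameter freezing and fine-tuning, applied to an LSTM system-identification model. All names, layer sizes, and learning rates are illustrative assumptions, not the authors' settings.

```python
# Minimal sketch (not the authors' code): transfer learning for LSTM-based
# system identification via layer freezing and parameter fine-tuning.
import torch
import torch.nn as nn

class SysIdLSTM(nn.Module):
    """Maps an input sequence u[0..T] to the predicted output sequence y[0..T]."""
    def __init__(self, n_inputs=1, n_outputs=1, hidden=64, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_inputs, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, n_outputs)

    def forward(self, u):
        h, _ = self.lstm(u)          # (batch, T, hidden)
        return self.head(h)          # (batch, T, n_outputs)

# Pretrain on the source system (plentiful data), then copy to the target model.
source_model = SysIdLSTM()
# ... train source_model on source-system input/output sequences ...
target_model = SysIdLSTM()
target_model.load_state_dict(source_model.state_dict())

# Strategy A: freezing -- keep the recurrent feature extractor fixed and
# retrain only the output head on the (small) target dataset.
for p in target_model.lstm.parameters():
    p.requires_grad = False
opt_freeze = torch.optim.Adam(target_model.head.parameters(), lr=1e-3)

# Strategy B: fine-tuning -- update all parameters, but with a small learning
# rate, so the pretrained weights are adjusted rather than overwritten.
for p in target_model.lstm.parameters():
    p.requires_grad = True
opt_finetune = torch.optim.Adam(target_model.parameters(), lr=1e-4)
```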
Related papers
- Systematic construction of continuous-time neural networks for linear dynamical systems [0.0]
We discuss a systematic approach to constructing neural architectures for modeling a subclass of dynamical systems.
We use a variant of continuous-time neural networks in which the output of each neuron evolves continuously as a solution of a first-order or second-order Ordinary Differential Equation (ODE).
Instead of deriving the network architecture and parameters from data, we propose a gradient-free algorithm to compute sparse architecture and network parameters directly from the given LTI system.
arXiv Detail & Related papers (2024-03-24T16:16:41Z)
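As a hedged illustration of the entry above (not the authors' algorithm): a single continuous-time neuron whose output solves a first-order ODE, integrated with forward Euler. For a scalar LTI system the neuron's parameters can be read off directly, with no gradient-based training.

```python
# Sketch: a continuous-time "neuron" whose output y(t) solves the ODE
#   tau * dy/dt = -y + w * u(t),
# integrated with forward Euler. For the scalar LTI system
#   dx/dt = a*x + b*u   (a < 0),
# the parameters follow directly: tau = -1/a, w = -b/a (no training needed).
import numpy as np

def ode_neuron(u, dt, tau, w, y0=0.0):
    y = np.empty_like(u)
    y_prev = y0
    for k, uk in enumerate(u):
        y_prev = y_prev + dt * (-y_prev + w * uk) / tau  # Euler step
        y[k] = y_prev
    return y

# Scalar LTI system dx/dt = -2x + 3u  ->  a = -2, b = 3
a, b = -2.0, 3.0
tau, w = -1.0 / a, -b / a            # tau = 0.5, w = 1.5

dt = 1e-3
t = np.arange(0.0, 5.0, dt)
u = np.ones_like(t)                  # step input
y = ode_neuron(u, dt, tau, w)
print(y[-1])                         # ~ -b/a = 1.5 (steady-state gain)
```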
- GreenLightningAI: An Efficient AI System with Decoupled Structural and Quantitative Knowledge [0.0]
Training powerful and popular deep neural networks comes at very high economic and environmental costs.
This work takes a radically different approach by proposing GreenLightningAI.
The new AI system stores the information required to select the system subset for a given sample.
We show experimentally that the structural information can be kept unmodified when re-training the AI system with new samples.
arXiv Detail & Related papers (2023-12-15T17:34:11Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of large-kernel convolutional neural network (LKCNN) models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
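As an illustration of the task studied in the entry above (a sketch, not the paper's setup): labeled regular-versus-chaotic training data can be generated from the logistic map, whose dynamics are periodic or chaotic depending on the parameter r.

```python
# Sketch (illustrative, not the paper's setup): generating labeled
# regular-vs-chaotic time series from the logistic map x <- r*x*(1-x).
import numpy as np

def logistic_series(r, length=500, burn_in=500, x0=0.5):
    x = x0
    for _ in range(burn_in):             # discard the transient
        x = r * x * (1.0 - x)
    out = np.empty(length)
    for i in range(length):
        x = r * x * (1.0 - x)
        out[i] = x
    return out

rng = np.random.default_rng(0)
X, y = [], []
for _ in range(200):
    r_regular = rng.uniform(3.0, 3.4)    # periodic (period-2/4) regime
    r_chaotic = rng.uniform(3.9, 4.0)    # chaotic regime (up to small windows)
    X += [logistic_series(r_regular), logistic_series(r_chaotic)]
    y += [0, 1]                          # 0 = regular, 1 = chaotic
X, y = np.stack(X), np.array(y)          # ready for a 1D-CNN classifier
```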
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
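A sketch of the data-collection step described in the entry above, under assumptions: flatten checkpoints of a fixed small architecture into equal-length parameter vectors that a generative model can then be trained on. All names and shapes are illustrative.

```python
# Sketch (assumption-laden, not the paper's pipeline): turn saved checkpoints
# of a small fixed architecture into a tensor dataset of flat parameter
# vectors, usable as training data for a generative model over parameters.
import torch
import torch.nn as nn

def make_net():
    # The architecture is fixed so every checkpoint flattens to the same length.
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

def flatten_params(model):
    return torch.cat([p.detach().reshape(-1) for p in model.parameters()])

# Stand-in for loading real checkpoints from many training runs:
checkpoints = []
for run in range(100):
    net = make_net()                     # in practice: net.load_state_dict(...)
    checkpoints.append(flatten_params(net))

param_dataset = torch.stack(checkpoints)  # (num_checkpoints, num_params)
print(param_dataset.shape)
# A VAE or diffusion model can now be trained on the rows of param_dataset.
```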
- Learn Like The Pro: Norms from Theory to Size Neural Computation [3.848947060636351]
We investigate how dynamical systems with nonlinearities can inform the design of neural systems that seek to emulate them.
We propose a Learnability metric and relate its associated features to the near-equilibrium behavior of learning dynamics.
It reveals exact sizing for a class of neural networks with multiplicative nodes that mimic continuous- or discrete-time dynamics.
arXiv Detail & Related papers (2021-06-21T20:58:27Z)
- A novel Deep Neural Network architecture for non-linear system identification [78.69776924618505]
We present a novel Deep Neural Network (DNN) architecture for non-linear system identification.
Inspired by fading memory systems, we introduce an inductive bias (on the architecture) and regularization (on the loss function).
This architecture allows for automatic complexity selection based solely on available data.
arXiv Detail & Related papers (2021-06-06T10:06:07Z)
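A hedged sketch of the fading-memory idea named in the entry above (one possible reading, not the paper's architecture): the model predicts from a finite window of past inputs, and a lag-weighted penalty pushes weights on older inputs toward zero, so the influence of the distant past fades.

```python
# Sketch (not the paper's architecture): a fading-memory inductive bias for
# non-linear system identification. The model sees a window of past inputs;
# an L2 penalty that grows with lag shrinks weights on older inputs.
import torch
import torch.nn as nn

WINDOW = 32                  # past samples fed to the model (column 0 = newest)
DECAY = 0.2                  # penalty growth rate per lag step

first_layer = nn.Linear(WINDOW, 64)
model = nn.Sequential(first_layer, nn.Tanh(), nn.Linear(64, 1))

# Penalty weights: lag 0 (most recent input) is cheap, lag WINDOW-1 expensive.
lags = torch.arange(WINDOW, dtype=torch.float32)
penalty = torch.exp(DECAY * lags)                     # shape (WINDOW,)

def loss_fn(y_pred, y_true, lam=1e-3):
    mse = nn.functional.mse_loss(y_pred, y_true)
    # first_layer.weight has shape (64, WINDOW); column j touches lag j.
    fading_reg = (penalty * first_layer.weight.pow(2)).sum()
    return mse + lam * fading_reg
```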
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
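One standard binarization strategy, shown here as a sketch under assumptions (the paper evaluates several schemes, not necessarily this exact one): binarize weights with sign() in the forward pass and pass gradients through unchanged, the straight-through estimator (STE).

```python
# Sketch: weight binarization with a straight-through estimator (STE),
# one common strategy for binary networks.
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        return torch.sign(w)             # +1 / -1 weights
    @staticmethod
    def backward(ctx, grad_out):
        return grad_out                  # pass the gradient straight through

class BinaryLinear(nn.Module):
    def __init__(self, in_f, out_f):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f) * 0.1)
    def forward(self, x):
        return x @ BinarizeSTE.apply(self.weight).t()

# In a GNN layer, the binary linear map would be applied after neighbor
# aggregation, e.g. h' = BinaryLinear(mean of neighbor features).
layer = BinaryLinear(16, 32)
h = torch.randn(10, 16)                  # 10 node embeddings
print(layer(h).shape)                    # torch.Size([10, 32])
```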
- A Meta-Learning Approach to the Optimal Power Flow Problem Under Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach.
The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z)
- Physical deep learning based on optimal control of dynamical systems [0.0]
In this study, we perform pattern recognition based on the optimal control of continuous-time dynamical systems.
As a key example, we apply the dynamics-based recognition approach to an optoelectronic delay system.
In contrast to conventional multilayer neural networks, which require a large number of weight parameters to be trained, this approach needs only a small number of trainable parameters.
arXiv Detail & Related papers (2020-12-16T06:38:01Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
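A canonical instance of the hybrid idea in the entry above is deep unfolding; the sketch below (not from the paper) unrolls the iterations of a model-based algorithm, ISTA for sparse recovery, into layers with learnable parameters initialized from the known forward operator.

```python
# Sketch: deep unfolding (LISTA-style), a classic model-based deep learning
# example. Each "layer" is one ISTA iteration for
#   min_x 0.5*||y - A x||^2 + alpha*||x||_1,
# with the iteration matrices made learnable.
import torch
import torch.nn as nn

class LISTA(nn.Module):
    def __init__(self, A, n_layers=8, alpha=0.1):
        super().__init__()
        m, n = A.shape
        L = torch.linalg.matrix_norm(A, ord=2).item() ** 2  # Lipschitz constant
        # Model-based initialization from the known forward operator A:
        self.W = nn.Parameter(A.t() / L)                    # (n, m)
        self.S = nn.Parameter(torch.eye(n) - A.t() @ A / L)
        self.theta = nn.Parameter(torch.full((n_layers,), alpha / L))
        self.n_layers = n_layers

    def forward(self, y):
        x = torch.zeros(y.shape[0], self.W.shape[0], device=y.device)
        for k in range(self.n_layers):
            z = y @ self.W.t() + x @ self.S.t()
            # Soft-thresholding with a learnable threshold per layer:
            x = torch.sign(z) * torch.clamp(z.abs() - self.theta[k], min=0.0)
        return x

A = torch.randn(20, 50) / 20 ** 0.5
model = LISTA(A)
y = torch.randn(4, 20)
print(model(y).shape)                                       # torch.Size([4, 50])
```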
- On transfer learning of neural networks using bi-fidelity data for uncertainty propagation [0.0]
We explore the application of transfer learning techniques using training data generated from both high- and low-fidelity models.
First, a neural network model mapping the inputs to the outputs of interest is trained on the low-fidelity data.
The high-fidelity data are then used either to adapt the parameters of the upper layer(s) of the low-fidelity network, or to train a simpler neural network that maps the output of the low-fidelity network to that of the high-fidelity model.
arXiv Detail & Related papers (2020-02-11T15:56:11Z)
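A sketch of the first bi-fidelity strategy described in the entry above (illustrative names and sizes, not the paper's code): pretrain on plentiful low-fidelity data, then freeze the lower layers and adapt only the upper layer on the scarce high-fidelity data.

```python
# Sketch (illustrative, not the paper's code): bi-fidelity transfer learning.
# Pretrain on cheap low-fidelity samples, then adapt only the upper layer on
# scarce high-fidelity samples, keeping the lower layers frozen.
import torch
import torch.nn as nn

lower = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh())
upper = nn.Linear(64, 1)
net = nn.Sequential(lower, upper)

# Phase 1: train `net` end-to-end on plentiful low-fidelity (x, y_lo) pairs.
opt_lo = torch.optim.Adam(net.parameters(), lr=1e-3)
# ... standard regression loop on the low-fidelity dataset ...

# Phase 2: freeze the lower layers; adapt the upper layer on the few
# high-fidelity (x, y_hi) pairs.
for p in lower.parameters():
    p.requires_grad = False
opt_hi = torch.optim.Adam(upper.parameters(), lr=1e-3)
# ... short regression loop on the high-fidelity dataset ...
```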
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.