Exploiting the capacity of deep networks only at training stage for nonlinear black-box system identification
- URL: http://arxiv.org/abs/2312.15969v2
- Date: Wed, 27 Dec 2023 06:32:29 GMT
- Title: Exploiting the capacity of deep networks only at training stage for nonlinear black-box system identification
- Authors: Vahid MohammadZadeh Eivaghi, Mahdi Aliyari Shooredeli
- Abstract summary: This study presents a novel training strategy that uses deep models only at the training stage.
The proposed objective function combines the objectives of the student and teacher models with a distance penalty between their learned latent representations.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To benefit from the modeling capacity of deep models in system
identification without paying their inference-time cost, this study presents a
novel training strategy that uses deep models only at the training stage. For
this purpose, two separate models with different structures and goals are
employed. The first one, called the teacher model, is a deep generative model
that learns the distribution of the system output(s); the second one, named the
student model, is a shallow basis function model fed by the system input(s) to
predict the system output(s). These two isolated paths must therefore reach the
same ultimate target. Since deep models perform well in modeling highly
nonlinear systems, aligning the representation spaces learned by the two models
makes the student model inherit the approximation power of the teacher model.
The proposed objective function is the sum of the student and teacher
objectives plus a distance penalty between the learned latent representations.
Simulation results on three nonlinear benchmarks show performance comparable to
that of the deep architectures examined on the same benchmarks. Algorithmic
transparency and structural efficiency are also achieved as byproducts.
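To make the training recipe concrete, below is a minimal sketch of what such a joint objective could look like, assuming a small autoencoder as the teacher over the output signal, a Gaussian basis-function student over the input signal, mean-squared-error terms, and a weight lam on the latent-alignment penalty. All class names, layer sizes, and loss choices are illustrative assumptions, not the authors' exact formulation.

```python
# Minimal illustrative sketch (not the authors' code): a deep teacher
# autoencoder over the measured outputs and a shallow Gaussian basis-function
# student over the inputs, trained with a joint loss that adds a penalty on
# the distance between the two latent representations.
import torch
import torch.nn as nn


class Teacher(nn.Module):
    """Deep generative-style model of the system output(s)."""
    def __init__(self, y_dim=1, z_dim=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(y_dim, 64), nn.Tanh(),
                                 nn.Linear(64, 64), nn.Tanh(),
                                 nn.Linear(64, z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, 64), nn.Tanh(),
                                 nn.Linear(64, y_dim))

    def forward(self, y):
        z = self.enc(y)                      # latent code of the output
        return self.dec(z), z


class Student(nn.Module):
    """Shallow basis-function model mapping input(s) to output(s)."""
    def __init__(self, u_dim=1, y_dim=1, n_basis=8):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_basis, u_dim))
        self.log_gamma = nn.Parameter(torch.zeros(n_basis))
        self.readout = nn.Linear(n_basis, y_dim)

    def forward(self, u):
        d2 = ((u.unsqueeze(1) - self.centers) ** 2).sum(-1)   # (batch, n_basis)
        phi = torch.exp(-torch.exp(self.log_gamma) * d2)      # basis activations = latent features
        return self.readout(phi), phi


def joint_loss(teacher, student, u, y, lam=1.0):
    """Sum of both model objectives plus the latent-alignment penalty.

    The teacher latent and the student basis activations are assumed to share
    the same dimension here; otherwise a projection layer would be needed.
    """
    y_rec, z_teacher = teacher(y)            # teacher reconstructs the measured output
    y_hat, z_student = student(u)            # student predicts the output from the input
    mse = nn.functional.mse_loss
    return mse(y_rec, y) + mse(y_hat, y) + lam * mse(z_student, z_teacher)
```

Under this reading, only the shallow student is evaluated at deployment, so inference cost does not depend on the teacher's depth; the teacher and the alignment penalty only shape the student's latent features during training.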
Related papers
- Learnable & Interpretable Model Combination in Dynamic Systems Modeling [0.0]
We discuss which types of models are usually combined and propose a model interface that is capable of expressing a variety of mixed equation-based models.
We propose a new wildcard topology that is capable of describing the generic connection between two combined models in an easy-to-interpret fashion.
The contributions of this paper are highlighted in a proof of concept: different connection topologies between two models are learned, interpreted and compared.
arXiv Detail & Related papers (2024-06-12T11:17:11Z)
- A Two-Phase Recall-and-Select Framework for Fast Model Selection [13.385915962994806]
We propose a two-phase (coarse-recall and fine-selection) model selection framework.
It aims to enhance the efficiency of selecting a robust model by leveraging the models' training performances on benchmark datasets.
It has been demonstrated that the proposed methodology facilitates the selection of a high-performing model about three times faster than conventional baseline methods.
arXiv Detail & Related papers (2024-03-28T14:44:44Z)
- BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND)
Our approach is simple but effective: it first uses multiple trained model weights and biases as inputs to train an autoencoder and a latent diffusion model.
Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
arXiv Detail & Related papers (2024-03-23T08:40:38Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- ModelGiF: Gradient Fields for Model Functional Distance [45.183991610710045]
We introduce Model Gradient Field (abbr. ModelGiF) to extract homogeneous representations from pre-trained models.
Our main assumption is that each pre-trained deep model uniquely determines a ModelGiF over the input space.
We validate the effectiveness of the proposed ModelGiF with a suite of testbeds, including task relatedness estimation, intellectual property protection, and model unlearning verification.
arXiv Detail & Related papers (2023-09-20T02:27:40Z)
- Layer-wise Linear Mode Connectivity [52.6945036534469]
Averaging neural network parameters is an intuitive way to combine the knowledge of two independent models (a minimal layer-wise averaging sketch is given after this list).
It is most prominently used in federated learning.
We analyse the performance of the models that result from averaging single layers, or groups of layers.
arXiv Detail & Related papers (2023-07-13T09:39:10Z)
- Inter-model Interpretability: Self-supervised Models as a Case Study [0.2578242050187029]
We build on a recent interpretability technique called Dissect to introduce inter-model interpretability.
We project 13 top-performing self-supervised models into a Learned Concepts Embedding space that reveals proximities among models from the perspective of learned concepts.
The experiment allowed us to categorize the models into three categories and revealed for the first time the types of visual concepts different tasks require.
arXiv Detail & Related papers (2022-07-24T22:50:18Z)
- UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes [91.24112204588353]
We introduce UViM, a unified approach capable of modeling a wide range of computer vision tasks.
In contrast to previous models, UViM has the same functional form for all tasks.
We demonstrate the effectiveness of UViM on three diverse and challenging vision tasks.
arXiv Detail & Related papers (2022-05-20T17:47:59Z)
- Learning Dynamics Models for Model Predictive Agents [28.063080817465934]
Model-Based Reinforcement Learning involves learning a dynamics model from data, and then using this model to optimise behaviour.
This paper sets out to disambiguate the role of different design choices for learning dynamics models, by comparing their performance to planning with a ground-truth model.
arXiv Detail & Related papers (2021-09-29T09:50:25Z)
- S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards exploiting dynamic structures that are capable of simultaneously exploiting both modular and temporal structures.
We find our models to be robust to the number of available views and better able to generalize to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)
- Hybrid modeling: Applications in real-time diagnosis [64.5040763067757]
We outline a novel hybrid modeling approach that combines machine-learning-inspired models and physics-based models.
We use such models for real-time diagnosis applications.
arXiv Detail & Related papers (2020-03-04T00:44:57Z)
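For the Layer-wise Linear Mode Connectivity entry above, the following is a minimal sketch of what layer-wise parameter averaging of two independently trained models can look like; the function name, the interpolation weight alpha, and the PyTorch state-dict handling are illustrative assumptions, not the cited paper's implementation.

```python
# Minimal illustrative sketch (not the cited paper's code): layer-wise averaging
# of the parameters of two independently trained models with identical
# architectures, as used when fusing models or in federated learning.
import torch


def average_layerwise(model_a, model_b, alpha=0.5):
    """Return a state dict interpolating the two models' parameters layer by layer."""
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    averaged = {}
    for name, tensor_a in sd_a.items():
        if tensor_a.dtype.is_floating_point:
            averaged[name] = alpha * tensor_a + (1.0 - alpha) * sd_b[name]
        else:
            # integer buffers (e.g. batch-norm counters) are copied, not averaged
            averaged[name] = tensor_a
    return averaged


# usage: fused_model.load_state_dict(average_layerwise(model_a, model_b, alpha=0.5))
```

Averaging single layers, or groups of layers, amounts to applying this interpolation only to the selected parameter names while copying the rest from one of the models.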