Sparse deep neural networks for modeling aluminum electrolysis dynamics
- URL: http://arxiv.org/abs/2209.05832v1
- Date: Tue, 13 Sep 2022 09:11:50 GMT
- Title: Sparse deep neural networks for modeling aluminum electrolysis dynamics
- Authors: Erlend Torje Berg Lundby, Adil Rasheed, Ivar Johan Halvorsen, Jan
Tommy Gravdahl
- Abstract summary: We train sparse neural networks to model the system dynamics of an aluminum electrolysis simulator.
The sparse model structure has significantly lower model complexity than a corresponding dense neural network.
The empirical study shows that the sparse models generalize better from small training sets than dense neural networks.
- Score: 0.5257115841810257
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial neural networks have a broad array of applications today due to
their high degree of flexibility and ability to model nonlinear functions from
data. However, the trustworthiness of neural networks is limited due to their
black-box nature, their poor ability to generalize from small datasets, and
their inconsistent convergence during training. Aluminum electrolysis is a
complex nonlinear process with many interrelated sub-processes. Artificial
neural networks can potentially be well suited for modeling the aluminum
electrolysis process, but the safety-critical nature of this process requires
trustworthy models. In this work, sparse neural networks are trained to model
the system dynamics of an aluminum electrolysis simulator. The sparse model
structure has significantly lower model complexity than a
corresponding dense neural network. We argue that this makes the model more
interpretable. Furthermore, the empirical study shows that the sparse models
generalize better from small training sets than dense neural networks.
Moreover, training an ensemble of sparse neural networks with different
parameter initializations shows that the models converge to similar model
structures with similar learned input features.
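A minimal sketch of the kind of sparse dynamics model the abstract describes is given below, assuming sparsity is obtained with an L1 weight penalty followed by magnitude pruning; the abstract does not state the actual sparsification scheme, and the network sizes, state/input dimensions, and thresholds are illustrative only.

```python
# Hypothetical sketch: sparsity via L1-penalized training plus magnitude pruning.
# The paper's exact sparsification procedure is not given in this abstract.
import torch
import torch.nn as nn

class DynamicsNet(nn.Module):
    """Predicts the next state x_{k+1} from the current state x_k and input u_k."""
    def __init__(self, n_states=8, n_inputs=5, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_states + n_inputs, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, n_states),
        )

    def forward(self, x, u):
        return self.net(torch.cat([x, u], dim=-1))

def train_sparse(model, batches, l1_weight=1e-3, epochs=200, lr=1e-3):
    """batches: iterable of (x_k, u_k, x_next) tensors sampled from the simulator."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, u, x_next in batches:
            pred = model(x, u)
            mse = torch.mean((pred - x_next) ** 2)
            l1 = sum(p.abs().sum() for p in model.parameters())
            loss = mse + l1_weight * l1
            opt.zero_grad()
            loss.backward()
            opt.step()
    # Prune: zero out near-zero weights to expose the sparse model structure.
    with torch.no_grad():
        for p in model.parameters():
            p[p.abs() < 1e-3] = 0.0
    return model
```

Counting the nonzero weights that remain after pruning gives the reduced model complexity the abstract refers to; retraining from several random initializations would mimic the ensemble experiment on converged model structures.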
Related papers
- Physics-Informed Neural Networks with Hard Linear Equality Constraints [9.101849365688905]
This work proposes a novel physics-informed neural network, KKT-hPINN, which rigorously guarantees hard linear equality constraints (a generic projection-based sketch of this idea appears after this list).
Experiments on Aspen models of a stirred-tank reactor unit, an extractive distillation subsystem, and a chemical plant demonstrate that this model can further enhance the prediction accuracy.
arXiv Detail & Related papers (2024-02-11T17:40:26Z) - On the Trade-off Between Efficiency and Precision of Neural Abstraction [62.046646433536104]
Neural abstractions have been recently introduced as formal approximations of complex, nonlinear dynamical models.
We employ formal inductive synthesis procedures to generate neural abstractions that result in dynamical models with these semantics.
arXiv Detail & Related papers (2023-07-28T13:22:32Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Quadratic models for understanding catapult dynamics of neural networks [15.381097076708535]
We show that recently proposed Neural Quadratic Models can exhibit the "catapult phase" that arises when training such models with large learning rates.
Our analysis further demonstrates that quadratic models can be an effective tool for analysis of neural networks.
arXiv Detail & Related papers (2022-05-24T05:03:06Z) - Physics guided neural networks for modelling of non-linear dynamics [0.0]
This work demonstrates that injection of partially known information at an intermediate layer in a deep neural network can improve model accuracy, reduce model uncertainty, and yield improved convergence during the training.
The value of these physics-guided neural networks has been demonstrated by learning the dynamics of a wide variety of nonlinear dynamical systems represented by five well-known equations in nonlinear systems theory.
arXiv Detail & Related papers (2022-05-13T19:06:36Z) - EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce a new class of physics-informed neural networks-EINN-crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models as well as the data-driven expressibility afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Creating Powerful and Interpretable Models withRegression Networks [2.2049183478692584]
We propose a novel architecture, Regression Networks, which combines the power of neural networks with the understandability of regression analysis.
We demonstrate that the models exceed the state-of-the-art performance of interpretable models on several benchmark datasets.
arXiv Detail & Related papers (2021-07-30T03:37:00Z) - Sobolev training of thermodynamic-informed neural networks for smoothed
elasto-plasticity models with level set hardening [0.0]
We introduce a deep learning framework designed to train smoothed elastoplasticity models with interpretable components.
By recasting the yield function as an evolving level set, we introduce a machine learning approach to predict the solutions of the Hamilton-Jacobi equation.
arXiv Detail & Related papers (2020-10-15T22:43:32Z) - Measuring Model Complexity of Neural Networks with Curve Activation
Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L_1$ and $L_2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z) - Flexible Transmitter Network [84.90891046882213]
Current neural networks are mostly built upon the MP model, which usually formulates the neuron as executing an activation function on the real-valued weighted aggregation of signals received from other neurons.
We propose the Flexible Transmitter (FT) model, a novel bio-plausible neuron model with flexible synaptic plasticity.
We present the Flexible Transmitter Network (FTNet), which is built on the most common fully-connected feed-forward architecture.
arXiv Detail & Related papers (2020-04-08T06:55:12Z)