Sparse deep neural networks for modeling aluminum electrolysis dynamics
- URL: http://arxiv.org/abs/2209.05832v1
- Date: Tue, 13 Sep 2022 09:11:50 GMT
- Title: Sparse deep neural networks for modeling aluminum electrolysis dynamics
- Authors: Erlend Torje Berg Lundby, Adil Rasheed, Ivar Johan Halvorsen, Jan
Tommy Gravdahl
- Abstract summary: We train sparse neural networks to model the system dynamics of an aluminum electrolysis simulator.
The sparse model structure has significantly lower model complexity than a corresponding dense neural network.
The empirical study shows that the sparse models generalize better from small training sets than dense neural networks.
- Score: 0.5257115841810257
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial neural networks have a broad array of applications today due to
their high degree of flexibility and ability to model nonlinear functions from
data. However, the trustworthiness of neural networks is limited due to their
black-box nature, their poor ability to generalize from small datasets, and
their inconsistent convergence during training. Aluminum electrolysis is a
complex nonlinear process with many interrelated sub-processes. Artificial
neural networks can potentially be well suited for modeling the aluminum
electrolysis process, but the safety-critical nature of this process requires
trustworthy models. In this work, sparse neural networks are trained to model
the system dynamics of an aluminum electrolysis simulator. The sparse model
structure has significantly lower model complexity than a
corresponding dense neural network. We argue that this makes the model more
interpretable. Furthermore, the empirical study shows that the sparse models
generalize better from small training sets than dense neural networks.
Moreover, training an ensemble of sparse neural networks with different
parameter initializations shows that the models converge to similar model
structures with similar learned input features.
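A minimal sketch of the kind of sparse dynamics model the abstract describes is given below, assuming sparsity is obtained with an L1 weight penalty followed by magnitude pruning; the abstract does not state the actual sparsification scheme, and the network sizes, state/input dimensions, and thresholds are illustrative only.

```python
# Hypothetical sketch: sparsity via L1-penalized training plus magnitude pruning.
# The paper's exact sparsification procedure is not given in this abstract.
import torch
import torch.nn as nn

class DynamicsNet(nn.Module):
    """Predicts the next state x_{k+1} from the current state x_k and input u_k."""
    def __init__(self, n_states=8, n_inputs=5, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_states + n_inputs, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, n_states),
        )

    def forward(self, x, u):
        return self.net(torch.cat([x, u], dim=-1))

def train_sparse(model, batches, l1_weight=1e-3, epochs=200, lr=1e-3):
    """batches: iterable of (x_k, u_k, x_next) tensors sampled from the simulator."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, u, x_next in batches:
            pred = model(x, u)
            mse = torch.mean((pred - x_next) ** 2)
            l1 = sum(p.abs().sum() for p in model.parameters())
            loss = mse + l1_weight * l1
            opt.zero_grad()
            loss.backward()
            opt.step()
    # Prune: zero out near-zero weights to expose the sparse model structure.
    with torch.no_grad():
        for p in model.parameters():
            p[p.abs() < 1e-3] = 0.0
    return model
```

Counting the nonzero weights that remain after pruning gives the reduced model complexity the abstract refers to; retraining from several random initializations would mimic the ensemble experiment on converged model structures.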
Related papers
- Physics-Informed Neural Networks with Hard Linear Equality Constraints [9.101849365688905]
This work proposes a novel physics-informed neural network, KKT-hPINN, which rigorously guarantees hard linear equality constraints (a generic projection-based sketch of this idea appears after this list).
Experiments on Aspen models of a stirred-tank reactor unit, an extractive distillation subsystem, and a chemical plant demonstrate that this model can further enhance the prediction accuracy.
arXiv Detail & Related papers (2024-02-11T17:40:26Z) - On the Trade-off Between Efficiency and Precision of Neural Abstraction [62.046646433536104]
Neural abstractions have been recently introduced as formal approximations of complex, nonlinear dynamical models.
We employ formal inductive synthesis procedures to generate neural abstractions that result in dynamical models with these semantics.
arXiv Detail & Related papers (2023-07-28T13:22:32Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Quadratic models for understanding catapult dynamics of neural networks [15.381097076708535]
We show that recently proposed Neural Quadratic Models can exhibit the "catapult phase" that arises when training such models with large learning rates.
Our analysis further demonstrates that quadratic models can be an effective tool for analysis of neural networks.
arXiv Detail & Related papers (2022-05-24T05:03:06Z) - Physics guided neural networks for modelling of non-linear dynamics [0.0]
This work demonstrates that injection of partially known information at an intermediate layer in a deep neural network can improve model accuracy, reduce model uncertainty, and yield improved convergence during the training.
The value of these physics-guided neural networks has been demonstrated by learning the dynamics of a wide variety of nonlinear dynamical systems represented by five well-known equations in nonlinear systems theory.
arXiv Detail & Related papers (2022-05-13T19:06:36Z) - EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce a new class of physics-informed neural networks-EINN-crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models as well as the data-driven expressibility afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Creating Powerful and Interpretable Models withRegression Networks [2.2049183478692584]
We propose a novel architecture, Regression Networks, which combines the power of neural networks with the understandability of regression analysis.
We demonstrate that the models exceed the state-of-the-art performance of interpretable models on several benchmark datasets.
arXiv Detail & Related papers (2021-07-30T03:37:00Z) - Sobolev training of thermodynamic-informed neural networks for smoothed
elasto-plasticity models with level set hardening [0.0]
We introduce a deep learning framework designed to train smoothed elastoplasticity models with interpretable components.
By recasting the yield function as an evolving level set, we introduce a machine learning approach to predict the solutions of the Hamilton-Jacobi equation.
arXiv Detail & Related papers (2020-10-15T22:43:32Z) - Measuring Model Complexity of Neural Networks with Curve Activation
Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L_1$ and $L_2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z) - Flexible Transmitter Network [84.90891046882213]
Current neural networks are mostly built upon the MP model, which usually formulates the neuron as executing an activation function on the real-valued weighted aggregation of signals received from other neurons.
We propose the Flexible Transmitter (FT) model, a novel bio-plausible neuron model with flexible synaptic plasticity.
We present the Flexible Transmitter Network (FTNet), which is built on the most common fully-connected feed-forward architecture.
arXiv Detail & Related papers (2020-04-08T06:55:12Z)