Loop Polarity Analysis to Avoid Underspecification in Deep Learning
- URL: http://arxiv.org/abs/2309.10211v2
- Date: Wed, 29 May 2024 19:33:12 GMT
- Title: Loop Polarity Analysis to Avoid Underspecification in Deep Learning
- Authors: Donald Martin, Jr., David Kinney
- Abstract summary: In this paper, we turn to loop polarity analysis as a tool for specifying the causal structure of a data-generating process.
We show how measuring the polarity of the different feedback loops that compose a system can lead to more robust inferences on the part of neural networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning is a powerful set of techniques for detecting complex patterns in data. However, when the causal structure of that process is underspecified, deep learning models can be brittle, lacking robustness to shifts in the distribution of the data-generating process. In this paper, we turn to loop polarity analysis as a tool for specifying the causal structure of a data-generating process, in order to encode a more robust understanding of the relationship between system structure and system behavior within the deep learning pipeline. We use simulated epidemic data based on an SIR model to demonstrate how measuring the polarity of the different feedback loops that compose a system can lead to more robust inferences on the part of neural networks, improving the out-of-distribution performance of a deep learning model and infusing a system-dynamics-inspired approach into the machine learning development pipeline.
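As a rough illustration of the idea in the abstract, the sketch below simulates a basic SIR model and tracks the polarity of the infection feedback loop over time. It is not the authors' code or their measurement procedure. It assumes the common system-dynamics convention that a loop's polarity is the sign of the partial derivative of a stock's net rate with respect to the stock itself, here d(dI/dt)/dI = beta*S/N - gamma. The function names, parameter values, and the forward-Euler integration are all hypothetical choices made for the example.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt=0.1, steps=1000):
    """Forward-Euler simulation of a basic SIR model (illustrative only).

    Returns a list of (S, I, R) tuples, one per time step.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


def infection_loop_polarity(beta, gamma, s, n):
    """Sign of d(dI/dt)/dI = beta*s/n - gamma (assumed polarity convention).

    +1: the reinforcing contagion loop dominates (infections accelerate).
    -1: the balancing recovery loop dominates (infections decay).
     0: the two loops cancel exactly.
    """
    gain = beta * s / n - gamma
    return (gain > 0) - (gain < 0)


# Hypothetical parameters: basic reproduction number beta/gamma = 3.
BETA, GAMMA, N = 0.3, 0.1, 1000.0
traj = simulate_sir(BETA, GAMMA, s0=990, i0=10, r0=0)

# Polarity of the infection loop at every step of the simulated epidemic.
polarities = [infection_loop_polarity(BETA, GAMMA, s, N) for s, _, _ in traj]

# First step at which the loop turns from reinforcing to balancing, if any.
flip_step = polarities.index(-1) if -1 in polarities else None
```

With these parameters the loop starts out reinforcing and becomes balancing once the susceptible pool is sufficiently depleted. A polarity series like `polarities` could be attached to each simulated time point as an additional structural label, which is one plausible way to expose loop-dominance information to a learning pipeline.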
Related papers
- Machine learning for structural design models of continuous beam systems via influence zones [3.284878354988896]
This work develops a machine learned structural design model for continuous beam systems from the inverse problem perspective.
The aim of this approach is to conceptualise a non-iterative structural design model that predicts cross-section requirements for continuous beam systems of arbitrary system size.
arXiv Detail & Related papers (2024-03-14T14:53:18Z) - Neural Harmonium: An Interpretable Deep Structure for Nonlinear Dynamic
System Identification with Application to Audio Processing [4.599180419117645]
Interpretability helps us understand a model's ability to generalize and reveal its limitations.
We introduce a causal interpretable deep structure for modeling dynamic systems.
Our proposed model makes use of the harmonic analysis by modeling the system in a time-frequency domain.
arXiv Detail & Related papers (2023-10-10T21:32:15Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent Signals: Application to Hyperspectral Imaging [71.57324258813675]
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies.
A neural network is employed to act as structure prior and reveal the underlying signal interdependencies.
Deep unrolling and Deep equilibrium based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures.
arXiv Detail & Related papers (2022-03-29T21:00:39Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Soft Sensing Model Visualization: Fine-tuning Neural Network from What Model Learned [5.182947614447375]
Data-driven soft-sensing modeling has become more prevalent in wafer process diagnostics.
Deep learning has been used in soft sensing systems, with promising performance on highly nonlinear and dynamic time-series data.
In this paper, we propose a deep learning-based model for defective wafer detection using a highly imbalanced dataset.
arXiv Detail & Related papers (2021-11-12T23:32:06Z) - Using scientific machine learning for experimental bifurcation analysis of dynamic systems [2.204918347869259]
This study focuses on training universal differential equation (UDE) models for physical nonlinear dynamical systems with limit cycles.
We consider examples where training data is generated by numerical simulations, and we also apply the proposed modelling concept to physical experiments.
We use both neural networks and Gaussian processes as universal approximators alongside the mechanistic models to give a critical assessment of the accuracy and robustness of the UDE modelling approach.
arXiv Detail & Related papers (2021-10-22T15:43:03Z) - Model discovery in the sparse sampling regime [0.0]
We show how deep learning can improve model discovery of partial differential equations.
As a result, deep learning-based model discovery allows the underlying equations to be recovered.
We illustrate our claims on both synthetic and experimental sets.
arXiv Detail & Related papers (2021-05-02T06:27:05Z) - An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the group O(d).
This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z) - Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks.
Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities.
Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z)