Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach
- URL: http://arxiv.org/abs/2007.01290v3
- Date: Tue, 20 Oct 2020 16:56:32 GMT
- Title: Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach
- Authors: Luofeng Liao, You-Lin Chen, Zhuoran Yang, Bo Dai, Zhaoran Wang, Mladen
Kolar
- Abstract summary: We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using stochastic gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
- Score: 144.21892195917758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structural equation models (SEMs) are widely used in sciences, ranging from
economics to psychology, to uncover causal relationships underlying a complex
system under consideration and estimate structural parameters of interest. We
study estimation in a class of generalized SEMs where the object of interest is
defined as the solution to a linear operator equation. We formulate the linear
operator equation as a min-max game, where both players are parameterized by
neural networks (NNs), and learn the parameters of these neural networks using
stochastic gradient descent. We consider both 2-layer and multi-layer NNs
with ReLU activation functions and prove global convergence in an
overparametrized regime, where the number of neurons is diverging. The results
are established using techniques from online learning and local linearization
of NNs, and improve on the current state of the art in several respects. For the
first time, we provide a tractable estimation procedure for SEMs based on NNs
with provable convergence and without the need for sample splitting.
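The min-max formulation mentioned in the abstract follows a standard adversarial (Fenchel-dual) reformulation of conditional-moment problems. As an illustration (not the paper's exact construction), for instrumental-variable regression, a leading special case in which the structural function $h^\ast$ satisfies $\mathbb{E}[Y - h^\ast(X) \mid Z] = 0$, the saddle-point problem reads

```latex
\min_{h \in \mathcal{H}} \; \max_{f \in \mathcal{F}} \;
  \mathbb{E}\bigl[(Y - h(X))\, f(Z)\bigr] \;-\; \tfrac{1}{2}\,\mathbb{E}\bigl[f(Z)^{2}\bigr],
```

where $\mathcal{H}$ and $\mathcal{F}$ are classes of (over-parameterized) ReLU networks and the inner player $f$ acts as an adversarial test function. Below is a minimal sketch of the resulting training loop, assuming PyTorch, 2-layer ReLU networks, and simultaneous stochastic gradient descent/ascent; the function names and hyperparameters are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn


def two_layer_relu(in_dim, width=512):
    # Over-parameterized 2-layer ReLU network with a scalar output.
    return nn.Sequential(nn.Linear(in_dim, width), nn.ReLU(), nn.Linear(width, 1))


def adversarial_sem_fit(x, y, z, n_steps=5000, batch_size=256, lr=1e-3, width=512):
    """Approximately solve min_h max_f E[(y - h(x)) f(z)] - 0.5 E[f(z)^2].

    x, z: float tensors of shape (n, d_x), (n, d_z); y: float tensor of shape (n, 1).
    """
    h = two_layer_relu(x.shape[1], width)  # structural function (minimizing player)
    f = two_layer_relu(z.shape[1], width)  # adversarial test function (maximizing player)
    opt_h = torch.optim.SGD(h.parameters(), lr=lr)
    opt_f = torch.optim.SGD(f.parameters(), lr=lr)
    n = x.shape[0]
    for _ in range(n_steps):
        idx = torch.randint(0, n, (batch_size,))
        residual = y[idx] - h(x[idx])        # Y - h(X) on the minibatch
        test = f(z[idx])                     # adversary's test function f(Z)
        loss = (residual * test).mean() - 0.5 * (test ** 2).mean()
        opt_h.zero_grad()
        opt_f.zero_grad()
        loss.backward()
        for p in f.parameters():             # ascent for f: flip the sign of its gradients
            p.grad.neg_()
        opt_h.step()
        opt_f.step()
    return h
```

In practice one would track the empirical moment violation $\mathbb{E}[(Y - h(X)) f(Z)]$ rather than the raw saddle-point value, which is not itself a goodness-of-fit measure.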
Related papers
- Calibrating Neural Networks' parameters through Optimal Contraction in a Prediction Problem [0.0]
The paper details how a recurrent neural network (RNN) can be transformed into a contraction in a domain where its parameters are linear.
It then demonstrates that a prediction problem modeled through an RNN, with a specific regularization term in the loss function, can have its first-order conditions expressed analytically.
We establish that, if certain conditions are met, optimal parameters exist, and can be found through a straightforward algorithm to any desired precision.
arXiv Detail & Related papers (2024-06-15T18:08:04Z) - Neural Parameter Regression for Explicit Representations of PDE Solution Operators [22.355460388065964]
We introduce Neural Parameter Regression (NPR), a novel framework specifically developed for learning solution operators in Partial Differential Equations (PDEs).
NPR employs Physics-Informed Neural Network (PINN, Raissi et al., 2021) techniques to regress Neural Network (NN) parameters.
The framework shows remarkable adaptability to new initial and boundary conditions, allowing for rapid fine-tuning and inference.
arXiv Detail & Related papers (2024-03-19T14:30:56Z) - Analyzing Populations of Neural Networks via Dynamical Model Embedding [10.455447557943463]
A core challenge in the interpretation of deep neural networks is identifying commonalities between the underlying algorithms implemented by distinct networks trained for the same task.
Motivated by this problem, we introduce DYNAMO, an algorithm that constructs a low-dimensional manifold where each point corresponds to a neural network model, and two points are nearby if the corresponding neural networks enact similar high-level computational processes.
DYNAMO takes as input a collection of pre-trained neural networks and outputs a meta-model that emulates the dynamics of the hidden states as well as the outputs of any model in the collection.
arXiv Detail & Related papers (2023-02-27T19:00:05Z) - A Recursively Recurrent Neural Network (R2N2) Architecture for Learning
Iterative Algorithms [64.3064050603721]
We generalize the Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields similar iterations to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
arXiv Detail & Related papers (2022-11-22T16:30:33Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent
Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Multipole Graph Neural Operator for Parametric Partial Differential
Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z) - Measuring Model Complexity of Neural Networks with Curve Activation
Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation functions.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L_1$ and $L_2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)