DebiNet: Debiasing Linear Models with Nonlinear Overparameterized Neural
Networks
- URL: http://arxiv.org/abs/2011.00417v2
- Date: Mon, 25 Jan 2021 00:50:23 GMT
- Title: DebiNet: Debiasing Linear Models with Nonlinear Overparameterized Neural
Networks
- Authors: Shiyun Xu, Zhiqi Bu
- Abstract summary: We incorporate over-parameterized neural networks into semi-parametric models to bridge the gap between inference and prediction.
We show the theoretical foundations that make this possible and demonstrate with numerical experiments.
We propose a framework, DebiNet, in which we plug arbitrary feature selection methods into our semi-parametric neural network.
- Score: 11.04121146441257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent years have witnessed strong empirical performance of
over-parameterized neural networks on various tasks and many advances in the
theory, e.g. the universal approximation property and provable convergence to a global
minimum. In this paper, we incorporate over-parameterized neural networks into
semi-parametric models to bridge the gap between inference and prediction,
especially in the high dimensional linear problem. By doing so, we can exploit
a wide class of networks to approximate the nuisance functions and to estimate
the parameters of interest consistently. Therefore, we may offer the best of
two worlds: the universal approximation ability from neural networks and the
interpretability of the classic ordinary linear model, leading to both valid
inference and accurate prediction. We show the theoretical foundations that
make this possible and demonstrate with numerical experiments. Furthermore, we
propose a framework, DebiNet, in which we plug arbitrary feature selection
methods into our semi-parametric neural network. DebiNet can debias
regularized estimators (e.g. the Lasso) and perform well in terms of both
post-selection inference and generalization error.
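The abstract does not spell out the estimation steps, so below is a minimal, hypothetical sketch of the kind of two-stage, partialling-out debiasing it describes: a Lasso selects the active features, a wide (over-parameterized) network absorbs the nuisance component, and ordinary least squares on the residuals recovers debiased coefficients for the selected features. The simulated data, the use of scikit-learn's MLPRegressor as the over-parameterized network, and all hyperparameters are illustrative assumptions, not the authors' exact DebiNet algorithm.

# Illustrative sketch only -- not the authors' exact DebiNet procedure.
# A partialling-out (double-ML style) debiasing of a Lasso fit, with a wide
# MLP standing in for the over-parameterized nuisance estimator.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n, p = 500, 100
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                       # sparse ground truth
y = X @ beta + rng.normal(scale=0.5, size=n)

# Step 1: select features with a regularized (hence biased) estimator.
lasso = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(lasso.coef_)            # estimated support
rest = np.setdiff1d(np.arange(p), selected)       # nuisance features

# Step 2: approximate the nuisance functions with a wide network by
# regressing y and each selected column on the remaining features.
def nuisance_fit(Z, target):
    net = MLPRegressor(hidden_layer_sizes=(512,), max_iter=300, random_state=0)
    return net.fit(Z, target).predict(Z)

y_res = y - nuisance_fit(X[:, rest], y)
X_res = np.column_stack(
    [X[:, j] - nuisance_fit(X[:, rest], X[:, j]) for j in selected])

# Step 3: OLS on the residuals gives debiased coefficients for the
# selected features, on which standard post-selection inference can run.
ols = LinearRegression().fit(X_res, y_res)
print("selected features :", selected)
print("Lasso (biased)    :", lasso.coef_[selected])
print("debiased estimates:", ols.coef_)

Per the abstract, Step 1 is the pluggable component: any feature selection method can replace the Lasso, while the network only has to approximate the nuisance functions well enough for the final linear stage to remain consistent.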
Related papers
- Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks [0.5827521884806072]
Large neural networks trained on large datasets have become the dominant paradigm in machine learning.
This thesis develops scalable methods to equip neural networks with model uncertainty.
arXiv Detail & Related papers (2024-04-29T23:38:58Z) - Generalization and Estimation Error Bounds for Model-based Neural
Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that allow one to construct model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z) - Over-parameterised Shallow Neural Networks with Asymmetrical Node
Scaling: Global Convergence Guarantees and Feature Learning [23.47570704524471]
We consider optimisation of large and shallow neural networks via gradient flow, where the output of each hidden node is scaled by some positive parameter.
We prove that, for large neural networks, with high probability, gradient flow converges to a global minimum and can learn features, unlike in the NTK regime.
arXiv Detail & Related papers (2023-02-02T10:40:06Z) - Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z) - ExSpliNet: An interpretable and expressive spline-based neural network [0.3867363075280544]
We present ExSpliNet, an interpretable and expressive neural network model.
We give a probabilistic interpretation of the model and show its universal approximation properties.
arXiv Detail & Related papers (2022-05-03T14:06:36Z) - Acceleration techniques for optimization over trained neural network
ensembles [1.0323063834827415]
We study optimization problems where the objective function is modeled through feedforward neural networks with rectified linear unit activation.
We present a mixed-integer linear program based on existing popular big-$M$ formulations for optimizing over a single neural network (the standard big-$M$ ReLU encoding is sketched in a note after this list).
arXiv Detail & Related papers (2021-12-13T20:50:54Z) - Measurement error models: from nonparametric methods to deep neural
networks [3.1798318618973362]
We propose an efficient neural network design for estimating measurement error models.
We use a fully connected feed-forward neural network to approximate the regression function $f(x)$.
We conduct an extensive numerical study to compare the neural network approach with classical nonparametric methods.
arXiv Detail & Related papers (2020-07-15T06:05:37Z) - Generalization bound of globally optimal non-convex neural network
training: Transportation map estimation by infinite dimensional Langevin
dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error.
Existing frameworks such as mean field theory and neural tangent kernel theory for neural network optimization analysis typically require taking the infinite-width limit of the network to show its global convergence.
arXiv Detail & Related papers (2020-07-11T18:19:50Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
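A note on the big-$M$ formulation mentioned in the "Acceleration techniques for optimization over trained neural network ensembles" entry above (the paper's own formulation is not reproduced here; this is the textbook bounded encoding such MILPs build on): for a single ReLU unit $y = \max(0, w^\top x + b)$ with known pre-activation bounds $L \le w^\top x + b \le U$ and $L < 0 < U$, introduce a binary variable $z \in \{0, 1\}$ and the linear constraints $y \ge w^\top x + b$, $y \ge 0$, $y \le w^\top x + b - L(1 - z)$, and $y \le U z$. When $z = 1$ the unit is active and the constraints force $y = w^\top x + b$; when $z = 0$ they force $y = 0$. Applying this encoding neuron by neuron turns optimization over a trained ReLU network into a mixed-integer linear program.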
This list is automatically generated from the titles and abstracts of the papers in this site.