Optimistic Estimate Uncovers the Potential of Nonlinear Models
- URL: http://arxiv.org/abs/2307.08921v1
- Date: Tue, 18 Jul 2023 01:37:57 GMT
- Title: Optimistic Estimate Uncovers the Potential of Nonlinear Models
- Authors: Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo,
Zhi-Qin John Xu
- Abstract summary: We propose an optimistic estimate to evaluate the best possible fitting performance of nonlinear models.
We estimate the optimistic sample sizes for matrix factorization models, deep models, and deep neural networks (DNNs) with fully-connected or convolutional architecture.
- Score: 3.0041514772139166
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an optimistic estimate to evaluate the best possible fitting
performance of nonlinear models. It yields an optimistic sample size that
quantifies the smallest possible sample size to fit/recover a target function
using a nonlinear model. We estimate the optimistic sample sizes for matrix
factorization models, deep models, and deep neural networks (DNNs) with
fully-connected or convolutional architecture. For each nonlinear model, our
estimates predict a specific subset of targets that can be fitted at
overparameterization, which are confirmed by our experiments. Our optimistic
estimate reveals two special properties of the DNN models -- free
expressiveness in width and costly expressiveness in connection. These
properties suggest the following architecture design principles of DNNs: (i)
feel free to add neurons/kernels; (ii) restrain from connecting neurons.
Overall, our optimistic estimate theoretically unveils the vast potential of
nonlinear models in fitting at overparameterization. Based on this framework,
we anticipate gaining a deeper understanding of how and why numerous nonlinear
models such as DNNs can effectively realize their potential in practice in the
near future.
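To make the "free expressiveness in width" property concrete, here is a minimal numerical sketch of our own construction (not code from the paper; the two-layer tanh model, the input grid, and the k_star naming are illustrative assumptions). It estimates the rank of the parameter-to-function map of a width-W network at a target that uses only k_star neurons; in the paper's framework, a rank of this kind plays the role of the optimistic sample size.

```python
import numpy as np

def grad_features(theta, xs):
    """Rows: inputs x; columns: d f(x; theta) / d theta_j for the
    two-layer model f(x) = sum_i a_i * tanh(w_i * x), theta = (a, w)."""
    W = theta.size // 2
    a, w = theta[:W], theta[W:]
    t = np.tanh(np.outer(xs, w))             # shape (len(xs), W)
    d_a = t                                  # d f / d a_i = tanh(w_i x)
    d_w = a * xs[:, None] * (1.0 - t ** 2)   # d f / d w_i = a_i x sech^2(w_i x)
    return np.concatenate([d_a, d_w], axis=1)

rng = np.random.default_rng(0)
xs = np.linspace(-3.0, 3.0, 200)
k_star = 3  # number of neurons the target function actually uses
for W in (3, 10, 50):  # total width; increasingly overparameterized
    a = np.concatenate([rng.standard_normal(k_star), np.zeros(W - k_star)])
    w = np.concatenate([rng.standard_normal(k_star), np.zeros(W - k_star)])
    theta = np.concatenate([a, w])
    rank = np.linalg.matrix_rank(grad_features(theta, xs))
    print(f"width {W}: {2 * W} parameters, model rank {rank}")
```

In this toy run the rank stays at 2 * k_star = 6 for widths 3, 10, and 50 while the parameter count grows linearly with width, which is the "feel free to add neurons/kernels" principle in miniature.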
Related papers
- Bayesian Entropy Neural Networks for Physics-Aware Prediction [14.705526856205454]
We introduce BENN, a framework designed to impose constraints on Bayesian Neural Network (BNN) predictions.
BENN is capable of constraining not only the predicted values but also their derivatives and variances, ensuring a more robust and reliable model output.
Results highlight significant improvements over traditional BNNs and showcase competitive performance relative to contemporary constrained deep learning methods.
arXiv Detail & Related papers (2024-07-01T07:00:44Z) - The Convex Landscape of Neural Networks: Characterizing Global Optima
and Stationary Points via Lasso Models [75.33431791218302]
Training objectives of Deep Neural Network (DNN) models are highly non-convex.
In this paper we examine convex reformulations of neural network training.
We show that all stationary points of the non-convex training objective can be characterized as global optima of subsampled convex (Lasso) programs.
arXiv Detail & Related papers (2023-12-19T23:04:56Z) - Deep Neural Networks for Semiparametric Frailty Models via H-likelihood [0.0]
We propose a new deep neural network based frailty model (DNN-FM) for prediction of time-to-event data.
Joint maximization of the new h-likelihood provides maximum likelihood estimators for the fixed parameters and best unbiased predictors for the random frailties.
arXiv Detail & Related papers (2023-07-13T06:46:51Z) - Linear Stability Hypothesis and Rank Stratification for Nonlinear Models [3.0041514772139166]
We propose a rank stratification for general nonlinear models to uncover a model rank as an "effective size of parameters".
By these results, the model rank of a target function predicts the minimal training data size for its successful recovery, as the sketch below illustrates.
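A hedged toy illustration of rank stratification (our own sketch, not the authors' code; the width-k factorization model and the r(m+n-r) stratum dimension are our assumptions): sweeping the rank r of the target inside one fixed overparameterized factorization model, the rank of the model's Jacobian at the target tracks r(m+n-r) rather than the full parameter count (m+n)k.

```python
import torch

m, n, k = 6, 5, 4  # targets are m x n matrices; factor width k overparameterizes

def model(theta):
    """Factorization model M(theta) = A @ B.T, with theta = (vec A, vec B)."""
    A = theta[: m * k].reshape(m, k)
    B = theta[m * k :].reshape(n, k)
    return (A @ B.T).reshape(-1)

torch.manual_seed(0)
for r in range(k + 1):  # targets of increasing rank r
    A = torch.cat([torch.randn(m, r), torch.zeros(m, k - r)], dim=1)
    B = torch.cat([torch.randn(n, r), torch.zeros(n, k - r)], dim=1)
    theta = torch.cat([A.reshape(-1), B.reshape(-1)])
    J = torch.autograd.functional.jacobian(model, theta)  # (m*n, (m+n)*k)
    print(f"rank-{r} target: model rank = {torch.linalg.matrix_rank(J).item()}, "
          f"r(m+n-r) = {r * (m + n - r)}, parameters = {(m + n) * k}")
```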
arXiv Detail & Related papers (2022-11-21T16:27:25Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent
Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Adaptive deep learning for nonlinear time series models [0.0]
We develop a theory for adaptive nonparametric estimation of the mean function of a non-stationary and nonlinear time series model using deep neural networks (DNNs).
We derive minimax lower bounds for estimating mean functions belonging to a wide class of nonlinear autoregressive (AR) models.
arXiv Detail & Related papers (2022-07-06T09:58:58Z) - Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy.
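As a generic illustration of that compression level, a minimal sketch using PyTorch's built-in pruning utilities (our own stand-in, not the Sparse Flows pipeline; the two-layer vector field f is hypothetical): it zeroes out 98% of each weight matrix by magnitude.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical stand-in for a neural ODE vector field dx/dt = f(x);
# the actual Sparse Flows models and pruning schedule are in the paper.
f = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))

# Unstructured L1 (magnitude) pruning: zero out 98% of each weight matrix.
for module in f.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.98)
        prune.remove(module, "weight")  # bake the mask into the tensor

total = sum(p.numel() for p in f.parameters())
zeros = sum((p == 0).sum().item() for p in f.parameters())
print(f"{zeros}/{total} parameters are now exactly zero")
```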
arXiv Detail & Related papers (2021-06-24T01:40:17Z) - A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
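For conjugate linear models this connection is exact, and a short sketch (our own, with illustrative hyperparameters) shows it: the sum of one-step-ahead predictive log densities accumulated while the posterior is updated one sample at a time, a natural "training speed" measure, coincides with the closed-form log marginal likelihood.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(0)
n, d, alpha, sigma2 = 40, 3, 1.0, 0.1  # samples, features, prior var, noise var
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + np.sqrt(sigma2) * rng.standard_normal(n)

# "Training speed" reading: accumulate one-step-ahead predictive log losses
# while the Gaussian posterior over the weights is updated sample by sample.
mu, Sigma = np.zeros(d), alpha * np.eye(d)
seq_logml = 0.0
for x, t in zip(X, y):
    pred_mean = x @ mu
    pred_var = x @ Sigma @ x + sigma2
    seq_logml += norm.logpdf(t, pred_mean, np.sqrt(pred_var))
    # Conjugate Gaussian update with the new observation.
    k = Sigma @ x / pred_var
    mu = mu + k * (t - pred_mean)
    Sigma = Sigma - np.outer(k, x @ Sigma)

# Closed-form log marginal likelihood of the same Bayesian linear model.
closed = multivariate_normal.logpdf(y, np.zeros(n), sigma2 * np.eye(n) + alpha * X @ X.T)
print(seq_logml, closed)  # the two quantities agree
```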
arXiv Detail & Related papers (2020-10-27T17:56:14Z) - Probabilistic Circuits for Variational Inference in Discrete Graphical
Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum-Product Networks (SPNs).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial, the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
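A toy sketch of such an adversarial estimator (our own construction; the instrumental-variable data-generating process, network widths, and quadratic penalty are illustrative assumptions rather than the paper's exact objective): the estimator f and an adversarial test function g are trained by alternating gradient steps on a regularized min-max objective.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n = 2000
z = torch.randn(n, 1)                  # instrument
x = z + 0.3 * torch.randn(n, 1)        # regressor driven by the instrument
y = 2.0 * x + 0.3 * torch.randn(n, 1)  # structural equation, true f*(x) = 2x

f = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))  # estimator
g = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))  # adversary
opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-3)

def objective():
    # g tries to expose correlation between the residual and the instrument;
    # the quadratic penalty keeps the adversary bounded.
    residual = y - f(x)
    return (g(z) * residual).mean() - 0.5 * (g(z) ** 2).mean()

for step in range(2000):
    opt_g.zero_grad(); (-objective()).backward(); opt_g.step()  # ascent on g
    opt_f.zero_grad(); objective().backward(); opt_f.step()     # descent on f

with torch.no_grad():
    probe = torch.tensor([[1.0], [-1.0]])
    print(f(probe).squeeze())  # should approach (2, -2), i.e. f*(x) = 2x
```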
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Topology optimization of 2D structures with nonlinearities using deep
learning [0.0]
Cloud computing has made it possible to search for optimal nonlinear structures.
We develop convolutional neural network models to predict optimized designs.
The developed models are capable of accurately predicting the optimized designs without requiring an iterative scheme.
arXiv Detail & Related papers (2020-01-31T12:36:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.