Measuring Model Complexity of Neural Networks with Curve Activation Functions
- URL: http://arxiv.org/abs/2006.08962v1
- Date: Tue, 16 Jun 2020 07:38:06 GMT
- Title: Measuring Model Complexity of Neural Networks with Curve Activation Functions
- Authors: Xia Hu, Weiqing Liu, Jiang Bian, Jian Pei
- Abstract summary: We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation functions.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L^1$ and $L^2$ regularizations suppress the increase of model complexity.
- Score: 100.98319505253797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is fundamental to measure model complexity of deep neural networks. The
existing literature on model complexity mainly focuses on neural networks with
piecewise linear activation functions. Model complexity of neural networks with
general curve activation functions remains an open problem. To tackle the
challenge, in this paper, we first propose the linear approximation neural
network (LANN for short), a piecewise linear framework to approximate a given
deep model with curve activation functions. LANN constructs an individual
piecewise linear approximation of the activation function of each neuron, and minimizes
the number of linear regions to satisfy a required approximation degree. Then,
we analyze the upper bound of the number of linear regions formed by LANNs, and
derive the complexity measure based on the upper bound. To examine the
usefulness of the complexity measure, we experimentally explore the training
process of neural networks and detect overfitting. Our results demonstrate that
the occurrence of overfitting is positively correlated with the increase of
model complexity during training. We find that the $L^1$ and $L^2$
regularizations suppress the increase of model complexity. Finally, we propose
two approaches to prevent overfitting by directly constraining model
complexity, namely neuron pruning and customized $L^1$ regularization.
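
To make the core LANN idea concrete, here is a minimal sketch of piecewise linear approximation of a curve activation function, assuming tanh, uniform knots, and a max-deviation tolerance as a stand-in for the paper's required approximation degree. The function names and tolerance are illustrative; the paper's actual construction places breakpoints adaptively to minimize the number of linear regions.

```python
# A minimal sketch, not the paper's LANN construction: uniform knots are used
# for brevity, whereas LANN chooses breakpoints adaptively per neuron.
import numpy as np

def piecewise_linear(f, lo, hi, n_segments):
    """Return knots and a callable piecewise linear interpolant of f."""
    knots = np.linspace(lo, hi, n_segments + 1)
    values = f(knots)
    return knots, lambda x: np.interp(x, knots, values)

def min_segments(f, lo, hi, tol, max_segments=4096):
    """Smallest segment count whose interpolant stays within tol of f
    (a stand-in for the paper's 'required approximation degree')."""
    grid = np.linspace(lo, hi, 10_000)
    target = f(grid)
    for k in range(1, max_segments + 1):
        _, approx = piecewise_linear(f, lo, hi, k)
        if np.max(np.abs(approx(grid) - target)) <= tol:
            return k
    raise ValueError("tolerance not reached within max_segments")

# Fewer segments needed -> 'flatter' neuron; the segment counts across
# neurons feed the upper bound on linear regions in the paper's measure.
k = min_segments(np.tanh, -4.0, 4.0, tol=1e-2)
print(f"tanh on [-4, 4] needs {k} linear segments at tolerance 1e-2")
```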
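The complexity-constraining regularization can be sketched similarly. The snippet below, assuming a small PyTorch MLP with tanh activations, applies a plain $L^1$ weight penalty; the paper's customized variant reweights each term by the neuron's contribution to the complexity measure, which the placeholder comment marks as an assumption.

```python
# A minimal sketch, assuming a toy regression setup; lam and the network
# shape are hypothetical values, not taken from the paper.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 1e-4  # regularization strength (hypothetical)

def l1_penalty(model):
    # Plain L1 over all weights; the paper's customized variant would scale
    # each neuron's term by how strongly it drives the complexity bound.
    return sum(p.abs().sum() for p in model.parameters())

x, y = torch.randn(64, 16), torch.randn(64, 1)
for _ in range(100):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y) + lam * l1_penalty(model)
    loss.backward()
    optimizer.step()
```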
Related papers
- On the Trade-off Between Efficiency and Precision of Neural Abstraction [62.046646433536104]
Neural abstractions have been recently introduced as formal approximations of complex, nonlinear dynamical models.
We employ formal inductive synthesis procedures to generate neural abstractions that result in dynamical models with the desired semantics.
arXiv Detail & Related papers (2023-07-28T13:22:32Z) - Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification [22.361338848134025]
We present a fully connected two-layer neural network with shifted ReLU activation that enables activated neuron identification in sublinear time via geometric search.
We also prove that our algorithm can converge in $O(M^2/\epsilon^2)$ time with network size quadratic in the coefficient norm upper bound $M$ and error term $\epsilon$.
arXiv Detail & Related papers (2023-07-13T05:33:44Z) - Provable Identifiability of Two-Layer ReLU Neural Networks via LASSO Regularization [15.517787031620864]
The territory of LASSO is extended to two-layer ReLU neural networks, a fashionable and powerful nonlinear regression model.
We show that the LASSO estimator can stably reconstruct the neural network and identify $\mathcal{S}^{\star}$ when the number of samples scales logarithmically.
Our theory lies in an extended Restricted Isometry Property (RIP)-based analysis framework for two-layer ReLU neural networks.
arXiv Detail & Related papers (2023-05-07T13:05:09Z) - Simultaneous approximation of a smooth function and its derivatives by deep neural networks with piecewise-polynomial activations [2.15145758970292]
We derive the required depth, width, and sparsity of a deep neural network to approximate any Hölder smooth function up to a given approximation error in Hölder norms.
The latter feature is essential to control generalization errors in many statistical and machine learning applications.
arXiv Detail & Related papers (2022-06-20T01:18:29Z) - Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z) - Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature [61.22680308681648]
We show that global convergence is statistically intractable even for a one-layer neural net bandit with a deterministic reward.
For both nonlinear bandit and RL, the paper presents a model-based algorithm, Virtual Ascent with Online Model Learner (ViOL).
arXiv Detail & Related papers (2021-02-08T12:41:56Z) - Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent (a minimal sketch appears after this list).
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
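
For the min-max formulation in the SEM entry above, here is a toy sketch: a scalar structural equation is fit by alternating gradient steps between an estimator network f and an adversarial test-function network g. The objective, network sizes, and step counts are illustrative assumptions, not the paper's exact estimator.

```python
# A toy sketch of adversarial (min-max) estimation; all names and the
# quadratic stabilizer term are assumptions, not the paper's construction.
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))  # min player
g = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))  # max player
opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-3)

x = torch.rand(256, 1) * 4 - 2
y = torch.sin(2 * x) + 0.1 * torch.randn_like(x)  # toy structural equation

def game_value():
    # Residual correlated with the test function, minus a quadratic term
    # that keeps the max player bounded.
    return ((y - f(x)) * g(x)).mean() - 0.5 * g(x).pow(2).mean()

for _ in range(200):
    opt_g.zero_grad()
    (-game_value()).backward()  # ascent step for g
    opt_g.step()
    opt_f.zero_grad()
    game_value().backward()     # descent step for f
    opt_f.step()
```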
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.