AMITE: A Novel Polynomial Expansion for Analyzing Neural Network
Nonlinearities
- URL: http://arxiv.org/abs/2007.06226v5
- Date: Wed, 24 Nov 2021 03:41:31 GMT
- Title: AMITE: A Novel Polynomial Expansion for Analyzing Neural Network
Nonlinearities
- Authors: Mauro J. Sanchirico III, Xun Jiao and C. Nataraj
- Abstract summary: Polynomial expansions are important in the analysis of neural network nonlinearities.
Existing approaches span classical Taylor and Chebyshev methods.
No existing approach provides a consistent method yielding an expansion with all of these properties.
- Score: 1.8761314918771685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Polynomial expansions are important in the analysis of neural network
nonlinearities. They have been applied to this end, addressing well-known
difficulties in verification, explainability, and security. Existing approaches
span classical Taylor and Chebyshev methods, asymptotics, and many numerical
approaches. We find that while these individually have useful properties such
as exact error formulas, adjustable domain, and robustness to undefined
derivatives, there are no approaches that provide a consistent method yielding
an expansion with all these properties. To address this, we develop an
analytically modified integral transform expansion (AMITE), a novel expansion
via integral transforms modified using derived criteria for convergence. We
show the general expansion and then demonstrate application for two popular
activation functions, hyperbolic tangent and rectified linear units. Compared
with existing expansions (i.e., Chebyshev, Taylor, and numerical) employed to
this end, AMITE is the first to provide six previously mutually exclusive
desired expansion properties such as exact formulas for the coefficients and
exact expansion errors (Table II). We demonstrate the effectiveness of AMITE in
two case studies. First, a multivariate polynomial form is efficiently
extracted from a single hidden layer black-box Multi-Layer Perceptron (MLP) to
facilitate equivalence testing from noisy stimulus-response pairs. Second, a
variety of Feed-Forward Neural Network (FFNN) architectures having between 3
and 7 layers are range bounded using Taylor models improved by the AMITE
polynomials and error formulas. AMITE presents a new dimension of expansion
methods suitable for analysis/approximation of nonlinearities in neural
networks, opening new directions and opportunities for the theoretical analysis
and systematic testing of neural networks.
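The first case study described above, extracting a multivariate polynomial form from a single hidden layer MLP, follows from the fact that replacing the activation by a polynomial makes the whole network an explicit polynomial in its inputs. The sketch below is a rough, hedged illustration of that idea only; it uses a Chebyshev fit of tanh as a stand-in for the analytically derived AMITE coefficients, and the network sizes, polynomial degree, and expansion radius are all assumptions, not values from the paper.

```python
# Illustrative sketch (not the AMITE expansion itself): a Chebyshev polynomial
# fit of tanh stands in for the paper's analytically derived coefficients.
# Composing the polynomial activation with the affine layers turns a single
# hidden layer MLP into an explicit multivariate polynomial surrogate.
import numpy as np
from numpy.polynomial import chebyshev as C

rng = np.random.default_rng(0)

# Toy black-box single hidden layer MLP: y = W2 @ tanh(W1 @ x + b1) + b2
n_in, n_hidden = 3, 8
W1, b1 = rng.normal(size=(n_hidden, n_in)), rng.normal(size=n_hidden)
W2, b2 = rng.normal(size=(1, n_hidden)), rng.normal(size=1)

def mlp(X):
    return (W2 @ np.tanh(W1 @ X.T + b1[:, None]) + b2[:, None]).T

# Stimulus points and the range of pre-activations they induce
X = rng.uniform(-1.0, 1.0, size=(1000, n_in))
Z = W1 @ X.T + b1[:, None]
r = 1.1 * np.abs(Z).max()          # expansion radius (adjustable domain)

# Degree-15 Chebyshev approximation of tanh on [-r, r]
xs = np.linspace(-r, r, 2001)
cheb = C.Chebyshev.fit(xs, np.tanh(xs), deg=15, domain=[-r, r])

def mlp_poly(X):
    # Polynomial activation composed with affine maps: the network is now a
    # multivariate polynomial in the inputs, evaluated here numerically.
    H = cheb(W1 @ X.T + b1[:, None])
    return (W2 @ H + b2[:, None]).T

err = np.max(np.abs(mlp(X) - mlp_poly(X)))
print(f"max deviation of polynomial surrogate on stimulus points: {err:.2e}")
```

In the paper's setting the expansion coefficients and error bounds come from the AMITE formulas rather than a least-squares fit, which is what enables exact error control; the sketch only shows how such an expansion plugs into the MLP.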
Related papers
- Feature learning in finite-width Bayesian deep linear networks with multiple outputs and convolutional layers [39.71511919246829]
Deep linear networks have been extensively studied, but little is known in the case of finite-width architectures with multiple outputs and convolutional layers.
Our work provides a dictionary that translates physics intuition and terminology into rigorous Bayesian statistics.
arXiv Detail & Related papers (2024-06-05T13:37:42Z)
- Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures [14.551812310439004]
We introduce an untrained forward model residual block within the model-based architecture to match the data consistency in the measurement domain for each instance.
Our approach offers a unified solution that is less parameter-sensitive, requires no additional data, and enables simultaneous fitting of the forward model and reconstruction in a single pass.
arXiv Detail & Related papers (2024-03-07T19:02:13Z)
- The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) training problems are non-convex.
In this paper we examine the use of convex neural network recovery models.
We show that the stationary points of the non-convex training objective can be characterized as global optima of subsampled convex programs.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
- Covering Number of Real Algebraic Varieties and Beyond: Improved Bounds and Applications [8.438718130535296]
We prove upper bounds on the covering number of sets in Euclidean space.
We show that our bounds improve the best known general bound due to Yomdin-Comte.
We illustrate the power of the result on three computational applications.
arXiv Detail & Related papers (2023-11-09T03:06:59Z)
- Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables.
We present a novel approach that addresses the limited posterior expressiveness of the fully-factorized Gaussian assumption.
We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z)
- Sparse Bayesian Learning for Complex-Valued Rational Approximations [0.03392423750246091]
Surrogate models are used to alleviate the computational burden in engineering tasks.
The underlying models show a strongly non-linear dependence on their input parameters.
We apply a sparse learning approach to the rational approximation.
arXiv Detail & Related papers (2022-06-06T12:06:13Z)
- A deep learning driven pseudospectral PCE based FFT homogenization algorithm for complex microstructures [68.8204255655161]
It is shown that the proposed method is able to predict central moments of interest while being orders of magnitude faster to evaluate than traditional approaches.
arXiv Detail & Related papers (2021-10-26T07:02:14Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game in which both players are parameterized by neural networks (NNs) and the parameters are learned using gradient descent; a generic illustration of this min-max formulation is sketched below.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
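As a hedged, toy illustration of the min-max idea summarized above (not the paper's SEM estimator or its specific operator equation), the following sketch trains two small neural networks in a zero-sum game by alternating gradient steps; the objective, architectures, and hyperparameters are assumptions chosen only for the demonstration.

```python
# Generic min-max (adversarial) estimation sketch: an estimator f and an
# adversarial test function g, both small neural networks, trained by
# alternating descent/ascent steps on a toy structural relation y = 2*x + noise.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(512, 1)
y = 2.0 * x + 0.1 * torch.randn(512, 1)

f = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))  # estimator
g = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))  # adversary
opt_f = torch.optim.Adam(f.parameters(), lr=1e-2)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-2)

def game_value():
    # g probes the residual y - f(x); the quadratic penalty keeps the inner
    # maximization over g well posed.
    residual = y - f(x)
    return (g(x) * residual).mean() - 0.5 * (g(x) ** 2).mean()

for step in range(2000):
    opt_g.zero_grad()
    (-game_value()).backward()   # ascent step for the adversary g
    opt_g.step()

    opt_f.zero_grad()
    game_value().backward()      # descent step for the estimator f
    opt_f.step()

# The learned f should roughly recover y = 2*x on the data range
with torch.no_grad():
    slope = (f(torch.ones(1, 1)) - f(torch.zeros(1, 1))).item()
print(f"approximate slope recovered by f: {slope:.2f}")
```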
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L1$ and $L2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
- Stochastic spectral embedding [0.0]
We propose a novel sequential adaptive surrogate modeling method based on "stochastic spectral embedding" (SSE).
We show how the method compares favorably against state-of-the-art sparse chaos expansions on a set of models with different complexity and input dimension.
arXiv Detail & Related papers (2020-04-09T11:00:07Z)
- Convex Geometry and Duality of Over-parameterized Neural Networks [70.15611146583068]
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
arXiv Detail & Related papers (2020-02-25T23:05:33Z)