On the existence of optimal shallow feedforward networks with ReLU
activation
- URL: http://arxiv.org/abs/2303.03950v1
- Date: Mon, 6 Mar 2023 13:35:46 GMT
- Title: On the existence of optimal shallow feedforward networks with ReLU
activation
- Authors: Steffen Dereich and Sebastian Kassing
- Abstract summary: We prove existence of global minima in the loss landscape for the approximation of continuous target functions using shallow feedforward artificial neural networks with ReLU activation.
We propose a kind of closure of the search space so that in the extended space minimizers exist.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We prove existence of global minima in the loss landscape for the
approximation of continuous target functions using shallow feedforward
artificial neural networks with ReLU activation. This property is one of the
fundamental artifacts separating ReLU from other commonly used activation
functions. We propose a kind of closure of the search space so that in the
extended space minimizers exist. In a second step, we show under mild
assumptions that the newly added functions in the extension perform worse than
appropriate representable ReLU networks. This then implies that the optimal
response in the extended target space is indeed the response of a ReLU network.
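To make the setting concrete, the following is a minimal sketch (not code from the paper) of a shallow feedforward ReLU network with one hidden layer and the squared-error loss it incurs against a continuous target function on [0, 1]; the width, the target sin(2πx), and all names are illustrative assumptions.

```python
# Illustrative sketch only (not from the paper): the empirical squared loss of a
# shallow (one-hidden-layer) ReLU network approximating a continuous target on [0, 1].
import numpy as np

def shallow_relu_net(x, W, b, c, d):
    """Response of a shallow ReLU network: x -> c^T ReLU(W x + b) + d."""
    hidden = np.maximum(W @ x + b, 0.0)        # ReLU activation in the hidden layer
    return c @ hidden + d

def squared_loss(params, xs, target):
    """Mean squared error of the network response against the target on the points xs."""
    W, b, c, d = params
    preds = np.array([shallow_relu_net(np.atleast_1d(x), W, b, c, d) for x in xs])
    return np.mean((preds - target(xs)) ** 2)

rng = np.random.default_rng(0)
width = 16                                     # number of hidden ReLU neurons (arbitrary)
W = rng.normal(size=(width, 1))
b = rng.normal(size=width)
c = rng.normal(size=width)
d = 0.0

xs = np.linspace(0.0, 1.0, 200)                # grid on the input domain [0, 1]
target = lambda x: np.sin(2 * np.pi * x)       # a continuous target function (illustrative)

print("loss at random parameters:", squared_loss((W, b, c, d), xs, target))
```

The paper's question is whether the infimum of such a loss over all parameter choices is attained; the sketch only evaluates the loss at one random parameter vector.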
Related papers
- ReCA: A Parametric ReLU Composite Activation Function [0.0]
Activation functions have been shown to affect the performance of deep neural networks significantly.
We propose a novel parametric activation function, ReCA, which has been shown to outperform all baselines on state-of-the-art datasets.
arXiv Detail & Related papers (2025-04-11T22:05:57Z) - Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization [3.3998740964877463]
"Local linear recovery" (LLR) is a weaker form of target function recovery.
We prove that functions expressible by narrower DNNs are guaranteed to be recoverable from fewer samples than model parameters.
arXiv Detail & Related papers (2024-06-26T03:08:24Z) - Large Deviations of Gaussian Neural Networks with ReLU activation [0.0]
We prove a large deviation principle for deep neural networks with Gaussian weights and at most linearly growing activation functions, such as ReLU.
We simplify previous expressions for the rate function and provide a power-series expansion for the ReLU case.
arXiv Detail & Related papers (2024-05-27T08:53:24Z) - Approximation Error and Complexity Bounds for ReLU Networks on Low-Regular Function Spaces [0.0]
We consider the approximation of a large class of bounded functions, with minimal regularity assumptions, by ReLU neural networks.
We show that the approximation error can be bounded from above by a quantity proportional to the uniform norm of the target function.
arXiv Detail & Related papers (2024-05-10T14:31:58Z) - Generalization of Scaled Deep ResNets in the Mean-Field Regime [55.77054255101667]
We investigate scaled ResNet in the limit of infinitely deep and wide neural networks.
Our results offer new insights into the generalization ability of deep ResNet beyond the lazy training regime.
arXiv Detail & Related papers (2024-03-14T21:48:00Z) - Expressivity and Approximation Properties of Deep Neural Networks with
ReLU$^k$ Activation [2.3020018305241337]
We investigate the expressivity and approximation properties of deep networks employing the ReLU$^k$ activation function for $k \geq 2$.
Although deep ReLU$^k$ networks can approximate effectively, they also have the capability to represent higher-degree polynomials precisely.
arXiv Detail & Related papers (2023-12-27T09:11:14Z) - Reverse Engineering Deep ReLU Networks: An Optimization-based Algorithm [0.0]
We present a novel method for reconstructing deep ReLU networks by leveraging convex optimization techniques and a sampling-based approach.
Our research contributes to the growing body of work on reverse engineering deep ReLU networks and paves the way for new advancements in neural network interpretability and security.
arXiv Detail & Related papers (2023-12-07T20:15:06Z) - Generalized Activation via Multivariate Projection [46.837481855573145]
Activation functions are essential to introduce nonlinearity into neural networks.
We consider ReLU as the projection from $\mathbb{R}$ onto the nonnegative half-line $\mathbb{R}_+$.
We extend ReLU by substituting it with a generalized projection operator onto a convex cone, such as the Second-Order Cone (SOC) projection; a brief sketch contrasting the two projections appears after this list.
arXiv Detail & Related papers (2023-09-29T12:44:27Z) - The Implicit Bias of Minima Stability in Multivariate Shallow ReLU
Networks [53.95175206863992]
We study the type of solutions to which gradient descent converges when used to train a single hidden-layer multivariate ReLU network with the quadratic loss.
We prove that although shallow ReLU networks are universal approximators, stable shallow networks are not.
arXiv Detail & Related papers (2023-06-30T09:17:39Z) - Optimal Sets and Solution Paths of ReLU Networks [56.40911684005949]
We develop an analytical framework to characterize the set of optimal ReLU networks.
We establish conditions for the regularization path of ReLU networks to be continuous, and develop sensitivity results for ReLU networks.
arXiv Detail & Related papers (2023-05-31T18:48:16Z) - On the existence of minimizers in shallow residual ReLU neural network optimization landscapes [3.6185342807265415]
We show existence of minimizers in the loss landscape for residual artificial neural networks (ANNs) with multi-dimensional input layer and one hidden layer with ReLU activation.
arXiv Detail & Related papers (2023-02-28T16:01:38Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression.
The goal is to output the best fitting ReLU with respect to square loss given draws from some unknown distribution.
arXiv Detail & Related papers (2020-05-26T16:26:17Z)
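Regarding the projection view of ReLU in the entry "Generalized Activation via Multivariate Projection" above, here is a minimal sketch (not taken from that paper) contrasting ReLU, i.e. the Euclidean projection onto the nonnegative half-line, with the standard Euclidean projection onto a second-order cone; the function names and example inputs are illustrative assumptions.

```python
# Illustrative sketch (not from the cited paper): ReLU as the projection onto the
# nonnegative half-line, and the Euclidean projection onto the second-order cone
# K = {(x, t) : ||x||_2 <= t} as one multivariate generalization.
import numpy as np

def relu(z):
    """Projection of z onto the nonnegative half-line R_+, i.e. max(z, 0)."""
    return np.maximum(z, 0.0)

def project_soc(x, t):
    """Euclidean projection of the point (x, t) onto the cone {(x, t) : ||x||_2 <= t}."""
    norm_x = np.linalg.norm(x)
    if norm_x <= t:                      # already inside the cone
        return x, t
    if norm_x <= -t:                     # inside the polar cone: project to the apex
        return np.zeros_like(x), 0.0
    scale = (norm_x + t) / (2.0 * norm_x)
    return scale * x, scale * norm_x     # closest point on the boundary of the cone

print(relu(np.array([-1.5, 0.3])))               # [0.  0.3]
print(project_soc(np.array([3.0, 4.0]), 1.0))    # (array([1.8, 2.4]), 3.0)
```

The nonnegative half-line is itself a convex cone, so elementwise ReLU is the special case where the cone is the nonnegative orthant; the second-order cone is one alternative choice of cone.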
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.