Functional dimension of feedforward ReLU neural networks
- URL: http://arxiv.org/abs/2209.04036v1
- Date: Thu, 8 Sep 2022 21:30:16 GMT
- Title: Functional dimension of feedforward ReLU neural networks
- Authors: J. Elisenda Grigsby, Kathryn Lindsey, Robert Meyerhoff, Chenxi Wu
- Abstract summary: We show that functional dimension is inhomogeneous across the parameter space of ReLU neural network functions.
We also study the quotient space and fibers of the realization map from parameter space to function space.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well-known that the parameterized family of functions representable by
fully-connected feedforward neural networks with ReLU activation function is
precisely the class of piecewise linear functions with finitely many pieces. It
is less well-known that for every fixed architecture of ReLU neural network,
the parameter space admits positive-dimensional spaces of symmetries, and hence
the local functional dimension near any given parameter is lower than the
parametric dimension. In this work we carefully define the notion of functional
dimension, show that it is inhomogeneous across the parameter space of ReLU
neural network functions, and continue an investigation - initiated in [14] and
[5] - into when the functional dimension achieves its theoretical maximum. We
also study the quotient space and fibers of the realization map from parameter
space to function space, supplying examples of fibers that are disconnected,
fibers upon which functional dimension is non-constant, and fibers upon which
the symmetry group acts non-transitively.
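As a concrete illustration of these symmetries, note that scaling the incoming weights and bias of a hidden ReLU neuron by c > 0 while scaling its outgoing weights by 1/c leaves the realized function unchanged. The sketch below, assuming PyTorch, estimates a batch analogue of the local functional dimension as the rank of the Jacobian of the realization map (flat parameter vector to outputs on a fixed finite batch of inputs); the architecture, batch size, and rank tolerance are arbitrary illustrative choices, not constructions from the paper.

```python
# Illustrative sketch only (not code from the paper): numerically estimate a
# batch analogue of the local functional dimension of a small fully-connected
# ReLU network as the rank of the Jacobian of the realization map
# (parameter vector -> outputs on a fixed finite batch of inputs).
# The architecture (2 -> 4 -> 4 -> 1), batch size, and rank tolerance are
# arbitrary choices made here for illustration.
import torch

torch.manual_seed(0)

# Layer shapes (weights and biases), packed into a single flat parameter vector.
shapes = [(4, 2), (4,), (4, 4), (4,), (1, 4), (1,)]
n_params = sum(torch.Size(s).numel() for s in shapes)  # parametric dimension (37 here)

theta0 = torch.randn(n_params)   # a generic parameter
x = torch.randn(64, 2)           # a fixed finite batch of inputs

def unpack(theta):
    tensors, i = [], 0
    for s in shapes:
        n = torch.Size(s).numel()
        tensors.append(theta[i:i + n].reshape(s))
        i += n
    return tensors

def realization(theta):
    """Outputs of the ReLU network on the batch x, as a function of theta."""
    W1, b1, W2, b2, W3, b3 = unpack(theta)
    h = torch.relu(x @ W1.T + b1)
    h = torch.relu(h @ W2.T + b2)
    return (h @ W3.T + b3).flatten()

# Jacobian of the realization map at theta0, shape (batch_size, n_params).
J = torch.autograd.functional.jacobian(realization, theta0)
batch_rank = torch.linalg.matrix_rank(J, atol=1e-6).item()

print(f"parametric dimension: {n_params}")
print(f"batch Jacobian rank : {batch_rank}")
# The rank falls short of n_params: scaling the incoming weights and bias of
# any hidden ReLU neuron by c > 0 and its outgoing weights by 1/c leaves the
# realized function unchanged, so each hidden neuron contributes at least one
# parameter direction along which the function does not move.
```

At a generic parameter the computed rank comes out strictly below the parametric dimension, consistent with the positive-dimensional symmetry groups described in the abstract; re-running the sketch at different choices of theta0 shows that this rank can vary from parameter to parameter, mirroring the inhomogeneity of functional dimension that the paper studies.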
Related papers
- On Functional Dimension and Persistent Pseudodimension [0.0]
We discuss two locally applicable complexity measures for ReLU network classes and what we know about the relationship between them.
The former is easy to compute on finite batches of points; the latter should give local bounds on the gap, which would inform an understanding of the mechanics of the double descent phenomenon.
arXiv Detail & Related papers (2024-10-22T17:12:21Z)
- Geometry-induced Implicit Regularization in Deep ReLU Neural Networks [0.0]
Implicit regularization phenomena, which are still not well understood, occur during optimization.
We study the geometry of the output set as parameters vary.
We prove that the batch functional dimension is almost surely determined by the activation patterns in the hidden layers.
arXiv Detail & Related papers (2024-02-13T07:49:57Z)
- Piecewise Linear Functions Representable with Infinite Width Shallow ReLU Neural Networks [0.0]
We prove a conjecture of Ongie et al. that every continuous piecewise linear function expressible with this kind of infinite width neural network is expressible as a finite width shallow ReLU neural network.
arXiv Detail & Related papers (2023-07-25T15:38:18Z)
- Deep neural network approximation of composite functions without the curse of dimensionality [0.0]
In this article we identify a class of high-dimensional continuous functions that can be approximated by deep neural networks (DNNs).
The functions in our class can be expressed as compositions of a potentially unbounded number of special functions, which include products, maxima, and certain parallelized Lipschitz continuous functions.
arXiv Detail & Related papers (2023-04-12T12:08:59Z)
- Exploring Linear Feature Disentanglement For Neural Networks [63.20827189693117]
Non-linear activation functions, e.g., Sigmoid, ReLU, and Tanh, have achieved great success in neural networks (NNs).
Because samples have complex non-linear characteristics, the role of these activation functions is to project samples from their original feature space into a linearly separable feature space.
This motivates us to explore whether all features need to be transformed by all non-linear functions in current typical NNs.
arXiv Detail & Related papers (2022-03-22T13:09:17Z)
- Geometry of Linear Convolutional Networks [7.990816079551592]
We study the family of functions represented by a linear convolutional neural network (LCN).
We study the optimization of an objective function over an LCN, analyzing critical points in function space and in gradient space.
Overall, our theory predicts that the optimized parameters of an LCN will often correspond to repeated filters across layers.
arXiv Detail & Related papers (2021-08-03T14:42:18Z)
- Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z)
- A Functional Perspective on Learning Symmetric Functions with Neural Networks [48.80300074254758]
We study the learning and representation of neural networks defined on measures.
We establish approximation and generalization bounds under different choices of regularization.
The resulting models can be learned efficiently and enjoy generalization guarantees that extend across input sizes.
arXiv Detail & Related papers (2020-08-16T16:34:33Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game in which both players are parameterized by neural networks (NNs), and we learn the parameters of these networks using gradient descent.
We provide the first tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Space of Functions Computed by Deep-Layered Machines [74.13735716675987]
We study the space of functions computed by random-layered machines, including deep neural networks and Boolean circuits.
Investigating the distribution of Boolean functions computed by recurrent and layer-dependent architectures, we find that it is the same in both models.
arXiv Detail & Related papers (2020-04-19T18:31:03Z)
- Invariant Feature Coding using Tensor Product Representation [75.62232699377877]
We prove that the group-invariant feature vector contains sufficient discriminative information when learning a linear classifier.
A novel feature model that explicitly considers group actions is proposed for principal component analysis and k-means clustering.
arXiv Detail & Related papers (2019-06-05T07:15:17Z)