Constraining the outputs of ReLU neural networks
- URL: http://arxiv.org/abs/2508.03867v1
- Date: Tue, 05 Aug 2025 19:30:11 GMT
- Title: Constraining the outputs of ReLU neural networks
- Authors: Yulia Alexandr, Guido Montúfar
- Abstract summary: We introduce a class of algebraic varieties naturally associated with ReLU neural networks. By analyzing the rank constraints on the network outputs within each activation region, we derive polynomial equations that characterize the functions representable by the network.
- Score: 13.645092880691188
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a class of algebraic varieties naturally associated with ReLU neural networks, arising from the piecewise linear structure of their outputs across activation regions in input space, and the piecewise multilinear structure in parameter space. By analyzing the rank constraints on the network outputs within each activation region, we derive polynomial equations that characterize the functions representable by the network. We further investigate conditions under which these varieties attain their expected dimension, providing insight into the expressive and structural properties of ReLU networks.
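To make the activation-region picture concrete, here is a minimal numpy sketch (the toy network sizes are our own choice, not the paper's) showing that a ReLU network restricts to an affine map on each activation region, and that the rank of the local Jacobian, the quantity behind the paper's polynomial constraints, can be read off directly:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy two-layer ReLU network R^2 -> R^3 (hypothetical sizes for illustration).
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)
W2, b2 = rng.normal(size=(3, 4)), rng.normal(size=3)

def forward(x):
    h = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ h + b2

def activation_pattern(x):
    # Sign pattern of the hidden pre-activations; constant on each activation region.
    return tuple((W1 @ x + b1 > 0).astype(int))

def local_affine_map(x):
    # On the activation region of x, the network equals A x + c for these A, c.
    d = np.diag(np.array(activation_pattern(x), dtype=float))
    A = W2 @ d @ W1
    c = W2 @ d @ b1 + b2
    return A, c

x = rng.normal(size=2)
A, c = local_affine_map(x)
assert np.allclose(forward(x), A @ x + c)
# rank(A) is at most min(input dim, number of active neurons, output dim);
# rank constraints of this kind are the source of the abstract's polynomial equations.
print("pattern:", activation_pattern(x), "rank of local Jacobian:", np.linalg.matrix_rank(A))
```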
Related papers
- Regional, Lattice and Logical Representations of Neural Networks [0.5279873919047532]
We present an algorithm that translates feedforward neural networks with ReLU activation functions in the hidden layers and truncated identity activation functions in the output layer into regional representations. We also empirically investigate the complexity of the regional representations output by our method for neural networks of varying sizes.
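A minimal sketch of what a regional representation contains, assuming a toy one-hidden-layer network (the paper's actual algorithm is more general and handles truncated identity output layers):

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy one-hidden-layer ReLU network R^2 -> R (sizes are illustrative assumptions).
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
w2, b2 = rng.normal(size=3), 0.5

def pattern(x):
    return tuple((W1 @ x + b1 > 0).astype(int))

# Empirically enumerate the regions hit by random samples and record the
# affine map (slope, intercept) representing the network on each region.
regions = {}
for x in rng.uniform(-3, 3, size=(2000, 2)):
    p = pattern(x)
    if p not in regions:
        d = np.array(p, dtype=float)
        regions[p] = ((w2 * d) @ W1, w2 @ (d * b1) + b2)

for p, (a, c) in sorted(regions.items()):
    print(p, "f(x) =", np.round(a, 2), "· x +", round(float(c), 2))
```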
arXiv Detail & Related papers (2025-06-06T07:58:09Z) - On the Local Complexity of Linear Regions in Deep ReLU Networks [0.0]
We show theoretically that ReLU networks that learn low-dimensional feature representations have a lower local complexity. In particular, we show that the local complexity serves as an upper bound on the total variation of the function over the input data distribution.
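As a rough illustration of local complexity (the paper's definition is more careful), one can count the distinct activation patterns, hence linear pieces, appearing in a small neighborhood of a point; the helper below and its network sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(16, 2)), rng.normal(size=16)  # illustrative one-layer net

def local_complexity(x, radius=0.1, n=500):
    # Proxy for local complexity: number of distinct activation patterns
    # (hence linear pieces) found in a small ball around x.
    pts = x + radius * rng.normal(size=(n, 2))
    patterns = {tuple((W1 @ p + b1 > 0).astype(int)) for p in pts}
    return len(patterns)

print(local_complexity(np.zeros(2)))      # several pieces near the origin...
print(local_complexity(10 * np.ones(2)))  # ...typically one, far from all hyperplanes
```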
arXiv Detail & Related papers (2024-12-24T08:42:39Z) - The Evolution of the Interplay Between Input Distributions and Linear Regions in Networks [20.97553518108504]
We count the number of convex linear regions in deep neural networks with ReLU activations.
In particular, we prove that, for any one-dimensional input, there is a minimum number of neurons required to express it.
We also unveil the iterative refinement process of decision boundaries in ReLU networks during training.
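A quick empirical version of region counting along a one-dimensional input, tracking activation-pattern changes on a sampled segment (a finite-sample sketch with made-up layer sizes, not the paper's exact count):

```python
import numpy as np

rng = np.random.default_rng(3)
W1, b1 = rng.normal(size=(8, 1)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 8)), rng.normal(size=8)

def pattern(t):
    # Joint activation pattern of both layers at input t.
    h1 = np.maximum(W1 @ np.array([t]) + b1, 0.0)
    h2 = W2 @ h1 + b2
    return tuple((h1 > 0).astype(int)) + tuple((h2 > 0).astype(int))

ts = np.linspace(-3, 3, 5000)
changes = sum(pattern(a) != pattern(b) for a, b in zip(ts, ts[1:]))
print("linear regions along the segment ~", changes + 1)
```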
arXiv Detail & Related papers (2023-10-28T15:04:53Z) - The Geometric Structure of Fully-Connected ReLU Layers [0.0]
We formalize and interpret the geometric structure of $d$-dimensional fully connected ReLU layers in neural networks.
We provide results on the geometric complexity of the decision boundary generated by such networks, and prove that, modulo an affine transformation, such a network can only generate $d$ different decision boundaries.
arXiv Detail & Related papers (2023-10-05T11:54:07Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
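A hedged sketch of the indicator construction in the simplest case (our own reading, not the paper's general simplicial-complex argument): four ReLUs form a trapezoid bump per coordinate, and a second ReLU layer combines them into an approximate indicator of a box, giving three layers in total:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def bump(t, lo, hi, eps=0.05):
    # One-dimensional trapezoid built from four ReLUs: ~1 on [lo, hi], 0 outside.
    return (relu(t - lo + eps) - relu(t - lo) - relu(t - hi) + relu(t - hi - eps)) / eps

def box_indicator(x, y):
    # The second ReLU layer turns the sum of per-coordinate bumps into an AND,
    # giving ~1 on the box [0,1] x [0,1] and ~0 elsewhere.
    return relu(bump(x, 0.0, 1.0) + bump(y, 0.0, 1.0) - 1.0)

print(box_indicator(0.5, 0.5))   # ~1.0 inside the box
print(box_indicator(2.0, 0.5))   # 0.0 outside
```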
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms [64.3064050603721]
We generalize the Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields similar iterations to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
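A minimal sketch of the Runge-Kutta-style recurrence (our own simplified reading, with the Butcher-tableau entries `A` and `b` playing the role of the trainable weights); with the classical RK4 tableau it reproduces the standard integrator:

```python
import numpy as np

def rk_superstructure(f, y0, h, A, b):
    # One step of a Runge-Kutta-like recurrence: stage couplings A and output
    # weights b are the parameters that an R2N2-style model would train.
    k = []
    for i in range(len(b)):
        y_stage = y0 + h * sum(A[i][j] * k[j] for j in range(i))
        k.append(f(y_stage))
    return y0 + h * sum(bi * ki for bi, ki in zip(b, k))

# With the classical RK4 tableau the recurrence is exactly the RK4 integrator.
A = [[0, 0, 0, 0], [0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 1, 0]]
b = [1/6, 1/3, 1/3, 1/6]
y = 1.0
for _ in range(100):          # integrate y' = -y from y(0) = 1 to t = 1
    y = rk_superstructure(lambda u: -u, y, 0.01, A, b)
print(y, "vs exact", np.exp(-1.0))
```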
arXiv Detail & Related papers (2022-11-22T16:30:33Z) - Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of their Jacobians, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
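To illustrate invertibility by design, here is a toy affine coupling layer, one common INN building block (the scale and shift maps `s` and `t` below are hypothetical stand-ins for small neural networks):

```python
import numpy as np

def coupling_forward(x, s, t):
    # Affine coupling layer: invertible by construction, with a triangular
    # Jacobian whose log-determinant is simply sum(s(x1)).
    x1, x2 = x[:1], x[1:]
    return np.concatenate([x1, x2 * np.exp(s(x1)) + t(x1)])

def coupling_inverse(y, s, t):
    y1, y2 = y[:1], y[1:]
    return np.concatenate([y1, (y2 - t(y1)) * np.exp(-s(y1))])

s = lambda u: np.tanh(u)   # hypothetical scale network
t = lambda u: 0.5 * u      # hypothetical shift network

x = np.array([0.3, -1.2])
y = coupling_forward(x, s, t)
assert np.allclose(coupling_inverse(y, s, t), x)  # exact inverse by construction
print("log|det J| =", float(np.sum(s(x[:1]))))
```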
arXiv Detail & Related papers (2022-04-15T10:45:26Z) - Towards Understanding Theoretical Advantages of Complex-Reaction
Networks [77.34726150561087]
We show that a class of functions can be approximated by a complex-reaction network using a polynomial number of parameters.
For empirical risk minimization, our theoretical result shows that the critical point set of complex-reaction networks is a proper subset of that of real-valued networks.
arXiv Detail & Related papers (2021-08-15T10:13:49Z) - Locally Linear Attributes of ReLU Neural Networks [2.218917829443032]
A ReLU neural network determines a continuous piecewise linear map from an input space to an output space.
The weights in the neural network determine a decomposition of the input space into convex polytopes.
On each of these polytopes the network can be described by a single affine mapping.
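A small sketch of the polytope decomposition, assuming a one-hidden-layer network: each neuron's on/off state contributes one half-space, and their intersection is the convex polytope containing a given input:

```python
import numpy as np

rng = np.random.default_rng(4)
W1, b1 = rng.normal(size=(5, 2)), rng.normal(size=5)  # illustrative sizes

def polytope_of(x):
    # H-representation {z : C z <= d} of the activation polytope containing x:
    # each hidden neuron contributes one half-space, oriented by its on/off state.
    signs = np.where(W1 @ x + b1 > 0, -1.0, 1.0)  # on-neuron: -(w·z + b) <= 0
    C = signs[:, None] * W1
    d = -signs * b1
    return C, d

x = rng.normal(size=2)
C, d = polytope_of(x)
assert np.all(C @ x <= d)  # x satisfies its own region's inequalities
print(C.round(2), d.round(2))
```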
arXiv Detail & Related papers (2020-11-30T19:31:23Z) - Stability of Algebraic Neural Networks to Small Perturbations [179.55535781816343]
Algebraic neural networks (AlgNNs) are composed of a cascade of layers, each associated with an algebraic signal model.
We show how any architecture that uses a formal notion of convolution can be stable beyond particular choices of the shift operator.
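A minimal sketch of an algebraic convolution, written as a polynomial in a shift operator S (here the adjacency matrix of a 4-cycle, our own toy choice; the paper's framework covers general algebraic signal models):

```python
import numpy as np

# Shift operator: adjacency matrix of a 4-cycle graph.
S = np.roll(np.eye(4), 1, axis=1) + np.roll(np.eye(4), -1, axis=1)
h = [0.5, 0.3, 0.1]  # filter taps

def algebraic_conv(x, h, S):
    # sum_k h_k S^k x  -- a convolution defined purely through the shift operator.
    y, Sk = np.zeros_like(x), np.eye(len(x))
    for hk in h:
        y = y + hk * (Sk @ x)
        Sk = Sk @ S
    return y

x = np.array([1.0, 0.0, 0.0, 0.0])
print(algebraic_conv(x, h, S))
```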
arXiv Detail & Related papers (2020-10-22T09:10:16Z) - Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
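To illustrate the min-max optimization pattern, here is a toy gradient descent-ascent loop on a scalar saddle objective (the paper parameterizes both players with neural networks; this simplified example is ours):

```python
# Gradient descent-ascent on L(theta, w) = theta*w - 0.5*w**2.
# theta is the primal estimator, w the adversarial critic; the saddle is at (0, 0).
theta, w = 2.0, -1.0
lr = 0.05
for _ in range(500):
    g_theta, g_w = w, theta - w   # dL/dtheta and dL/dw
    theta -= lr * g_theta         # descent step for the estimator
    w += lr * g_w                 # ascent step for the adversary
print(theta, w)                   # both converge toward the saddle point 0
```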
arXiv Detail & Related papers (2020-07-02T17:55:47Z)