Combinatorial Regularity for Relatively Perfect Discrete Morse Gradient Vector Fields of ReLU Neural Networks
- URL: http://arxiv.org/abs/2412.18005v2
- Date: Wed, 22 Jan 2025 23:27:18 GMT
- Title: Combinatorial Regularity for Relatively Perfect Discrete Morse Gradient Vector Fields of ReLU Neural Networks
- Authors: Robyn Brooks, Marissa Masden
- Abstract summary: ReLU neural networks induce a piecewise linear decomposition of their input space called the canonical polyhedral complex.
It has previously been established that it is decidable whether a ReLU neural network is piecewise linear Morse.
- Abstract: One common function class in machine learning is the class of ReLU neural networks. ReLU neural networks induce a piecewise linear decomposition of their input space called the canonical polyhedral complex. It has previously been established that it is decidable whether a ReLU neural network is piecewise linear Morse. In order to expand computational tools for analyzing the topological properties of ReLU neural networks, and to harness the strengths of discrete Morse theory, we introduce a schematic for translating between a given piecewise linear Morse function (e.g., the parameters of a ReLU neural network) on a canonical polyhedral complex and a compatible ("relatively perfect") discrete Morse function on the same complex. Our approach is constructive, producing an algorithm that can be used to determine whether a given vertex in a canonical polyhedral complex corresponds to a piecewise linear Morse critical point. Furthermore, we provide an algorithm for constructing a consistent discrete Morse pairing on cells in the canonical polyhedral complex which contain this vertex. We additionally provide some new realizability results with respect to sublevel set topology in the case of shallow ReLU neural networks.
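The abstract does not spell out the pairing step, so the following is only a hedged illustration of the general flavor: a greedy facet/cofacet matching restricted to the lower star of a vertex of a PL function, in the spirit of lower-star constructions from discrete Morse theory. The cell encoding, function names, and greedy rule are assumptions for illustration, not the authors' algorithm, and the sketch omits the acyclicity bookkeeping a full discrete gradient requires.

```python
# Hedged sketch, NOT the paper's algorithm: a greedy facet/cofacet matching on the
# lower star of a vertex of a PL function. Cells are frozensets of vertex ids and
# f maps vertex id -> PL function value at that vertex.

def lower_star(v, cells, f):
    """Cells containing v whose maximum vertex value is attained at v."""
    return [c for c in cells if v in c and max(f[u] for u in c) == f[v]]

def lower_star_pairing(v, cells, f):
    """Greedily pair each unmatched cell with an unmatched cofacet in the lower star.

    Returns (pairs, unmatched); unmatched cells play the role of discrete critical
    cells associated with the vertex v.
    """
    star = sorted((tuple(sorted(c)) for c in lower_star(v, cells, f)), key=len)
    unmatched, pairs = set(star), []
    for c in star:                                   # low-dimensional cells first
        if c not in unmatched:
            continue
        cofacets = [d for d in star
                    if len(d) == len(c) + 1 and set(c) < set(d) and d in unmatched]
        if cofacets:
            d = min(cofacets)                        # deterministic tie-break
            pairs.append((c, d))
            unmatched -= {c, d}
    return pairs, unmatched

# Tiny example: one triangle whose PL maximum sits at vertex 0.
cells = [frozenset(s) for s in [{0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}]]
f = {0: 1.0, 1: 0.5, 2: 0.3}
print(lower_star_pairing(0, cells, f))
```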
Related papers
- Polyhedral Complex Extraction from ReLU Networks using Edge Subdivision [0.0]
A neural network consists of piecewise affine building blocks, such as fully-connected layers and ReLU activations, and therefore decomposes its input space into a polyhedral complex.
This complex has been previously studied to characterize theoretical properties of neural networks.
We propose to subdivide the regions via intersections with hyperplanes induced by each neuron.
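A hedged sketch of that subdivision step: split an edge of the current complex wherever one neuron's affine pre-activation changes sign along it. The function `subdivide_edge` and its arguments are illustrative assumptions, not the paper's API.

```python
# Hedged sketch of the edge-subdivision idea: split an edge wherever one neuron's
# pre-activation g(x) = w.x + b changes sign. Not the paper's implementation.

import numpy as np

def subdivide_edge(p, q, w, b):
    """Return the edge's vertices after inserting the zero crossing of w.x + b, if any."""
    gp, gq = w @ p + b, w @ q + b
    if gp * gq >= 0:                 # no strict sign change: leave the edge alone
        return [p, q]
    t = gp / (gp - gq)               # parameter of the zero crossing on the segment p->q
    return [p, p + t * (q - p), q]

# Example: the hyperplane x0 + x1 - 1 = 0 cutting the edge from (0,0) to (1,1).
p, q = np.array([0.0, 0.0]), np.array([1.0, 1.0])
print(subdivide_edge(p, q, np.array([1.0, 1.0]), -1.0))   # crossing at (0.5, 0.5)
```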
arXiv Detail & Related papers (2023-06-12T16:17:04Z)
- A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms [64.3064050603721]
We generalize the Runge-Kutta neural network to a recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields similar iterations to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
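A minimal sketch of the kind of Runge-Kutta-style recurrence being generalized, under the assumption that inner stages evaluate a residual at combinations of earlier stage outputs and an outer weighted sum updates the iterate; the tableau-style parameters `A` and `b` stand in for the learned weights and are not the paper's architecture.

```python
# Hedged sketch of a Runge-Kutta-like recurrent iteration; illustrative only.

import numpy as np

def r2n2_step(x, residual, A, b, h=1.0):
    """One outer iteration with s inner stages (A: s x s strictly lower-triangular, b: s)."""
    s = len(b)
    k = []
    for i in range(s):
        shift = sum(A[i, j] * k[j] for j in range(i))   # combine earlier stage outputs
        k.append(residual(x + h * shift))
    return x + h * sum(b[i] * k[i] for i in range(s))

# With A, b fixed to the classical RK4 tableau and residual = the right-hand side of the
# autonomous ODE dx/dt = -x, one call reproduces an RK4 step.
A = np.array([[0, 0, 0, 0], [0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 1, 0]], float)
b = np.array([1 / 6, 1 / 3, 1 / 3, 1 / 6])
print(r2n2_step(np.array([1.0]), lambda x: -x, A, b, h=0.1))
```

With the weights fixed in this way the step recovers a classical solver, which is the sense in which trained weights in the superstructure can come to resemble Krylov, Newton-Krylov, or Runge-Kutta iterations.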
arXiv Detail & Related papers (2022-11-22T16:30:33Z)
- Local and global topological complexity measures of ReLU neural network functions [0.0]
We apply a piecewise-linear (PL) version of Morse theory due to Grunert, Kühnel, and Rote to define and study new local and global notions of topological complexity.
We show how to construct, for each such F, a canonical polytopal complex K(F) and a deformation retract of the domain onto K(F), yielding a convenient compact model for performing calculations.
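For background only (not this paper's exact definitions), here is a hedged sketch of a classical Banchoff-style PL index computed from the lower link of a vertex; a nonzero index flags PL criticality, and the indices sum to the Euler characteristic of the complex.

```python
# Hedged background sketch: the Banchoff-style index 1 - chi(lower link) of a vertex,
# computed on a small simplicial model with a generic vertex function f.

from itertools import combinations

def euler_characteristic(simplices):
    return sum((-1) ** (len(s) - 1) for s in simplices)

def lower_link(v, top_simplices, f):
    """All faces spanned by vertices u adjacent to v (within a top simplex) with f[u] < f[v]."""
    faces = set()
    for s in top_simplices:
        if v not in s:
            continue
        lower = [u for u in s if f[u] < f[v]]
        for k in range(1, len(lower) + 1):
            faces.update(map(frozenset, combinations(lower, k)))
    return faces

def pl_index(v, top_simplices, f):
    return 1 - euler_characteristic(lower_link(v, top_simplices, f))

# A hollow square (a PL circle) with a height-like function: the minimum has index 1,
# the maximum has index -1, and the indices sum to chi(circle) = 0.
top = [frozenset(s) for s in [{0, 1}, {1, 2}, {2, 3}, {3, 0}]]
f = {0: 0.0, 1: 1.0, 2: 2.0, 3: 1.5}
print({v: pl_index(v, top, f) for v in f})
```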
arXiv Detail & Related papers (2022-04-12T19:49:13Z)
- A Sparse Coding Interpretation of Neural Networks and Theoretical Implications [0.0]
Deep convolutional neural networks have achieved unprecedented performance in various computer vision tasks.
We propose a sparse coding interpretation of neural networks with ReLU activations.
We derive a complete convolutional neural network without normalization and pooling.
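One common way to make the ReLU/sparse-coding link concrete, stated here as a hedged illustration rather than the paper's derivation: a ReLU layer relu(W x + b) coincides with a single nonnegative soft-thresholding (ISTA-style) update for a sparse code when W is the transposed dictionary and the bias encodes the sparsity penalty.

```python
# Hedged illustration of the ReLU / sparse-coding connection; not the paper's derivation.

import numpy as np

def relu_layer(x, W, b):
    return np.maximum(W @ x + b, 0.0)

def one_step_nonneg_ista(x, D, lam):
    """One ISTA step from z = 0 for min_z 0.5*||x - D z||^2 + lam*sum(z), z >= 0."""
    return np.maximum(D.T @ x - lam, 0.0)   # equals relu_layer with W = D.T, b = -lam

rng = np.random.default_rng(0)
x, D, lam = rng.normal(size=4), rng.normal(size=(4, 6)), 0.5
print(np.allclose(relu_layer(x, D.T, -lam), one_step_nonneg_ista(x, D, lam)))  # True
```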
arXiv Detail & Related papers (2021-08-14T21:54:47Z)
- Optimal Approximation with Sparse Neural Networks and Applications [0.0]
We use deep sparsely connected neural networks to measure the complexity of a function class in $L^2(\mathbb{R}^d)$.
We also introduce a representation system, a countable collection of functions used to guide neural network approximation.
We then analyse the complexity of the class of $\beta$ cartoon-like functions using rate-distortion theory and the wedgelet construction.
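For orientation, a hedged recollection of the benchmark usually attached to this function class in the approximation-theory literature, not verified against this paper's exact theorems: with $M$ nonzero network weights, the rate-distortion-optimal $L^2$ error for $\beta$ cartoon-like functions scales like $M^{-\beta/2}$ up to constants and logarithmic factors.

```latex
% Hedged background statement; constants and log factors are suppressed.
\[
  \inf_{\substack{\Phi \ \text{a ReLU network with} \\ \text{at most } M \text{ nonzero weights}}}
  \bigl\| f - \Phi \bigr\|_{L^2}
  \;\lesssim\; M^{-\beta/2}
  \qquad \text{for } f \text{ a } \beta\text{ cartoon-like function.}
\]
```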
arXiv Detail & Related papers (2021-08-14T05:14:13Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can make two classes of data linearly separable with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
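A toy sketch of the random construction mentioned here, with every specific choice (data, width, bias range) an assumption for illustration: freeze Gaussian first-layer weights and uniform biases, and fit only a linear readout on the resulting random ReLU features.

```python
# Hedged toy demonstration (concentric 2D classes); not the paper's experiment or bounds.

import numpy as np

rng = np.random.default_rng(0)
n, width = 200, 500
radius = np.concatenate([rng.uniform(0.0, 1.0, n), rng.uniform(2.0, 3.0, n)])
angle = rng.uniform(0, 2 * np.pi, 2 * n)
X = np.stack([radius * np.cos(angle), radius * np.sin(angle)], axis=1)
y = np.concatenate([-np.ones(n), np.ones(n)])

W = rng.normal(size=(2, width))              # standard Gaussian weights (frozen)
b = rng.uniform(-3.0, 3.0, width)            # uniformly distributed biases (frozen)
features = np.maximum(X @ W + b, 0.0)        # random ReLU feature map

coef, *_ = np.linalg.lstsq(features, y, rcond=None)   # linear readout only
print("training accuracy:", np.mean(np.sign(features @ coef) == y))
```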
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- Topological obstructions in neural networks learning [67.8848058842671]
We study global properties of the gradient flow of the loss function.
We use topological data analysis of the loss function and its Morse complex to relate local behavior along gradient trajectories with global properties of the loss surface.
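As a hedged illustration of the kind of topological summary meant here (not the paper's pipeline), the sketch below computes 0-dimensional sublevel-set persistence of a sampled one-dimensional "loss curve", pairing each local minimum with the level at which its basin merges into an older one.

```python
# Hedged toy illustration: 0-dimensional sublevel-set persistence of 1D samples.

import numpy as np

def sublevel_persistence_1d(y):
    """Return (birth, death) pairs of sublevel-set components of the samples y."""
    order = np.argsort(y)          # add sample points from the lowest value upward
    parent, birth, pairs = {}, {}, []

    def find(i):                   # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:
        parent[i], birth[i] = i, y[i]
        for j in (i - 1, i + 1):   # merge with neighbors already present
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:       # the younger component dies at the current level
                    young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                    pairs.append((float(birth[young]), float(y[i])))
                    parent[young] = old
    roots = {find(i) for i in parent}
    pairs.extend((float(birth[r]), float("inf")) for r in roots)
    return sorted(p for p in pairs if p[1] > p[0])   # drop zero-persistence pairs

loss = np.array([3.0, 1.0, 2.5, 0.5, 2.0, 1.5, 3.5])   # three local minima
print(sublevel_persistence_1d(loss))   # pairs: (0.5, inf), (1.0, 2.5), (1.5, 2.0)
```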
arXiv Detail & Related papers (2020-12-31T18:53:25Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
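A hedged sketch of such a min-max training loop, using a generic estimating-equation saddle objective as a stand-in; the paper's exact objective, architectures, and convergence analysis are not reproduced here.

```python
# Hedged sketch: a primal network f and an adversarial test-function network u trained
# by simultaneous gradient descent/ascent on a generic saddle objective. Illustrative only.

import torch

f = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
u = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
opt_f = torch.optim.SGD(f.parameters(), lr=1e-2)
opt_u = torch.optim.SGD(u.parameters(), lr=1e-2)

x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = 2.0 * x + 0.1 * torch.randn_like(x)        # stand-in data for a linear relation

for _ in range(2000):
    # Saddle objective: min_f max_u  E[u(x)(y - f(x))] - 0.5 E[u(x)^2]
    obj = (u(x) * (y - f(x))).mean() - 0.5 * (u(x) ** 2).mean()
    opt_f.zero_grad(); opt_u.zero_grad()
    obj.backward()
    for p in u.parameters():                   # flip gradients so u ascends
        p.grad = -p.grad
    opt_f.step(); opt_u.step()

slope = (f(torch.tensor([[0.5]])) - f(torch.tensor([[0.0]]))).item() / 0.5
print("approximate fitted slope near the origin:", slope)
```

Here the inner maximization over u reduces to a squared-residual objective for f, so the toy saddle point recovers ordinary least squares; the adversarial network replaces the closed-form inner solution when the operator equation has no such form.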
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation functions.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L^1$ and $L^2$ regularizations suppress the increase of model complexity.
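A hedged sketch of the underlying idea on a single activation: replace a curve activation (here tanh) with a piecewise linear interpolant and track how many linear pieces a given tolerance requires. LANN itself builds such approximations on a trained model, which this toy does not attempt.

```python
# Hedged toy: error of an equispaced piecewise linear interpolant of tanh.

import numpy as np

def pl_approx_error(fn, n_pieces, lo=-4.0, hi=4.0, grid=4001):
    """Max error of an equispaced piecewise linear interpolant of fn on [lo, hi]."""
    knots = np.linspace(lo, hi, n_pieces + 1)
    x = np.linspace(lo, hi, grid)
    return np.max(np.abs(np.interp(x, knots, fn(knots)) - fn(x)))

for k in (2, 4, 8, 16, 32):
    print(k, "pieces -> max |tanh - PL| =", round(pl_approx_error(np.tanh, k), 4))
```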
arXiv Detail & Related papers (2020-06-16T07:38:06Z)