The Geometry of ReLU Networks through the ReLU Transition Graph
- URL: http://arxiv.org/abs/2505.11692v2
- Date: Wed, 28 May 2025 21:46:04 GMT
- Title: The Geometry of ReLU Networks through the ReLU Transition Graph
- Authors: Sahil Rajesh Dhayalkar
- Abstract summary: We develop a novel theoretical framework for analyzing ReLU neural networks through the lens of an object we term the ReLU Transition Graph (RTG). In this graph, each node corresponds to a linear region induced by the network's activation patterns, and edges connect regions that differ by a single neuron flip. We derive a suite of new theoretical results connecting RTG geometry to expressivity, generalization, and robustness.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop a novel theoretical framework for analyzing ReLU neural networks through the lens of a combinatorial object we term the ReLU Transition Graph (RTG). In this graph, each node corresponds to a linear region induced by the network's activation patterns, and edges connect regions that differ by a single neuron flip. Building on this structure, we derive a suite of new theoretical results connecting RTG geometry to expressivity, generalization, and robustness. Our contributions include tight combinatorial bounds on RTG size and diameter, a proof of RTG connectivity, and graph-theoretic interpretations of VC-dimension. We also relate entropy and average degree of the RTG to generalization error. Each theoretical result is rigorously validated via carefully controlled experiments across varied network depths, widths, and data regimes. This work provides the first unified treatment of ReLU network structure via graph theory and opens new avenues for compression, regularization, and complexity control rooted in RTG analysis.
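The RTG is concrete enough to prototype directly from the definition above. The sketch below (an illustration, not the authors' code) builds a tiny random ReLU layer, enumerates the activation patterns it realizes over a dense 2-D input grid, and connects patterns at Hamming distance 1, i.e. regions separated by a single neuron flip. Grid sampling can miss thin regions, so the counts are approximate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ReLU layer: 16 hidden neurons over a 2-D input (illustrative weights).
W1 = rng.standard_normal((16, 2))
b1 = rng.standard_normal(16)

# Activation pattern of every point on a dense input grid, one row per point.
grid = np.linspace(-3.0, 3.0, 400)
X = np.stack(np.meshgrid(grid, grid), axis=-1).reshape(-1, 2)
patterns = {tuple(p) for p in (X @ W1.T + b1 > 0).astype(int)}

# RTG nodes are the realized patterns (linear regions); edges join patterns
# differing in exactly one neuron (Hamming distance 1), per the definition.
nodes = sorted(patterns)
edges = [(p, q)
         for i, p in enumerate(nodes)
         for q in nodes[i + 1:]
         if sum(a != b for a, b in zip(p, q)) == 1]

print(f"{len(nodes)} linear regions, {len(edges)} RTG edges")
```

From here, the quantities the paper studies, such as RTG diameter, average degree, and entropy, can be read off the graph with standard graph algorithms.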
Related papers
- Learning the Structure of Connection Graphs [13.687470962704744]
Connection graphs (CGs) extend traditional graph models by coupling network topology with transformations, enabling the representation of global geometric consistency. We propose a principled framework based on maximum pseudo-likelihood under a consistency assumption, which enforces spectral properties linking the connection Laplacian to the underlying Laplacian. We introduce the Structured Connection Graph Learning (SCGL) algorithm, a block-optimization procedure that jointly infers network topology, edge weights, and geometric structure.
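For orientation, the central object here is the connection Laplacian: a block matrix whose off-diagonal blocks carry the edge transformations. A minimal sketch, with a made-up graph and random rotations (SCGL's pseudo-likelihood optimization itself is not shown):

```python
import numpy as np

rng = np.random.default_rng(1)

def random_rotation(d):
    """Haar-random d x d orthogonal matrix via QR (illustrative edge map)."""
    Q, R = np.linalg.qr(rng.standard_normal((d, d)))
    return Q * np.sign(np.diag(R))

def connection_laplacian(n, d, edges, weights, O):
    """Diagonal blocks deg(i) * I_d; off-diagonal blocks -w_ij * O_ij."""
    L = np.zeros((n * d, n * d))
    deg = np.zeros(n)
    for (i, j), w in zip(edges, weights):
        deg[i] += w
        deg[j] += w
        L[i*d:(i+1)*d, j*d:(j+1)*d] = -w * O[(i, j)]
        L[j*d:(j+1)*d, i*d:(i+1)*d] = -w * O[(i, j)].T
    for i in range(n):
        L[i*d:(i+1)*d, i*d:(i+1)*d] = deg[i] * np.eye(d)
    return L

# Toy example: a 4-node cycle with 2-D rotations on its edges.
n, d = 4, 2
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
O = {e: random_rotation(d) for e in edges}
L = connection_laplacian(n, d, edges, [1.0] * 4, O)
print(np.round(np.linalg.eigvalsh(L), 3))  # spectrum the consistency assumption constrains
```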
arXiv Detail & Related papers (2025-10-13T10:33:31Z)
- PAC-Bayesian Generalization Bounds for Graph Convolutional Networks on Inductive Node Classification [20.5924958759845]
We present a PAC-Bayesian theoretical analysis of graph convolutional networks (GCNs) for inductive node classification. We derive novel generalization bounds for one-layer GCNs that explicitly incorporate the effects of data dependency and non-stationarity. We extend our analysis to two-layer GCNs and show that guaranteeing convergence requires stronger assumptions on the graph topology.
arXiv Detail & Related papers (2025-09-08T12:10:54Z)
- Discrete Functional Geometry of ReLU Networks via ReLU Transition Graphs [0.0]
We extend the ReLU Transition Graph (RTG) framework into a comprehensive graph-theoretic model for understanding deep ReLU networks. In this model, each node represents a linear activation region, and edges connect regions that differ by a single ReLU activation flip.
arXiv Detail & Related papers (2025-09-03T06:38:22Z)
- Generalization of Geometric Graph Neural Networks [84.01980526069075]
We study the generalization capabilities of geometric graph neural networks (GNNs).
We prove a bound on the generalization gap between the optimal empirical risk and the optimal statistical risk of this GNN.
The most important observation is that the generalization capability can be realized with one large graph, rather than being limited by the size of the graph as in previous results.
arXiv Detail & Related papers (2024-09-08T18:55:57Z)
- Generalization Error of Graph Neural Networks in the Mean-field Regime [10.35214360391282]
We explore two widely utilized types of graph neural networks: graph convolutional neural networks and message passing graph neural networks.
Our novel approach involves deriving upper bounds within the mean-field regime for evaluating the generalization error of these graph neural networks.
arXiv Detail & Related papers (2024-02-10T19:12:31Z)
- The Geometric Structure of Fully-Connected ReLU Layers [0.0]
We formalize and interpret the geometric structure of $d$-dimensional fully connected ReLU layers in neural networks.
We provide results on the geometric complexity of the decision boundary generated by such networks, and prove that, modulo an affine transformation, such a network can only generate $d$ different decision boundaries.
arXiv Detail & Related papers (2023-10-05T11:54:07Z)
- Relation Embedding based Graph Neural Networks for Handling Heterogeneous Graph [58.99478502486377]
We propose a simple yet efficient framework that equips homogeneous GNNs with the ability to handle heterogeneous graphs.
Specifically, we propose Relation Embedding based Graph Neural Networks (RE-GNNs), which employ only one parameter per relation to embed the importance of edge-type relations and self-loop connections.
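The per-relation scalar is simple to sketch. Below is a toy layer in this spirit (the relation names, normalization, and shared weight matrix are our assumptions, not the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(2)

def re_gnn_layer(X, rel_edges, rel_weight, self_weight, W):
    """One GCN-style layer where each relation contributes via a single
    learnable scalar, and self-loops get their own scalar."""
    H = self_weight * X
    for r, edges in rel_edges.items():
        for src, dst in edges:
            H[dst] = H[dst] + rel_weight[r] * X[src]
    return np.maximum(H @ W, 0.0)  # shared weights + ReLU, as in a plain GCN

# Hypothetical 2-relation graph on 5 nodes.
X = rng.standard_normal((5, 8))
rel_edges = {"cites": [(0, 1), (2, 1)], "written_by": [(3, 4), (1, 4)]}
rel_weight = {"cites": 0.7, "written_by": 1.3}  # one parameter per relation
W = rng.standard_normal((8, 4))
print(re_gnn_layer(X, rel_edges, rel_weight, 1.0, W).shape)  # (5, 4)
```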
arXiv Detail & Related papers (2022-09-23T05:24:18Z)
- Complex-Value Spatio-temporal Graph Convolutional Neural Networks and its Applications to Electric Power Systems AI [24.914412344973996]
We generalize graph convolutional neural networks (GCNs) to the complex domain.
We prove that complex-valued GCNs are stable with respect to perturbations of the underlying graph support.
We apply the complex GCN to power grid state forecasting, and to power grid attack detection and localization.
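A minimal sketch of what a complex-valued graph convolution can look like: a Hermitian complex shift operator, complex weights, and a modReLU-style nonlinearity (the nonlinearity is our choice for illustration; the paper's architecture may differ):

```python
import numpy as np

rng = np.random.default_rng(3)

def complex_gcn_layer(S, X, W, bias=0.1):
    """Graph convolution over the complex field with a modReLU nonlinearity:
    shrink the modulus, keep the phase."""
    Z = S @ X @ W
    return np.maximum(np.abs(Z) - bias, 0.0) * np.exp(1j * np.angle(Z))

n, f = 6, 4
A = rng.random((n, n)) * np.exp(1j * rng.uniform(0, 2 * np.pi, (n, n)))
S = (A + A.conj().T) / 2                      # Hermitian graph shift operator
X = rng.standard_normal((n, f)) + 1j * rng.standard_normal((n, f))
W = rng.standard_normal((f, f)) + 1j * rng.standard_normal((f, f))
print(complex_gcn_layer(S, X, W).dtype)       # complex128
```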
arXiv Detail & Related papers (2022-08-17T18:56:48Z)
- Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling [83.77955213766896]
Graph convolutional networks (GCNs) have recently achieved great empirical success in learning graph-structured data.
To address their scalability issue, graph topology sampling has been proposed to reduce the memory and computational cost of training GCNs.
This paper provides the first theoretical justification of graph topology sampling in training (up to) three-layer GCNs.
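To make the object of study concrete, here is a sketch of neighbour-sampled aggregation in one GCN layer; the specific sampling scheme and rescaling below are illustrative assumptions, not the estimator analyzed in the paper:

```python
import numpy as np

rng = np.random.default_rng(4)

def sampled_gcn_layer(adj_list, X, W, k):
    """GCN layer where each node aggregates over at most k sampled neighbours,
    capping per-node memory/compute -- the saving topology sampling targets."""
    H = np.zeros_like(X)
    for v, nbrs in enumerate(adj_list):
        if not nbrs:
            continue
        sample = rng.choice(nbrs, size=min(k, len(nbrs)), replace=False)
        # deg * sampled mean is an unbiased estimate of the full neighbour sum.
        H[v] = len(nbrs) * X[sample].mean(axis=0)
    return np.maximum(H @ W, 0.0)

X = rng.standard_normal((5, 3))
adj = [[1, 2, 3, 4], [0, 2], [0, 1], [0], [0]]
W = rng.standard_normal((3, 3))
print(sampled_gcn_layer(adj, X, W, k=2).shape)  # (5, 3)
```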
arXiv Detail & Related papers (2022-07-07T21:25:55Z)
- Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple piecewise linear solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
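In one input dimension the knot phenomenon is easy to see: a two-layer ReLU network is piecewise linear, with at most one knot per hidden unit at the input where that unit's ReLU switches. A small sketch (random weights and hypothetical data points, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# First layer of a two-layer ReLU net on 1-D inputs; knot locations depend
# only on these parameters (hidden unit i switches where w_i * x + b_i = 0).
m = 10
w = rng.standard_normal(m)
b = rng.standard_normal(m)

knots = -b / w                              # one potential knot per unit
data = np.array([-1.5, -0.3, 0.8, 2.1])     # hypothetical training inputs

off_data = [k for k in knots if np.abs(k - data).min() > 1e-6]
print(f"{len(off_data)} of {m} potential knots lie away from the training points")
```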
arXiv Detail & Related papers (2021-11-03T15:14:20Z)
- A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in the depth, in time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
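The surrogate construction can be sketched in a few lines: record the ReLU on/off pattern once, then run the network with that pattern frozen, which makes the map linear in each layer's parameters while the pattern holds (our reading of the construction; a sketch, not the paper's proof apparatus):

```python
import numpy as np

rng = np.random.default_rng(6)

def forward_fixed_pattern(x, W1, b1, w2, mask):
    """Forward pass with the ReLU gates frozen to `mask`: the data-dependent
    gate max(., 0) is replaced by a fixed 0/1 multiplier."""
    return w2 @ (mask * (W1 @ x + b1))

d, m = 3, 8
W1 = rng.standard_normal((m, d))
b1 = rng.standard_normal(m)
w2 = rng.standard_normal(m)
x = rng.standard_normal(d)

mask = (W1 @ x + b1 > 0).astype(float)      # activation pattern, recorded once
relu_out = w2 @ np.maximum(W1 @ x + b1, 0.0)
print(np.isclose(relu_out, forward_fixed_pattern(x, W1, b1, w2, mask)))  # True
```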
arXiv Detail & Related papers (2021-01-12T00:40:45Z)
- Scattering GCN: Overcoming Oversmoothness in Graph Convolutional Networks [0.0]
Graph convolutional networks (GCNs) have shown promising results in processing graph data by extracting structure-aware features.
Here, we propose to augment conventional GCNs with geometric scattering transforms and residual convolutions.
The former enables band-pass filtering of graph signals, thus alleviating the so-called oversmoothing often encountered in GCNs.
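For concreteness, geometric scattering typically builds its band-pass filters as diffusion wavelets Psi_j = P^(2^(j-1)) - P^(2^j) over the lazy random walk P; the sketch below assumes that standard construction (the paper's exact wavelets and the residual-convolution component are not reproduced):

```python
import numpy as np

def scattering_wavelets(A, J):
    """Band-pass diffusion wavelets Psi_j = P^(2^(j-1)) - P^(2^j), with P the
    lazy random walk; each Psi_j keeps a band of diffusion scales."""
    n = len(A)
    P = 0.5 * (np.eye(n) + A @ np.diag(1.0 / A.sum(axis=1)))
    wavelets, Pk = [], P                 # Pk holds P^(2^(j-1))
    for _ in range(J):
        P2k = Pk @ Pk                    # square to get P^(2^j)
        wavelets.append(Pk - P2k)
        Pk = P2k
    return wavelets

# Impulse on a 5-node path graph, filtered at three scales.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0
x = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
for j, Psi in enumerate(scattering_wavelets(A, J=3), start=1):
    print(f"||Psi_{j} x|| = {np.linalg.norm(Psi @ x):.3f}")
```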
arXiv Detail & Related papers (2020-03-18T18:03:08Z)
- Understanding Graph Neural Networks with Generalized Geometric Scattering Transforms [67.88675386638043]
The scattering transform is a multilayered wavelet-based deep learning architecture that acts as a model of convolutional neural networks.
We introduce windowed and non-windowed geometric scattering transforms for graphs based upon a very general class of asymmetric wavelets.
We show that these asymmetric graph scattering transforms have many of the same theoretical guarantees as their symmetric counterparts.
arXiv Detail & Related papers (2019-11-14T17:23:06Z)