Discrete Functional Geometry of ReLU Networks via ReLU Transition Graphs
- URL: http://arxiv.org/abs/2509.03056v2
- Date: Thu, 04 Sep 2025 01:30:08 GMT
- Title: Discrete Functional Geometry of ReLU Networks via ReLU Transition Graphs
- Authors: Sahil Rajesh Dhayalkar
- Abstract summary: We extend the ReLU Transition Graph (RTG) framework into a comprehensive graph-theoretic model for understanding deep ReLU networks. In this model, each node represents a linear activation region, and edges connect regions that differ by a single ReLU activation flip.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We extend the ReLU Transition Graph (RTG) framework into a comprehensive graph-theoretic model for understanding deep ReLU networks. In this model, each node represents a linear activation region, and edges connect regions that differ by a single ReLU activation flip, forming a discrete geometric structure over the network's functional behavior. We prove that RTGs at random initialization exhibit strong expansion, binomial degree distributions, and spectral properties that tightly govern generalization. These structural insights enable new bounds on capacity via region entropy and on generalization via spectral gap and edge-wise KL divergence. Empirically, we construct RTGs for small networks, measure their smoothness and connectivity properties, and validate theoretical predictions. Our results show that region entropy saturates under overparameterization, spectral gap correlates with generalization, and KL divergence across adjacent regions reflects functional smoothness. This work provides a unified framework for analyzing ReLU networks through the lens of discrete functional geometry, offering new tools to understand, diagnose, and improve generalization.
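The node and edge definitions above are concrete enough to prototype directly. Below is a minimal sketch (not the paper's code; the toy layer width, input range, and sampling density are arbitrary choices) that enumerates the activation regions a tiny ReLU layer realizes on sampled inputs and connects patterns at Hamming distance one, per the edge rule in the abstract:

```python
# Sketch of RTG construction for a toy ReLU layer: nodes are observed
# activation patterns, edges join patterns differing by one ReLU flip.
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy network: 2 inputs -> 4 hidden ReLU units.
W = rng.standard_normal((4, 2))
b = rng.standard_normal(4)

def activation_pattern(x):
    """Binary pattern recording which ReLU units are active at input x."""
    return tuple((W @ x + b > 0).astype(int))

# Sample the input space; this only finds regions the sampling actually hits.
xs = rng.uniform(-3, 3, size=(20000, 2))
nodes = {activation_pattern(x) for x in xs}

# Edge rule from the abstract: regions are adjacent iff their activation
# patterns differ in exactly one coordinate (a single ReLU activation flip).
edges = [(p, q) for p, q in itertools.combinations(nodes, 2)
         if sum(a != a2 for a, a2 in zip(p, q)) == 1]

print(f"{len(nodes)} regions, {len(edges)} RTG edges")
```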
Related papers
- Topology and Geometry of the Learning Space of ReLU Networks: Connectivity and Singularities [4.110453843035319]
We show that singularities are intricately connected to the topology of the underlying DAG and its induced sub-networks. We discuss the reachability of these singularities and establish a principled connection with differentiable pruning.
arXiv Detail & Related papers (2026-01-31T12:30:31Z) - Learning the Structure of Connection Graphs [13.687470962704744]
Connection graphs (CGs) extend traditional graph models by coupling network topology with transformations, enabling the representation of global geometric consistency. We propose a principled framework based on maximum pseudo-likelihood under a consistency assumption, which enforces spectral properties linking the connection Laplacian to the underlying Laplacian. We introduce the Structured Connection Graph Learning (SCGL) algorithm, a block-optimization procedure that jointly infers network topology, edge weights, and geometric structure.
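For readers unfamiliar with the object SCGL constrains, here is a minimal sketch of the standard connection Laplacian construction (the textbook definition with per-edge orthogonal maps; the toy graph, rotations, and fiber dimension d=2 are illustrative assumptions, not the paper's setup):

```python
# Standard connection Laplacian: L = D - A in blocks, where the (i, j)
# off-diagonal block is -w_ij * O_ij for an orthogonal edge map O_ij.
import numpy as np

def connection_laplacian(W, O, d):
    """W: (n, n) symmetric weights; O[i][j]: (d, d) orthogonal map on edge
    (i, j), with O[j][i] = O[i][j].T. Returns the (n*d, n*d) Laplacian."""
    n = W.shape[0]
    L = np.zeros((n * d, n * d))
    for i in range(n):
        L[i*d:(i+1)*d, i*d:(i+1)*d] = W[i].sum() * np.eye(d)
        for j in range(n):
            if i != j and W[i, j] > 0:
                L[i*d:(i+1)*d, j*d:(j+1)*d] = -W[i, j] * O[i][j]
    return L

# Toy example: 3-node path graph with a 2-D rotation on each edge.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
W = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
O = {0: {1: R}, 1: {0: R.T, 2: R}, 2: {1: R.T}}
L = connection_laplacian(W, O, d=2)
print(np.round(np.linalg.eigvalsh(L), 3))  # spectrum is real and >= 0
```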
arXiv Detail & Related papers (2025-10-13T10:33:31Z) - Constraining the outputs of ReLU neural networks [13.645092880691188]
We introduce a class of algebraic varieties naturally associated with ReLU neural networks. By analyzing the rank constraints on the network outputs within each activation region, we derive a structure that characterizes the functions representable by the network.
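As a concrete illustration of what is being rank-constrained, the sketch below (hypothetical layer sizes; not the paper's construction) computes the affine map f(y) = Cy + c that a ReLU network realizes on the activation region containing a point x, then reads off the rank of C:

```python
# Within one activation region a ReLU net is affine; this computes that
# affine map by freezing the ReLU pattern observed at a reference point x.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical layer sizes: 3 -> 5 -> 5 -> 4, ReLU on hidden layers only.
shapes = [(5, 3), (5, 5), (4, 5)]
Ws = [rng.standard_normal(s) for s in shapes]
bs = [rng.standard_normal(s[0]) for s in shapes]

def forward(x):
    h = np.asarray(x, float)
    for k, (W, b) in enumerate(zip(Ws, bs)):
        h = W @ h + b
        if k < len(Ws) - 1:
            h = np.maximum(h, 0.0)
    return h

def local_linear_map(x):
    """Affine map (C, c) with f(y) = C @ y + c on x's activation region."""
    C, c, h = np.eye(len(x)), np.zeros(len(x)), np.asarray(x, float)
    for k, (W, b) in enumerate(zip(Ws, bs)):
        C, c, h = W @ C, W @ c + b, W @ h + b
        if k < len(Ws) - 1:
            D = np.diag((h > 0).astype(float))  # freeze the ReLU pattern at x
            C, c, h = D @ C, D @ c, np.maximum(h, 0.0)
    return C, c

x = rng.standard_normal(3)
C, c = local_linear_map(x)
assert np.allclose(forward(x), C @ x + c)  # the affine map agrees at x
print("rank of the region's output map:", np.linalg.matrix_rank(C))
```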
arXiv Detail & Related papers (2025-08-05T19:30:11Z) - The Geometry of ReLU Networks through the ReLU Transition Graph [0.0]
We develop a novel theoretical framework for analyzing ReLU neural networks through the lens of an object we term the ReLU Transition Graph (RTG). In this graph, each node corresponds to a linear region induced by the network's activation patterns, and edges connect regions that differ by a single neuron flip. We derive a suite of new theoretical results connecting RTG geometry to expressivity, generalization, and robustness.
arXiv Detail & Related papers (2025-05-16T21:00:56Z) - Geometric Neural Process Fields [58.77241763774756]
Geometric Neural Process Fields (G-NPF) is a probabilistic framework for neural radiance fields that explicitly captures uncertainty. Building on these bases, we design a hierarchical latent variable model, allowing G-NPF to integrate structural information across multiple spatial levels. Experiments on novel-view synthesis for 3D scenes, as well as 2D image and 1D signal regression, demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2025-02-04T14:17:18Z) - Generalization of Scaled Deep ResNets in the Mean-Field Regime [55.77054255101667]
We investigate scaled ResNets in the limit of infinitely deep and wide neural networks.
Our results offer new insights into the generalization ability of deep ResNets beyond the lazy training regime.
arXiv Detail & Related papers (2024-03-14T21:48:00Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
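A toy version of the idea, in the spirit of a lookahead-style Krasnosel'skii-Mann averaging scheme for nonexpansive operators (the game, step sizes, and interpolation weight below are illustrative assumptions, not the paper's algorithm): take several candidate optimizer steps, then move only a fraction of the way toward the candidate point.

```python
# On the bilinear min-max game f(x, y) = x * y, plain gradient
# descent-ascent spirals outward; interpolating toward a k-step candidate
# point (z <- (1 - lam) * z + lam * candidate) contracts to the equilibrium.
import numpy as np

def gda_step(z, lr=0.5):
    """One simultaneous gradient descent-ascent step on f(x, y) = x * y."""
    x, y = z
    return np.array([x - lr * y,    # x minimizes: df/dx = y
                     y + lr * x])   # y maximizes: df/dy = x

z_plain = np.array([1.0, 1.0])      # vanilla GDA iterate
z_interp = z_plain.copy()           # interpolated iterate
lam, k = 0.3, 5                     # interpolation weight, candidate steps
for _ in range(40):
    z_plain = gda_step(z_plain)
    cand = z_interp
    for _ in range(k):              # compute a k-step candidate point...
        cand = gda_step(cand)
    z_interp = (1 - lam) * z_interp + lam * cand  # ...then interpolate
print("plain GDA norm:   ", np.linalg.norm(z_plain))    # blows up
print("interpolated norm:", np.linalg.norm(z_interp))   # contracts to ~0
```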
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - The Geometric Structure of Fully-Connected ReLU Layers [0.0]
We formalize and interpret the geometric structure of $d$-dimensional fully connected ReLU layers in neural networks.
We provide results on the geometric complexity of the decision boundary generated by such networks, and prove that, modulo an affine transformation, such a network can only generate $d$ different decision boundaries.
arXiv Detail & Related papers (2023-10-05T11:54:07Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
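The quantity being counted can be made concrete in a few lines (a sketch under the standard shallow univariate parameterization $f(x) = \sum_i v_i \, \mathrm{relu}(w_i x + b_i)$; the width and tolerances below are arbitrary choices):

```python
# Each unit contributes a candidate breakpoint at x = -b_i / w_i; a
# breakpoint is "effective" only if the slope of f actually changes there.
import numpy as np

rng = np.random.default_rng(2)
n = 10                                    # hypothetical width
w, b, v = (rng.standard_normal(n) for _ in range(3))

def slope(x):
    """Slope of f at a non-breakpoint x: sum of v_i * w_i over active units."""
    active = w * x + b > 0
    return float(np.sum(v[active] * w[active]))

breaks = np.sort(-b / w)                  # candidates (w_i != 0 a.s.)
eps = 1e-6
effective = sum(abs(slope(t + eps) - slope(t - eps)) > 1e-9 for t in breaks)
print(f"{len(breaks)} candidate breakpoints, {effective} effective,"
      f" {effective + 1} linear regions")
```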
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
arXiv Detail & Related papers (2021-11-03T15:14:20Z) - Spectral-Spatial Global Graph Reasoning for Hyperspectral Image Classification [50.899576891296235]
Convolutional neural networks have been widely applied to hyperspectral image classification.
Recent methods attempt to address the limitations of conventional convolutions by performing graph convolutions on spatial topologies.
arXiv Detail & Related papers (2021-06-26T06:24:51Z) - Ridge Regression with Over-Parametrized Two-Layer Networks Converge to Ridgelet Spectrum [10.05944106581306]
We show that the distribution of the parameters converges to a spectrum of the ridgelet transform.
This result provides a new insight into the characterization of the local minima of neural networks.
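For reference, the ridgelet transform in question is standardly defined as follows (notation is the textbook convention, assumed rather than quoted from the paper):

```latex
% Ridgelet transform of f : R^m -> R with respect to a wavelet psi : R -> R.
\mathcal{R}f(a, b)
  = \int_{\mathbb{R}^m} f(x)\, \overline{\psi(\langle a, x \rangle - b)}\, \mathrm{d}x,
\qquad (a, b) \in \mathbb{R}^m \times \mathbb{R}.
```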
arXiv Detail & Related papers (2020-07-07T13:47:57Z) - Understanding Graph Neural Networks with Generalized Geometric Scattering Transforms [67.88675386638043]
The scattering transform is a multilayered wavelet-based deep learning architecture that acts as a model of convolutional neural networks.
We introduce windowed and non-windowed geometric scattering transforms for graphs based upon a very general class of asymmetric wavelets.
We show that these asymmetric graph scattering transforms have many of the same theoretical guarantees as their symmetric counterparts.
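A minimal sketch of a graph scattering cascade (this uses the common lazy-random-walk diffusion wavelets, which are asymmetric for non-regular graphs; an assumed stand-in, not the paper's exact wavelet class):

```python
# Diffusion wavelets Psi_j = P^(2^(j-1)) - P^(2^j) from a lazy random walk P,
# cascaded with modulus nonlinearities and an averaging readout.
import numpy as np

def lazy_walk(A):
    """Lazy random-walk operator P = (I + A D^{-1}) / 2 (column-stochastic)."""
    d = A.sum(axis=0)
    return 0.5 * (np.eye(len(A)) + A / d)  # divide column j by degree d_j

def wavelets(P, J):
    """Diffusion wavelets Psi_j = P^(2^(j-1)) - P^(2^j), for j = 1..J."""
    Ps = [np.linalg.matrix_power(P, 2 ** j) for j in range(J + 1)]
    return [Ps[j - 1] - Ps[j] for j in range(1, J + 1)]

def scatter(x, Psis):
    """First- and second-order scattering coefficients of a graph signal x."""
    first = [np.abs(Psi @ x) for Psi in Psis]
    second = [np.abs(Pk @ u) for u in first for Pk in Psis]
    return [float(u.mean()) for u in first + second]  # low-pass readout

# Toy graph: a 5-node cycle carrying a single spike signal.
A = np.roll(np.eye(5), 1, axis=0) + np.roll(np.eye(5), -1, axis=0)
x = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
print(scatter(x, wavelets(lazy_walk(A), J=3)))
```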
arXiv Detail & Related papers (2019-11-14T17:23:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.