Beyond ReLU: Bifurcation, Oversmoothing, and Topological Priors
- URL: http://arxiv.org/abs/2602.15634v1
- Date: Tue, 17 Feb 2026 15:03:28 GMT
- Title: Beyond ReLU: Bifurcation, Oversmoothing, and Topological Priors
- Authors: Erkan Turan, Gaspard Abel, Maysam Behmanesh, Emery Pierson, Maks Ovsjanikov,
- Abstract summary: Graph Neural Networks (GNNs) learn node representations through iterative network-based message-passing. Deep GNNs suffer from oversmoothing, where node features converge to a homogeneous, non-informative state. We re-frame this problem of representational collapse from a \emph{bifurcation theory} perspective, characterizing oversmoothing as convergence to a stable homogeneous fixed point.
- Score: 28.452964044443906
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Graph Neural Networks (GNNs) learn node representations through iterative network-based message-passing. While powerful, deep GNNs suffer from oversmoothing, where node features converge to a homogeneous, non-informative state. We re-frame this problem of representational collapse from a \emph{bifurcation theory} perspective, characterizing oversmoothing as convergence to a stable ``homogeneous fixed point.'' Our central contribution is the theoretical discovery that this undesired stability can be broken by replacing standard monotone activations (e.g., ReLU) with a class of functions. Using Lyapunov-Schmidt reduction, we analytically prove that this substitution induces a bifurcation that destabilizes the homogeneous state and creates a new pair of stable, non-homogeneous \emph{patterns} that provably resist oversmoothing. Our theory predicts a precise, nontrivial scaling law for the amplitude of these emergent patterns, which we quantitatively validate in experiments. Finally, we demonstrate the practical utility of our theory by deriving a closed-form, bifurcation-aware initialization and showing its utility in real benchmark experiments.
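A minimal sketch of the mechanism described in the abstract, not the authors' construction: the ring graph, the continuous-depth-style update h ← h + τ(σ(Âh) − h), and the cubic non-monotone activation σ(x) = x³ − 1.5x are illustrative assumptions, with ReLU as the monotone baseline. Under the monotone activation the features settle into a homogeneous state, while the non-monotone choice destabilizes that state and a non-homogeneous pattern survives.

```python
# Hedged illustration (not the paper's code): oversmoothing as a stable homogeneous
# fixed point, destabilized by a non-monotone activation. Graph, dynamics, and the
# cubic activation below are assumptions chosen for a minimal pitchfork-type demo.
import numpy as np

rng = np.random.default_rng(0)
n = 32                                   # ring graph, no self-loops
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 1.0
A_hat = A / 2.0                          # symmetric normalization (every degree is 2)

def rel_dirichlet(h):
    """Dirichlet energy relative to the feature norm; ~0 means fully oversmoothed."""
    return float((A * (h[:, None] - h[None, :]) ** 2).sum() / (2 * (h @ h) + 1e-12))

def evolve(sigma, steps=2000, tau=0.1):
    h = 0.1 + 0.05 * rng.normal(size=n)  # small features with a slightly positive mean
    for _ in range(steps):
        h = h + tau * (sigma(A_hat @ h) - h)
    return h

relu = lambda x: np.maximum(x, 0.0)      # standard monotone activation
cubic = lambda x: x ** 3 - 1.5 * x       # hypothetical non-monotone activation

print("ReLU         relative Dirichlet energy:", rel_dirichlet(evolve(relu)))   # ~0: homogeneous state
print("non-monotone relative Dirichlet energy:", rel_dirichlet(evolve(cubic)))  # clearly > 0: a pattern survives
```

Near the instability threshold the surviving pattern's amplitude follows the usual square-root pitchfork scaling in this toy model; the paper derives its own precise law for the GNN setting.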
Related papers
- Generalization Bounds of Stochastic Gradient Descent in Homogeneous Neural Networks [29.858071115963472]
The resulting generalization bounds are broadly applicable, as homogeneous networks encompass fully-connected and convolutional neural networks with ReLU activations.
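A quick numerical check of the homogeneity property behind this class (my own sketch; architecture and sizes are illustrative): for a bias-free ReLU network of depth L, scaling every weight matrix by c > 0 scales the output by c**L.

```python
# Minimal sketch: verify degree-L positive homogeneity in the weights of a
# bias-free ReLU network. Sizes and depth below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(z, 0.0)

def forward(weights, x):
    h = x
    for W in weights[:-1]:
        h = relu(W @ h)          # no biases anywhere
    return weights[-1] @ h

L = 3
weights = [rng.normal(size=(16, 8)), rng.normal(size=(16, 16)), rng.normal(size=(1, 16))]
x = rng.normal(size=8)
c = 2.5

out = forward(weights, x)
out_scaled = forward([c * W for W in weights], x)
print(np.allclose(out_scaled, c ** L * out))   # True: scaling weights by c scales output by c**L
```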
arXiv Detail & Related papers (2026-02-26T12:26:32Z)
- Random-Matrix-Induced Simplicity Bias in Over-parameterized Variational Quantum Circuits [72.0643009153473]
We show that expressive variational ansätze enter a Haar-like universality class in which both observable expectation values and parameter gradients concentrate exponentially with system size. As a consequence, the hypothesis class induced by such circuits collapses with high probability to a narrow family of near-constant functions. We further show that this collapse is not unavoidable: tensor-structured VQCs, including tensor-network-based and tensor-hypernetwork parameterizations, lie outside the Haar-like universality class.
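A hedged illustration of the concentration phenomenon: Haar-random states are used below as a stand-in for an expressive circuit (the sampling scheme and observable are my assumptions, not the paper's construction), and the spread of a single-qubit expectation value shrinks exponentially with the number of qubits.

```python
# Hedged sketch: concentration of <Z on qubit 0> over Haar-random states,
# a proxy for the "Haar-like universality class" described above.
import numpy as np

rng = np.random.default_rng(0)

def z0_spread(n_qubits, n_samples=2000):
    """Standard deviation of <Z on qubit 0> over Haar-random n-qubit states."""
    dim = 2 ** n_qubits
    # Haar-random pure states: normalized complex Gaussian vectors.
    psi = rng.normal(size=(n_samples, dim)) + 1j * rng.normal(size=(n_samples, dim))
    psi /= np.linalg.norm(psi, axis=1, keepdims=True)
    # Z on qubit 0 is diagonal: +1 on the first half of the basis, -1 on the second.
    signs = np.concatenate([np.ones(dim // 2), -np.ones(dim // 2)])
    expvals = (np.abs(psi) ** 2) @ signs
    return expvals.std()

for n in range(2, 11):
    print(n, z0_spread(n))   # shrinks roughly like 2**(-n/2): expectation values concentrate at 0
```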
arXiv Detail & Related papers (2026-01-05T08:04:33Z)
- Unregularized Linear Convergence in Zero-Sum Game from Preference Feedback [50.89125374999765]
We provide the first convergence guarantee for Optimistic Multiplicative Weights Update ($\mathtt{OMWU}$) in NLHF. Our analysis identifies a novel marginal convergence behavior, where the probability of rarely played actions grows exponentially from exponentially small values.
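For reference, a hedged sketch of vanilla OMWU on a toy zero-sum matrix game (rock-paper-scissors); the NLHF / preference-feedback setting analyzed in the paper is not modeled here, and the step size is an arbitrary choice.

```python
# Hedged sketch: last-iterate behavior of Optimistic Multiplicative Weights Update
# (OMWU) on rock-paper-scissors, whose unique Nash equilibrium is uniform play.
import numpy as np

A = np.array([[0., -1.,  1.],
              [1.,  0., -1.],
              [-1., 1.,  0.]])            # row player's payoff matrix
eta = 0.1
x = np.ones(3) / 3                        # row player's mixed strategy (maximizer)
y = np.ones(3) / 3                        # column player's mixed strategy (minimizer)
gx_prev, gy_prev = A @ y, A.T @ x         # previous gradients for the optimistic step

for t in range(2000):
    gx, gy = A @ y, A.T @ x
    # Optimistic update: use 2*g_t - g_{t-1} as a prediction of the next gradient.
    x = x * np.exp(eta * (2 * gx - gx_prev)); x /= x.sum()
    y = y * np.exp(-eta * (2 * gy - gy_prev)); y /= y.sum()
    gx_prev, gy_prev = gx, gy

duality_gap = (A @ y).max() - (A.T @ x).min()
print(x, y, duality_gap)                  # both iterates drift toward uniform; the gap shrinks
```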
arXiv Detail & Related papers (2025-12-31T12:08:29Z)
- From Tail Universality to Bernstein-von Mises: A Unified Statistical Theory of Semi-Implicit Variational Inference [0.12183405753834557]
Semi-implicit variational inference (SIVI) constructs approximate posteriors of the form $q(\theta) = \int k(\theta \mid z)\, r(\mathrm{d}z)$. This paper develops a unified ``approximation-optimization-statistics'' theory for such families.
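A hedged sketch of the semi-implicit form itself: sample z from a mixing distribution r, then θ from an explicit kernel k(·|z); the marginal q can be richer than the kernel alone. The Gaussian mixing distribution and tanh-mean kernel below are illustrative choices, not the paper's.

```python
# Hedged sketch of q(theta) = ∫ k(theta | z) r(dz) via Monte Carlo.
import numpy as np

rng = np.random.default_rng(0)

def sample_q(n):
    z = rng.normal(size=n)                               # z ~ r = N(0, 1)
    return rng.normal(loc=np.tanh(3 * z), scale=0.3)     # theta | z ~ k = N(tanh(3z), 0.3^2)

def density_q(theta_grid, n_mc=20000):
    """Monte Carlo estimate of q(theta): average the kernel density over z ~ r."""
    z = rng.normal(size=n_mc)
    mu = np.tanh(3 * z)
    k = np.exp(-(theta_grid[:, None] - mu[None, :]) ** 2 / (2 * 0.3 ** 2))
    k /= np.sqrt(2 * np.pi) * 0.3
    return k.mean(axis=1)

grid = np.linspace(-2, 2, 9)
print(density_q(grid))   # bimodal marginal: richer than any single Gaussian kernel
```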
arXiv Detail & Related papers (2025-12-05T19:26:25Z)
- Integral Signatures of Activation Functions: A 9-Dimensional Taxonomy and Stability Theory for Deep Learning [0.22399170518036912]
Activation functions govern the expressivity and stability of neural networks. We propose a rigorous framework for their classification via a nine-dimensional integral signature $S_\sigma(\phi)$. Our framework provides principled design guidance, moving activation choice from trial-and-error to provable stability and kernel conditioning.
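To make the idea of an integral signature concrete, here is a hedged sketch that computes a few Gaussian-moment functionals of common activations by Monte Carlo. These particular statistics are illustrative stand-ins; they are not the paper's nine components.

```python
# Hedged sketch: characterize activations by a handful of integral statistics
# under a standard Gaussian input (illustrative only, not the paper's signature).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200_000)                 # x ~ N(0, 1)

def signature(phi, eps=1e-3):
    dphi = (phi(x + eps) - phi(x - eps)) / (2 * eps)   # numerical derivative
    return {
        "mean":     phi(x).mean(),                     # E[phi(x)]
        "second":   (phi(x) ** 2).mean(),              # E[phi(x)^2]
        "slope":    dphi.mean(),                       # E[phi'(x)]
        "slope_sq": (dphi ** 2).mean(),                # E[phi'(x)^2]
        "x_corr":   (x * phi(x)).mean(),               # E[x * phi(x)]
    }

acts = {
    "relu": lambda t: np.maximum(t, 0.0),
    "tanh": np.tanh,
    "gelu": lambda t: 0.5 * t * (1 + np.tanh(np.sqrt(2 / np.pi) * (t + 0.044715 * t ** 3))),
}
for name, phi in acts.items():
    print(name, {k: round(float(v), 3) for k, v in signature(phi).items()})
```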
arXiv Detail & Related papers (2025-10-09T17:03:00Z)
- A Signed Graph Approach to Understanding and Mitigating Oversmoothing in GNNs [54.62268052283014]
We present a unified theoretical perspective based on the framework of signed graphs. We show that many existing strategies implicitly introduce negative edges that alter message-passing to resist oversmoothing. We propose Structural Balanced Propagation (SBP), a plug-and-play method that assigns signed edges based on either labels or feature similarity.
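A hedged sketch in the spirit of this idea: edges between feature-similar nodes keep a positive sign, dissimilar ones become negative, and messages are passed over the signed, normalized adjacency. The similarity threshold and normalization below are my assumptions, not the paper's exact recipe.

```python
# Hedged sketch: sign edges by feature similarity, then propagate over the signed graph.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 4
X = rng.normal(size=(n, d))                         # node features
A = (rng.random((n, n)) < 0.2).astype(float)
A = np.triu(A, 1); A = A + A.T                      # random undirected graph

# Sign each edge by the cosine similarity of its endpoints' features.
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
S = A * np.where(Xn @ Xn.T >= 0.0, 1.0, -1.0)       # signed adjacency

deg = np.abs(S).sum(1) + 1e-12                      # normalize by absolute degree
P = S / deg[:, None]
H = X.copy()
for _ in range(10):
    H = 0.5 * H + 0.5 * (P @ H)                     # signed message passing with a residual term

print("per-dimension spread after signed propagation:", np.round(H.std(axis=0), 3))
```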
arXiv Detail & Related papers (2025-02-17T03:25:36Z)
- Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
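A minimal illustration of the stabilizing effect of linear interpolation toward an anchor on a nonmonotone toy problem; the bilinear game, inner-step count, and interpolation weight are assumptions, not the paper's setup.

```python
# Hedged sketch: simultaneous gradient descent-ascent on min_x max_y x*y spirals
# outward, while linearly interpolating back toward an anchor after a few inner
# steps (lookahead-style averaging) contracts toward the equilibrium.
import numpy as np

def gda_steps(z, k, eta=0.1):
    x, y = z
    for _ in range(k):
        x, y = x - eta * y, y + eta * x   # simultaneous descent-ascent on f(x, y) = x*y
    return np.array([x, y])

z_plain = np.array([1.0, 1.0])
z_interp = np.array([1.0, 1.0])
alpha = 0.5                               # interpolation weight toward the inner iterate

for t in range(200):
    z_plain = gda_steps(z_plain, 1)                                     # plain GDA
    z_interp = (1 - alpha) * z_interp + alpha * gda_steps(z_interp, 5)  # interpolated

print("plain GDA    |z| =", np.linalg.norm(z_plain))    # grows over iterations
print("interpolated |z| =", np.linalg.norm(z_interp))   # shrinks toward the equilibrium (0, 0)
```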
arXiv Detail & Related papers (2023-10-20T12:45:12Z)
- Rank Collapse Causes Over-Smoothing and Over-Correlation in Graph Neural Networks [3.566568169425391]
We show that with increased depth, node representations become dominated by a low-dimensional subspace that depends on the aggregation function but not on the feature transformations.
For all aggregation functions, the rank of the node representations collapses, resulting in over-smoothing for particular aggregation functions.
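A hedged sketch of the rank-collapse phenomenon: track an entropy-based effective rank of the node representations under repeated mean aggregation. The random graph, depth, and rank measure below are illustrative; the paper's analysis covers several aggregation functions.

```python
# Hedged sketch: effective rank of node features collapses with depth under mean aggregation.
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 16
A = (rng.random((n, n)) < 0.1).astype(float)
A = np.triu(A, 1); A = A + A.T + np.eye(n)            # undirected graph with self-loops
P = A / A.sum(1, keepdims=True)                       # row-normalized (mean) aggregation

def effective_rank(H):
    s = np.linalg.svd(H, compute_uv=False)
    p = s / s.sum()
    return float(np.exp(-(p * np.log(p + 1e-12)).sum()))   # entropy-based effective rank

H = rng.normal(size=(n, d))
print(0, round(effective_rank(H), 2))
for layer in range(60):
    H = np.tanh(P @ H)                                # aggregation + element-wise nonlinearity
    if layer % 15 == 14:
        print(layer + 1, round(effective_rank(H), 2))  # drifts toward 1: a low-dimensional subspace dominates
```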
arXiv Detail & Related papers (2023-08-31T15:22:31Z)
- Analysis of Graph Neural Networks with Theory of Markov Chains [2.017675281820044]
We study \emph{over-smoothing}, which is an important problem in GNN research.
We conclude that operator-consistent GNNs cannot avoid over-smoothing at an exponential rate in the Markovian sense.
We propose a regularization term which can be flexibly added to the training of the neural network.
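The summary above does not spell out the regularizer, so the following is only a generic, hedged illustration of the "flexibly added to training" pattern: a Dirichlet-energy-based penalty that discourages feature homogenization. It is not the paper's term.

```python
# Hedged sketch: add a generic anti-over-smoothing penalty to a training loss.
import numpy as np

def dirichlet_energy(H, A):
    """Sum over edges of squared feature differences (small => over-smoothed)."""
    diff = H[:, None, :] - H[None, :, :]
    return float((A[..., None] * diff ** 2).sum() / 2)

def total_loss(task_loss, H, A, lam=0.1):
    # Penalize small Dirichlet energy so node features stay distinguishable.
    return task_loss + lam / (dirichlet_energy(H, A) + 1e-6)

# Toy usage: in a real pipeline, H would be the GNN's node representations and
# task_loss the differentiable supervised loss computed in the same framework.
rng = np.random.default_rng(0)
A = np.ones((4, 4)) - np.eye(4)
H = rng.normal(size=(4, 3))
print(total_loss(task_loss=1.0, H=H, A=A))
```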
arXiv Detail & Related papers (2022-11-12T08:03:57Z)
- Multi-fidelity Stability for Graph Representation Learning [38.31487722188051]
We introduce a weaker form of uniform generalization, termed \emph{multi-fidelity stability}, and give an example.
We present lower bounds for the discrepancy between the two types of stability, which justify the multi-fidelity design.
arXiv Detail & Related papers (2021-11-25T01:33:41Z)
- On Convergence of Training Loss Without Reaching Stationary Points [62.41370821014218]
We show that neural network weight variables do not converge to stationary points where the gradient of the loss function vanishes.
We propose a new perspective based on the ergodic theory of dynamical systems.
arXiv Detail & Related papers (2021-10-12T18:12:23Z)
- Stability of Neural Networks on Manifolds to Relative Perturbations [118.84154142918214]
Graph Neural Networks (GNNs) show impressive performance in many practical scenarios.
In practice GNNs scale well to large graphs, yet existing stability bounds grow with the number of nodes, contradicting that observation.
arXiv Detail & Related papers (2021-10-10T04:37:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.