Geometric separation and constructive universal approximation with two hidden layers
- URL: http://arxiv.org/abs/2602.12482v1
- Date: Thu, 12 Feb 2026 23:46:11 GMT
- Title: Geometric separation and constructive universal approximation with two hidden layers
- Authors: Chanyoung Sung
- Abstract summary: We show that networks with two hidden layers and either a sigmoidal activation (i.e., strictly monotone bounded continuous) or the ReLU activation can approximate any real-valued continuous function. For finite $K$, the construction simplifies and yields a sharp depth-2 (single hidden layer) approximation result.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We give a geometric construction of neural networks that separate disjoint compact subsets of $\Bbb R^n$, and use it to obtain a constructive universal approximation theorem. Specifically, we show that networks with two hidden layers and either a sigmoidal activation (i.e., strictly monotone bounded continuous) or the ReLU activation can approximate any real-valued continuous function on an arbitrary compact set $K\subset\Bbb R^n$ to any prescribed accuracy in the uniform norm. For finite $K$, the construction simplifies and yields a sharp depth-2 (single hidden layer) approximation result.
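As a concrete illustration of what the statement guarantees (not the paper's geometric-separation construction), the sketch below hand-builds a two-hidden-layer ReLU network in NumPy that uniformly approximates $f(x)=\sin x$ on the compact set $K=[-\pi,\pi]$. The target function, knot grid, and weights are choices made for this example only; refining the grid drives the sup-norm error on $K$ toward zero.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Hypothetical illustration (not the paper's construction): a hand-built
# two-hidden-layer ReLU network that uniformly approximates f(x) = sin(x)
# on the compact set K = [-pi, pi] via piecewise-linear interpolation.
f = np.sin
knots = np.linspace(-np.pi, np.pi, 41)       # grid on K; a finer grid gives a smaller error
h = knots[1] - knots[0]

# Hidden layer 1: ReLU ridge functions r_j(x) = ReLU(x - t_j) at shifted knots.
t = np.concatenate([[knots[0] - h], knots])  # one extra shift to the left of K

def hidden1(x):
    return relu(x[:, None] - t[None, :])     # shape (batch, len(t))

# Hidden layer 2: each unit forms a "hat" centred at a knot,
#   hat_i(x) = ReLU( r_i(x) - 2 r_{i+1}(x) + r_{i+2}(x) ) / h,
# which is nonnegative on K, so the second ReLU passes it through unchanged.
W2 = np.zeros((len(t), len(knots)))
for i in range(len(knots)):
    W2[i, i] = 1.0
    W2[i + 1, i] = -2.0
    if i + 2 < len(t):
        W2[i + 2, i] = 1.0
W2 /= h

def hidden2(a1):
    return relu(a1 @ W2)                     # hats form a partition of unity on K

# Output layer: weight each hat by the target value at its knot.
def net(x):
    return hidden2(hidden1(x)) @ f(knots)

x = np.linspace(-np.pi, np.pi, 2000)
print("sup-norm error on K:", np.max(np.abs(net(x) - f(x))))  # shrinks as the grid is refined
```

The first hidden layer supplies ReLU ridge functions, the second combines them into nonnegative hat functions, and the output layer weights each hat by the target value at its knot. This mirrors the partition-of-unity flavor of constructive approximation arguments in one dimension, whereas the paper's actual proof proceeds through geometric separation of disjoint compact subsets of $\Bbb R^n$.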
Related papers
- Constructive counterexamples to the additivity of minimum output Rényi entropy of quantum channels for all $p>1$ [0.29465623430708904]
We present explicit quantum channels with strictly sub-additive minimum output Rényi entropy for all $p>1$. Our example is provided by explicit constructions of linear subspaces with high geometric measure of entanglement.
arXiv Detail & Related papers (2025-10-08T21:02:55Z) - Constructive Universal Approximation and Finite Sample Memorization by Narrow Deep ReLU Networks [0.0]
We show that any dataset with $N$ distinct points in $\mathbb{R}^d$ and $M$ output classes can be exactly classified. We also prove a universal approximation theorem in $L^p(\Omega; \mathbb{R}^m)$ for any bounded domain. Our results offer a unified and interpretable framework connecting controllability, expressivity, and training dynamics in deep neural networks.
arXiv Detail & Related papers (2024-09-10T14:31:21Z) - Deep Ridgelet Transform and Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines [15.67299102925013]
We present a constructive universal approximation theorem for learning machines equipped with joint-group-equivariant feature maps. Our main theorem also unifies the universal approximation theorems for both shallow and deep networks.
arXiv Detail & Related papers (2024-05-22T14:25:02Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space for modeling functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Polynomial Width is Sufficient for Set Representation with
High-dimensional Features [69.65698500919869]
DeepSets is the most widely used neural network architecture for set representation.
We present two set-element embedding layers: (a) linear + power activation (LP) and (b) linear + exponential activations (LE).
arXiv Detail & Related papers (2023-07-08T16:00:59Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - The Sample Complexity of One-Hidden-Layer Neural Networks [57.6421258363243]
We study a class of scalar-valued one-hidden-layer networks with inputs bounded in Euclidean norm.
We prove that controlling the spectral norm of the hidden layer weight matrix is insufficient to get uniform convergence guarantees.
We analyze two important settings where a mere spectral norm control turns out to be sufficient.
arXiv Detail & Related papers (2022-02-13T07:12:02Z) - Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations.
We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z) - Quantitative Rates and Fundamental Obstructions to Non-Euclidean
Universal Approximation with Deep Narrow Feed-Forward Networks [3.8073142980733]
We quantify the number of narrow layers required for "deep geometric feed-forward neural networks".
We find that both the global and local universal approximation guarantees can only coincide when approximating null-homotopic functions.
arXiv Detail & Related papers (2021-01-13T23:29:40Z) - Universal Approximation Property of Neural Ordinary Differential
Equations [19.861764482790544]
We show that NODEs can form an $L^p$-universal approximator for continuous maps under certain conditions.
We also show their stronger approximation property, namely the $\sup$-universality for approximating a large class of diffeomorphisms.
arXiv Detail & Related papers (2020-12-04T05:53:21Z) - Universal Approximation Power of Deep Residual Neural Networks via
Nonlinear Control Theory [9.210074587720172]
We explain the universal approximation capabilities of deep residual neural networks through geometric nonlinear control.
Inspired by recent work establishing links between residual networks and control systems, we provide a general sufficient condition for a residual network to have the power of universal approximation.
arXiv Detail & Related papers (2020-07-12T14:53:30Z) - Minimum Width for Universal Approximation [91.02689252671291]
We prove that the minimum width required for the universal approximation of the $L^p$ functions is exactly $\max\{d_x+1,\,d_y\}$.
We also prove that the same conclusion does not hold for the uniform approximation with ReLU, but does hold with an additional threshold activation function.
arXiv Detail & Related papers (2020-06-16T01:24:21Z) - Revealing the Structure of Deep Neural Networks via Convex Duality [70.15611146583068]
We study regularized deep neural networks (DNNs) and introduce a convex analytic framework to characterize the structure of hidden layers.
We show that a set of optimal hidden layer weights for a norm regularized training problem can be explicitly found as the extreme points of a convex set.
We apply the same characterization to deep ReLU networks with whitened data and prove the same weight alignment holds.
arXiv Detail & Related papers (2020-02-22T21:13:44Z)