Sobolev Approximation of Deep ReLU Network in Log-weighted Barron Space
- URL: http://arxiv.org/abs/2601.01295v1
- Date: Sat, 03 Jan 2026 22:03:19 GMT
- Title: Sobolev Approximation of Deep ReLU Network in Log-weighted Barron Space
- Authors: Changhoon Song, Seungchan Ko, Youngjoon Hong,
- Abstract summary: We introduce a log-weighted Barron space $mathscrBlog$.<n>We prove that functions in $mathscrBlog$ can be approximated by deep ReLU networks with explicit depth dependence.
- Score: 8.36932842016193
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Universal approximation theorems show that neural networks can approximate any continuous function; however, the number of parameters may grow exponentially with the ambient dimension, so these results do not fully explain the practical success of deep models on high-dimensional data. Barron space theory addresses this: if a target function belongs to a Barron space, a two-layer network with $n$ parameters achieves an $O(n^{-1/2})$ approximation error in $L^2$. Yet classical Barron spaces $\mathscr{B}^{s+1}$ still require stronger regularity than Sobolev spaces $H^s$, and existing depth-sensitive results often assume constraints such as $sL \le 1/2$. In this paper, we introduce a log-weighted Barron space $\mathscr{B}^{\log}$, which requires a strictly weaker assumption than $\mathscr{B}^s$ for any $s>0$. For this new function space, we first study embedding properties and carry out a statistical analysis via the Rademacher complexity. Then we prove that functions in $\mathscr{B}^{\log}$ can be approximated by deep ReLU networks with explicit depth dependence. We then define a family $\mathscr{B}^{s,\log}$, establish approximation bounds in the $H^1$ norm, and identify maximal depth scales under which these rates are preserved. Our results clarify how depth reduces regularity requirements for efficient representation, offering a more precise explanation for the performance of deep architectures beyond the classical Barron setting, and for their stable use in high-dimensional problems used today.
Related papers
- High-accuracy sampling for diffusion models and log-concave distributions [70.90863485771405]
We present algorithms for diffusion model sampling which obtain $$-error in $mathrmpolylog (1/)$ steps.<n>Our approach also yields the first $mathrmpolylog (1/)$ complexity sampler for general log-concave distributions.
arXiv Detail & Related papers (2026-02-01T17:05:31Z) - Nonlocal techniques for the analysis of deep ReLU neural network approximations [0.0]
Recently, Daubechies, DeVore, Foucart, Hanin, and Petrova introduced a system of piece-wise linear functions.<n>We show that this system serves as a Riesz basis also for Sobolev spaces and Barron classes.
arXiv Detail & Related papers (2025-04-07T09:00:22Z) - On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks [2.4112990554464235]
We show that the rate $mathcalO((WL)-2s/d)$ indeed holds under the Sobolev embedding condition.<n>Key tool in our proof is a novel encoding of sparse vectors by using deep ReLU neural networks with varied width and depth.
arXiv Detail & Related papers (2024-09-02T02:26:01Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(boldsymbolx) = textstylesigma_*left(langleboldsymbolx,boldsymbolthetarangleright)$<n>We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by information exponents.
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over- parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Effective Minkowski Dimension of Deep Nonparametric Regression: Function
Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that input data are concentrated around a subset of $mathbbRd$ denoted by $mathcalS$, and the intrinsic dimension $mathcalS$ can be characterized by a new complexity notation -- effective Minkowski dimension.
arXiv Detail & Related papers (2023-06-26T17:13:31Z) - Optimal Approximation Rates for Deep ReLU Neural Networks on Sobolev and Besov Spaces [2.7195102129095003]
Deep neural networks with the ReLU activation function can approximate functions in the Sobolev spaces $Ws(L_q(Omega))$ and Besov spaces $Bs_r(L_q(Omega))$.
This problem is important when studying the application of neural networks in a variety of fields.
arXiv Detail & Related papers (2022-11-25T23:32:26Z) - Understanding Deep Neural Function Approximation in Reinforcement
Learning via $\epsilon$-Greedy Exploration [53.90873926758026]
This paper provides a theoretical study of deep neural function approximation in reinforcement learning (RL)
We focus on the value based algorithm with the $epsilon$-greedy exploration via deep (and two-layer) neural networks endowed by Besov (and Barron) function spaces.
Our analysis reformulates the temporal difference error in an $L2(mathrmdmu)$-integrable space over a certain averaged measure $mu$, and transforms it to a generalization problem under the non-iid setting.
arXiv Detail & Related papers (2022-09-15T15:42:47Z) - Low-degree learning and the metric entropy of polynomials [44.99833362998488]
We prove that any (deterministic or randomized) algorithm which learns $mathscrF_nd$ with $L$-accuracy $varepsilon$ requires at least $Omega(sqrtvarepsilon)2dlog n leq log mathsfM(mathscrF_n,d,|cdot|_L,varepsilon) satisfies the two-sided estimate $$c (1-varepsilon)2dlog
arXiv Detail & Related papers (2022-03-17T23:52:08Z) - Optimal Approximation Rates and Metric Entropy of ReLU$^k$ and Cosine
Networks [0.0]
We show that the largest Banach space of functions which can be efficiently approximated by the corresponding shallow neural networks is the space whose norm is given by the gauge of the closed convex hull of the set $pmsigma(omegacdot x + b)$.
We establish the precises of the $L2$-metric entropy of the unit ball of these guage spaces and, as a consequence, the optimal approximation rates for shallow ReLU$k$ networks.
arXiv Detail & Related papers (2021-01-29T02:29:48Z) - A deep network construction that adapts to intrinsic dimensionality
beyond the domain [79.23797234241471]
We study the approximation of two-layer compositions $f(x) = g(phi(x))$ via deep networks with ReLU activation.
We focus on two intuitive and practically relevant choices for $phi$: the projection onto a low-dimensional embedded submanifold and a distance to a collection of low-dimensional sets.
arXiv Detail & Related papers (2020-08-06T09:50:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.