Covering Number of Real Algebraic Varieties and Beyond: Improved Bounds and Applications
- URL: http://arxiv.org/abs/2311.05116v4
- Date: Fri, 06 Jun 2025 17:56:11 GMT
- Title: Covering Number of Real Algebraic Varieties and Beyond: Improved Bounds and Applications
- Authors: Yifan Zhang, Joe Kileel
- Abstract summary: We prove upper bounds on the covering number of numerous sets in Euclidean space. We illustrate the power of the result on three computational applications.
- Score: 8.438718130535296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Covering numbers are a powerful tool used in the development of approximation algorithms, randomized dimension reduction methods, smoothed complexity analysis, and others. In this paper we prove upper bounds on the covering number of numerous sets in Euclidean space, namely real algebraic varieties, images of polynomial maps and semialgebraic sets in terms of the number of variables and degrees of the polynomials involved. The bounds remarkably improve the best known general bound by Yomdin-Comte, and our proof is much more straightforward. In particular, our result gives new bounds on the volume of the tubular neighborhood of the image of a polynomial map and a semialgebraic set, where results for varieties by Lotz and Basu-Lerario are not directly applicable. We illustrate the power of the result on three computational applications. Firstly, we derive a near-optimal bound on the covering number of tensors with low canonical polyadic (CP) rank, quantifying their approximation properties and filling in an important missing piece of theory for tensor dimension reduction and reconstruction. Secondly, we prove a bound on dimensionality reduction of images of polynomial maps via randomized sketching, which has direct applications to large scale polynomial optimization. Finally, we deduce generalization error bounds for deep neural networks with rational or ReLU activation functions, improving or matching the best known results in the machine learning literature while helping to quantify the impact of architecture choice on generalization error.
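To make the dimensionality-reduction application concrete, here is a minimal numerical sketch, assuming nothing from the paper beyond the general setup: the polynomial map, the dimensions, and the sample sizes below are arbitrary illustrative choices, and a plain Gaussian sketch stands in for the randomized sketching the abstract refers to.

```python
# Illustrative sketch (not the paper's algorithm): sample points on the image
# of a low-degree polynomial map, apply a Gaussian random sketch, and inspect
# how well pairwise distances are preserved. All sizes are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)

def poly_map(x):
    # Degree-2 polynomial map R^d -> R^(d + d^2): coordinates plus all products.
    return np.concatenate([x, np.outer(x, x).ravel()])

def pairwise_dists(A):
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))[np.triu_indices(len(A), k=1)]

n, d, k = 200, 5, 12                                   # samples, input dim, sketch dim
X = np.array([poly_map(x) for x in rng.standard_normal((n, d))])  # points on the image
S = rng.standard_normal((k, X.shape[1])) / np.sqrt(k)             # Gaussian sketching matrix

ratio = pairwise_dists(X @ S.T) / pairwise_dists(X)
print(f"distance distortion in [{ratio.min():.3f}, {ratio.max():.3f}]")
```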
Related papers
- On Uniform Weighted Deep Polynomial approximation [0.0]
We introduce and analyze a class of weighted deep approximants tailored for functions with asymmetric behavior, growing on one side and decaying on the other. We show numerically that this framework outperforms Taylor, Chebyshev, and standard deep approximants, even when all use the same number of parameters. (A toy weighted fit is sketched after this entry.)
arXiv Detail & Related papers (2025-06-26T14:25:32Z)
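A toy version of the weighted-approximant idea, under strong simplifying assumptions: the paper studies deep approximants, while this sketch fits a single weighted polynomial; the target function, weight, and degree are made up for illustration.

```python
# Toy weighted polynomial fit (not the paper's deep construction): multiply a
# polynomial by a fixed one-sided weight so the product grows on one side and
# decays on the other. Target, weight, and degree are arbitrary choices.
import numpy as np

x = np.linspace(-3.0, 3.0, 400)
f = np.exp(-x) * np.cos(x)   # grows as x -> -inf, decays as x -> +inf
w = np.exp(-x)               # one-sided weight capturing the asymmetry

# Fit p in f(x) ~ w(x) * p(x) by ordinary polynomial least squares on f / w.
coeffs = np.polynomial.polynomial.polyfit(x, f / w, deg=8)
approx = w * np.polynomial.polynomial.polyval(x, coeffs)
print(f"max abs error: {np.abs(approx - f).max():.2e}")
```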
- Random Sparse Lifts: Construction, Analysis and Convergence of finite sparse networks [17.487761710665968]
We present a framework to define a large class of neural networks for which, by construction, training by gradient flow provably reaches arbitrarily low loss when the number of parameters grows.
arXiv Detail & Related papers (2025-01-10T12:52:00Z)
- Simple Multigraph Convolution Networks [49.19906483875984]
Existing multigraph convolution methods either ignore the cross-view interaction among multiple graphs, or induce extremely high computational cost due to standard cross-view operators.
This paper proposes Simple Multigraph Convolution Networks (SMGCN), which first extract consistent cross-view topology from multigraphs, including edge-level and subgraph-level topology, and then perform expansion based on the raw multigraphs and the consistent topologies.
In theory, SMGCN uses the consistent topologies in the expansion rather than standard cross-view expansion, which yields credible cross-view spatial message-passing and effectively reduces the complexity of standard expansion.
arXiv Detail & Related papers (2024-03-08T03:27:58Z)
- Polynomial-Time Solutions for ReLU Network Training: A Complexity Classification via Max-Cut and Zonotopes [70.52097560486683]
We prove that the hardness of approximation of ReLU networks not only mirrors the complexity of the Max-Cut problem but also, in certain special cases, exactly corresponds to it.
In particular, when $\epsilon \leq \sqrt{84/83} - 1 \approx 0.006$, we show that it is NP-hard to find an approximate global optimum of the ReLU network objective with relative error $\epsilon$ with respect to the objective value.
arXiv Detail & Related papers (2023-11-18T04:41:07Z)
- Revisiting Tropical Polynomial Division: Theory, Algorithms and Application to Neural Networks [40.137069931650444]
Tropical geometry has recently found several applications in the analysis of neural networks with piecewise linear activation functions.
This paper presents a new look at the problem of tropical polynomial division and its application to the simplification of neural networks. (A minimal max-plus sketch follows this entry.)
arXiv Detail & Related papers (2023-06-27T02:26:07Z)
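For readers new to the tropical viewpoint, a minimal max-plus sketch (definitions only, not the paper's division algorithm; the coefficients below are arbitrary):

```python
# A tropical polynomial is a maximum of affine functions, exactly the kind of
# piecewise-linear convex function that appears in ReLU network analysis.
import numpy as np

def tropical_poly(x, coeffs, exponents):
    # Max-plus evaluation: max_i (coeffs[i] + exponents[i] * x).
    return np.max(coeffs[:, None] + exponents[:, None] * x[None, :], axis=0)

x = np.linspace(-2.0, 2.0, 5)
c = np.array([0.0, 1.0, -0.5])   # tropical coefficients (arbitrary)
e = np.array([0.0, 1.0, 2.0])    # tropical exponents (slopes)
print(tropical_poly(x, c, e))    # convex piecewise-linear values

# The ReLU itself is the tropical binomial max(0, x).
print(np.maximum(0.0, x))
```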
- GloptiNets: Scalable Non-Convex Optimization with Certificates [61.50835040805378]
We present a novel approach to non-convex optimization with certificates, which handles smooth functions on the hypercube or on the torus.
By exploiting the regularity of the target function, intrinsic in the decay of its Fourier spectrum, we obtain precise certificates while leveraging the advanced and powerful computational techniques developed to optimize neural networks.
arXiv Detail & Related papers (2023-06-26T09:42:59Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable parts separately, linearizing only the smooth ones. (The classical conditional-gradient template is sketched below.)
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
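The classical conditional-gradient template that "linearizing only the smooth parts" builds on, sketched for a smooth objective over the probability simplex. This is the textbook method, not the paper's fully composite algorithm, and the data are random.

```python
# Frank-Wolfe on the probability simplex: each step minimizes a linear model
# of the smooth term over the feasible set (the linear minimization oracle).
import numpy as np

def frank_wolfe_simplex(grad, x0, steps=100):
    x = x0.copy()
    for t in range(steps):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0            # LMO on the simplex: a vertex
        x += 2.0 / (t + 2.0) * (s - x)   # standard step size
    return x

# Example: minimize ||Ax - b||^2 over the simplex (arbitrary random data).
rng = np.random.default_rng(0)
A, b = rng.standard_normal((30, 10)), rng.standard_normal(30)
x = frank_wolfe_simplex(lambda x: 2 * A.T @ (A @ x - b), np.full(10, 0.1))
print(f"objective: {np.sum((A @ x - b) ** 2):.4f}")
```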
- State polynomials: positivity, optimization and nonlinear Bell inequalities [3.9692590090301683]
This paper introduces state polynomials, i.e., polynomials in noncommuting variables and formal states of their products.
It shows that state polynomials positive over all states and matricial states are sums of squares with denominators.
It is also established that the Krivine-Stengle Positivstellensatz fails to hold in the state polynomial setting.
arXiv Detail & Related papers (2023-01-29T18:52:21Z)
- Sparse Polynomial Optimization: Theory and Practice [5.27013884159732]
This book presents several efforts to tackle this challenge, with important scientific implications.
It provides alternative optimization schemes that scale well in terms of computational complexity.
We present sparsity-exploiting hierarchies of relaxations, for either unconstrained or constrained problems.
arXiv Detail & Related papers (2022-08-23T18:56:05Z)
- Sum-of-Squares Relaxations for Information Theory and Variational Inference [0.0]
We consider extensions of the Shannon relative entropy, referred to as $f$-divergences.
We derive a sequence of convex relaxations for computing these divergences.
We provide more efficient relaxations based on spectral information divergences from quantum information theory. (The standard $f$-divergence definition is recalled below.)
arXiv Detail & Related papers (2022-06-27T13:22:40Z)
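For reference, the standard definition behind this entry (textbook material, not the paper's relaxations):

```latex
% Standard f-divergence between densities p and q, for convex f with f(1)=0;
% f(t) = t log t recovers the Shannon relative entropy (KL divergence).
\[
  D_f(p \,\|\, q) = \int f\!\left(\frac{p(x)}{q(x)}\right) q(x)\, \mathrm{d}x,
  \qquad
  f(t) = t \log t \;\Longrightarrow\; D_f(p \,\|\, q) = \mathrm{KL}(p \,\|\, q).
\]
```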
- Optimal Gradient-based Algorithms for Non-concave Bandit Optimization [76.57464214864756]
This work considers a large family of bandit problems where the unknown underlying reward function is non-concave.
Our algorithms are based on a unified zeroth-order optimization paradigm that applies in great generality.
We show that the standard optimistic algorithms are sub-optimal by dimension factors.
arXiv Detail & Related papers (2021-07-09T16:04:24Z)
- A block-sparse Tensor Train Format for sample-efficient high-dimensional Polynomial Regression [0.0]
Low-rank tensors are an established framework for high-dimensional problems.
We propose to extend this framework by including the concept of block-sparsity.
This allows us to adapt the ansatz space to align better with known sample complexity results.
arXiv Detail & Related papers (2021-04-29T10:57:53Z)
- Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation. (A minimal TD(0) sketch follows this entry.)
arXiv Detail & Related papers (2020-12-09T20:19:32Z)
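A minimal TD(0) loop with linear function approximation, the setting this entry analyzes. The Markov chain, features, and step size are arbitrary toy choices, not the paper's construction.

```python
# TD(0) with linear function approximation: the value function is searched
# over the low-dimensional subspace spanned by the feature map Phi.
import numpy as np

rng = np.random.default_rng(0)
n_states, dim, gamma, alpha = 10, 3, 0.9, 0.05
P = rng.dirichlet(np.ones(n_states), size=n_states)   # random transition matrix
r = rng.standard_normal(n_states)                      # per-state rewards
Phi = rng.standard_normal((n_states, dim))             # feature map

theta, s = np.zeros(dim), 0
for _ in range(20000):
    s_next = rng.choice(n_states, p=P[s])
    td_error = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    theta += alpha * td_error * Phi[s]                 # stochastic TD update
    s = s_next
print("approximate value function:", Phi @ theta)
```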
- Curvature-Dependant Global Convergence Rates for Optimization on Manifolds of Bounded Geometry [6.85316573653194]
We give curvature-dependant convergence rates for weakly convex functions defined on a manifold of 1-bounded geometry.
We compute these bounds explicitly for some manifolds commonly used in the optimization literature.
We present self-contained proofs of fully general bounds on the norm of the differential of the exponential map.
arXiv Detail & Related papers (2020-08-06T08:30:35Z)
- AMITE: A Novel Polynomial Expansion for Analyzing Neural Network Nonlinearities [1.8761314918771685]
Polynomial expansions are important in the analysis of neural network nonlinearities.
Existing approaches span classical Taylor and Chebyshev methods.
However, no existing approach provides a consistent method that yields an expansion with all of the desired properties. (A classical Chebyshev fit is sketched below for contrast.)
arXiv Detail & Related papers (2020-07-13T07:58:47Z)
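For contrast, the classical Chebyshev approach this entry mentions, applied to a common nonlinearity. The degree and interval are arbitrary choices; AMITE itself is not implemented here.

```python
# Chebyshev least-squares expansion of tanh, one of the existing baselines
# the AMITE entry contrasts with.
import numpy as np
from numpy.polynomial import Chebyshev

x = np.linspace(-4.0, 4.0, 1000)
cheb = Chebyshev.fit(x, np.tanh(x), deg=9, domain=[-4, 4])
print(f"max abs error of degree-9 fit: {np.abs(cheb(x) - np.tanh(x)).max():.2e}")
```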
- Inexact and Stochastic Generalized Conditional Gradient with Augmented Lagrangian and Proximal Step [2.0196229393131726]
We analyze inexact and stochastic versions of the CGALP algorithm developed in the authors' previous paper.
This allows one to compute some gradients, proximal terms, and/or linear minimization oracles in an inexact fashion.
We show convergence of the Lagrangian to an optimum and feasibility of the affine constraint.
arXiv Detail & Related papers (2020-05-11T14:52:16Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
- A refinement of Reznick's Positivstellensatz with applications to quantum information theory [72.8349503901712]
In his solution to Hilbert's 17th problem, Artin showed that any positive definite polynomial in several variables can be written as the quotient of two sums of squares.
Reznick showed that the denominator in Artin's result can always be chosen as an $N$-th power of the squared norm of the variables. (The statement is recalled in symbols below.)
arXiv Detail & Related papers (2019-09-04T11:46:26Z)
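In symbols, the classical statement this entry refines, in its standard form for positive definite forms (the paper's refinement, which bounds $N$, is not reproduced here):

```latex
% Reznick's Positivstellensatz: for a positive definite form p on R^n there
% is an N such that the product below is a sum of squares of polynomials q_i,
\[
  \left( x_1^2 + \cdots + x_n^2 \right)^{N} \, p(x_1, \dots, x_n)
  \;=\; \sum_{i} q_i(x_1, \dots, x_n)^2 ,
\]
% exhibiting p as a quotient of two sums of squares, as in Artin's theorem.
```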
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the list (including all information) and is not responsible for any consequences.