Covering Number of Real Algebraic Varieties and Beyond: Improved Bounds and Applications
- URL: http://arxiv.org/abs/2311.05116v3
- Date: Sat, 11 May 2024 23:25:04 GMT
- Title: Covering Number of Real Algebraic Varieties and Beyond: Improved Bounds and Applications
- Authors: Yifan Zhang, Joe Kileel
- Abstract summary: We prove upper bounds on the covering numbers of sets in Euclidean space.
We show that our bounds improve the best known general bound of Yomdin-Comte.
We illustrate the power of the results in three computational applications.
- Score: 8.438718130535296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Covering numbers are a powerful tool used in the development of approximation algorithms, randomized dimension reduction methods, smoothed complexity analysis, and other areas. In this paper we prove upper bounds on the covering numbers of numerous sets in Euclidean space, namely real algebraic varieties, images of polynomial maps, and semialgebraic sets, in terms of the number of variables and the degrees of the polynomials involved. The bounds remarkably improve the best known general bound by Yomdin-Comte, and our proof is much more straightforward. In particular, our result gives new bounds on the volume of the tubular neighborhood of the image of a polynomial map and of a semialgebraic set, where results for varieties by Lotz and Basu-Lerario are not directly applicable. We illustrate the power of the result in three computational applications. Firstly, we derive a near-optimal bound on the covering number of low-rank CP tensors, quantifying their approximation properties and filling in an important missing piece of theory for tensor dimension reduction and reconstruction. Secondly, we prove a bound on the required dimension for the randomized sketching of polynomial optimization problems, which controls how much computation can be saved through randomization without sacrificing solution quality. Finally, we deduce generalization error bounds for deep neural networks with rational or ReLU activation functions, improving or matching the best known results in the machine learning literature while helping to quantify the impact of architecture choice on generalization error.
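For context, the $\varepsilon$-covering number of a set $S$ is the minimum number of $\varepsilon$-balls needed to cover it. A standard volumetric bound for the unit ball $B^n$, and the generic shape of intrinsic-dimension bounds of Yomdin-Comte type for a $k$-dimensional variety $V \subseteq B^n$ cut out by polynomials of degree at most $d$, can be sketched as follows (an illustrative form, not the paper's exact statement):
$$\mathcal{N}(S,\varepsilon) = \min\Big\{ m : S \subseteq \bigcup_{i=1}^{m} B(x_i,\varepsilon) \Big\}, \qquad \mathcal{N}(B^n,\varepsilon) \le \Big(1 + \frac{2}{\varepsilon}\Big)^{n}, \qquad \mathcal{N}(V,\varepsilon) \le C(n,d)\Big(\frac{1}{\varepsilon}\Big)^{k}.$$
Improving such bounds means shrinking the constant $C(n,d)$ while keeping the exponent at the intrinsic dimension $k$ rather than the ambient dimension $n$.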
Related papers
- Polynomial-Time Solutions for ReLU Network Training: A Complexity
Classification via Max-Cut and Zonotopes [70.52097560486683]
We prove that the hardness of approximation of ReLU networks not only mirrors the complexity of the Max-Cut problem but also, in certain special cases, exactly corresponds to it.
In particular, when $\epsilon \leq \sqrt{84/83} - 1 \approx 0.006$, we show that it is NP-hard to find an approximate global optimizer of the ReLU network objective with relative error $\epsilon$ with respect to the objective value; the threshold constant is checked numerically below.
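A quick sanity check of the quoted hardness threshold (plain arithmetic, no assumptions beyond the constant itself):
```python
import math

# Verify the NP-hardness threshold quoted above: sqrt(84/83) - 1 ~ 0.006
eps = math.sqrt(84 / 83) - 1
print(f"{eps:.6f}")  # prints 0.006006
```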
arXiv Detail & Related papers (2023-11-18T04:41:07Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth parts; a generic step of this kind is sketched below.
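A minimal Frank-Wolfe-style illustration of the linearization idea (a sketch under assumed names: `grad_f`, `subproblem_solver`, and the composite objective are hypothetical stand-ins, not the paper's algorithm):
```python
# Problem sketch: min_{x in C} f(x) + g(x), with f smooth and g non-differentiable.
# Only f is linearized; g stays exact inside the subproblem.
def linearized_step(x, grad_f, subproblem_solver, t):
    # s = argmin_{s in C} <grad_f(x), s> + g(s)   (g is NOT linearized)
    s = subproblem_solver(grad_f(x))
    gamma = 2.0 / (t + 2.0)  # standard open-loop Frank-Wolfe step size
    return (1.0 - gamma) * x + gamma * s
```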
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Sparse Polynomial Optimization: Theory and Practice [5.27013884159732]
The book presents several efforts to tackle this challenge, with important scientific implications.
It provides alternative optimization schemes that scale well in terms of computational complexity.
We present sparsity-exploiting hierarchies of relaxations for both unconstrained and constrained problems; the dense hierarchy these variants sparsify is recalled below.
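For reference, the dense moment-SOS (Lasserre) hierarchy computes, for a polynomial $f$ and each relaxation order $d$, the lower bound
$$\lambda_d = \max\big\{ \lambda \;:\; f - \lambda = \sigma_0 + \textstyle\sum_i \sigma_i g_i, \ \sigma_i \in \Sigma, \ \deg(\sigma_i g_i) \le 2d \big\} \;\le\; \min_{x \in K} f(x),$$
where $K = \{x : g_i(x) \ge 0\}$ and $\Sigma$ is the cone of sums of squares; sparse versions restrict each $\sigma_i$ to small variable cliques (a standard form, stated here for orientation rather than as the book's exact definition).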
arXiv Detail & Related papers (2022-08-23T18:56:05Z) - Sum-of-Squares Relaxations for Information Theory and Variational
Inference [0.0]
We consider extensions of the Shannon relative entropy, referred to as $f$-divergences.
We derive a sequence of convex relaxations for computing these divergences.
We provide more efficient relaxations based on spectral information divergences from quantum information theory.
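For reference, the $f$-divergences in question take the standard form, for convex $f$ with $f(1) = 0$:
$$D_f(p \,\|\, q) = \int f\!\Big(\frac{dp}{dq}\Big)\, dq, \qquad f(t) = t \log t \;\Rightarrow\; D_f(p \,\|\, q) = \mathrm{KL}(p \,\|\, q).$$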
arXiv Detail & Related papers (2022-06-27T13:22:40Z) - Optimal Gradient-based Algorithms for Non-concave Bandit Optimization [76.57464214864756]
This work considers a large family of bandit problems where the unknown underlying reward function is non-concave.
Our algorithms are based on a unified zeroth-order optimization paradigm that applies in great generality.
We show that the standard optimistic algorithms are sub-optimal by dimension factors.
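A typical building block of such zeroth-order schemes is a two-point random finite-difference gradient estimate; a generic sketch (standard technique, not the paper's specific estimator):
```python
import numpy as np

def two_point_gradient_estimate(f, x, delta=1e-4, rng=None):
    """Estimate the gradient of f at x from two evaluations along a random direction."""
    if rng is None:
        rng = np.random.default_rng()
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)  # uniform direction on the unit sphere
    # Directional difference, rescaled by dimension to approximate the full gradient
    return x.size * (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u
```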
arXiv Detail & Related papers (2021-07-09T16:04:24Z) - A block-sparse Tensor Train Format for sample-efficient high-dimensional
Polynomial Regression [0.0]
Low-rank tensors are an established framework for high-dimensional problems.
We propose to extend this framework by including the concept of block-sparsity.
This allows us to adapt the ansatz space to align better with known sample complexity results; the plain tensor-train format being extended is recalled below.
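For orientation, the plain tensor-train (TT) representation that block-sparsity refines writes a $d$-way tensor entrywise as a product of small matrices:
$$T(i_1, \ldots, i_d) = G_1[i_1]\, G_2[i_2] \cdots G_d[i_d], \qquad G_k[i_k] \in \mathbb{R}^{r_{k-1} \times r_k}, \quad r_0 = r_d = 1,$$
so storage scales linearly in $d$ for bounded TT ranks $r_k$ (a standard definition, not the paper's block-sparse variant).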
arXiv Detail & Related papers (2021-04-29T10:57:53Z) - Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
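For concreteness, the simplest instance of the temporal-difference methods referenced here is TD(0) with linear function approximation $V(s) \approx \phi(s)^\top \theta$; a textbook sketch, not the paper's estimator:
```python
def td0_update(theta, phi_s, reward, phi_s_next, alpha=0.01, gamma=0.99):
    """One TD(0) step for linear value estimation V(s) ~ phi(s) . theta."""
    td_error = reward + gamma * phi_s_next @ theta - phi_s @ theta
    return theta + alpha * td_error * phi_s
```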
arXiv Detail & Related papers (2020-12-09T20:19:32Z) - Inexact and Stochastic Generalized Conditional Gradient with Augmented
Lagrangian and Proximal Step [2.0196229393131726]
We analyze inexact and stochastic versions of the CGALP algorithm developed in the authors' previous paper.
This allows one to compute some gradients, proximal terms, and/or linear minimization oracles in an inexact fashion.
We show convergence of the Lagrangian to an optimum and feasibility of the affine constraint.
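For reference, the augmented Lagrangian underlying such schemes, for a problem $\min_x f(x) + g(x)$ subject to $Ax = b$, is the standard
$$\mathcal{L}_\rho(x, \lambda) = f(x) + g(x) + \langle \lambda, Ax - b \rangle + \frac{\rho}{2}\, \|Ax - b\|^2,$$
whose value the iterates drive to the optimum while the affine constraint $Ax = b$ becomes feasible in the limit (a generic form, not the paper's exact setup).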
arXiv Detail & Related papers (2020-05-11T14:52:16Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
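The basic sketch-and-solve template behind such randomized least-squares methods, shown with a Gaussian embedding (a minimal sketch of one ingredient; the paper also analyzes subsampled Hadamard transforms and iterative refinements):
```python
import numpy as np

def sketched_least_squares(A, b, m, rng=None):
    """Approximately solve min ||Ax - b||_2 via an m-row Gaussian sketch."""
    if rng is None:
        rng = np.random.default_rng()
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)  # Gaussian embedding
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)      # solve the small problem
    return x
```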
arXiv Detail & Related papers (2020-02-21T17:45:32Z) - A refinement of Reznick's Positivstellensatz with applications to
quantum information theory [72.8349503901712]
In his solution of Hilbert's 17th problem, Artin showed that any positive semidefinite polynomial in several variables can be written as the quotient of two sums of squares.
Reznick showed that the denominator in Artin's result can always be chosen as an $N$-th power of the squared norm of the variables.
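Stated compactly (standard forms of both results; Reznick's refinement applies to positive definite forms $p$):
$$p = \frac{\sigma_1}{\sigma_2} \ \text{ with } \sigma_1, \sigma_2 \in \Sigma, \qquad \text{and} \qquad \|x\|_2^{2N}\, p \in \Sigma \ \text{ for some } N \in \mathbb{N},$$
where $\Sigma$ denotes the cone of sums of squares of polynomials.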
arXiv Detail & Related papers (2019-09-04T11:46:26Z)