Quantum and classical low-degree learning via a dimension-free Remez
inequality
- URL: http://arxiv.org/abs/2301.01438v3
- Date: Mon, 18 Dec 2023 14:53:26 GMT
- Title: Quantum and classical low-degree learning via a dimension-free Remez
inequality
- Authors: Ohad Klein, Joseph Slote, Alexander Volberg, Haonan Zhang
- Abstract summary: We present a new way to relate functions on the hypergrid to their harmonic extensions over the polytorus.
We show that the supremum of a function $f$ over products of the cyclic group $\{\exp(2\pi i k/K)\}_{k=1}^K$ controls its supremum over the entire polytorus.
We extend to new spaces a recent line of work \cite{EI22, CHP, VZ22} that gave similarly efficient methods for learning low-degree polynomials on the hypercube and observables on qubits.
- Score: 52.12931955662553
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent efforts in Analysis of Boolean Functions aim to extend core results to
new spaces, including to the slice $\binom{[n]}{k}$, the hypergrid $[K]^n$, and
noncommutative spaces (matrix algebras). We present here a new way to relate
functions on the hypergrid (or products of cyclic groups) to their harmonic
extensions over the polytorus. We show the supremum of a function $f$ over
products of the cyclic group $\{\exp(2\pi i k/K)\}_{k=1}^K$ controls the
supremum of $f$ over the entire polytorus $(\{z\in\mathbf{C}:|z|=1\}^n)$, with
multiplicative constant $C$ depending on $K$ and $\text{deg}(f)$ only. This
Remez-type inequality appears to be the first such estimate that is
dimension-free (i.e., $C$ does not depend on $n$).
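This inequality can be sanity-checked numerically. The following is a minimal sketch (our illustration, not code from the paper): it compares the supremum of a random low-degree polynomial over points of the grid $\{\exp(2\pi i k/K)\}^n$ with its supremum over random points of the polytorus. Random sampling only lower-bounds each supremum, so the printed ratio is a crude empirical proxy for the constant $C$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, K = 8, 2, 4           # number of variables, total degree, cyclic order
m = 20000                   # random points used to approximate each supremum

# A random low-degree polynomial: a sum of monomials z^alpha with |alpha| <= d.
monomials = []
for _ in range(15):
    alpha = np.zeros(n, dtype=int)
    for _ in range(rng.integers(1, d + 1)):
        alpha[rng.integers(n)] += 1
    monomials.append(alpha)
coeffs = rng.standard_normal(len(monomials)) + 1j * rng.standard_normal(len(monomials))

def f(z):                   # z: (..., n) complex array with entries on the unit circle
    return sum(c * np.prod(z ** a, axis=-1) for c, a in zip(coeffs, monomials))

# Supremum over (randomly sampled) points of the grid of K-th roots of unity.
grid = np.exp(2j * np.pi * rng.integers(0, K, size=(m, n)) / K)
sup_grid = np.abs(f(grid)).max()

# Supremum over (randomly sampled) points of the full polytorus.
torus = np.exp(2j * np.pi * rng.random((m, n)))
sup_torus = np.abs(f(torus)).max()

print(f"sup over grid   ~ {sup_grid:.3f}")
print(f"sup over torus  ~ {sup_torus:.3f}")
print(f"ratio (proxy for C) ~ {sup_torus / sup_grid:.3f}")
```

The dimension-free claim predicts that this ratio stays bounded as $n$ grows with $K$ and $d$ held fixed.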
This dimension-free Remez-type inequality removes the main technical barrier
to giving $\mathcal{O}(\log n)$ sample complexity, polytime algorithms for
learning low-degree polynomials on the hypergrid and low-degree observables on
level-$K$ qudit systems. In particular, our dimension-free Remez inequality
implies new Bohnenblust--Hille-type estimates which are central to the learning
algorithms and appear unobtainable via standard techniques. Thus we extend to
new spaces a recent line of work \cite{EI22, CHP, VZ22} that gave similarly
efficient methods for learning low-degree polynomials on the hypercube and
observables on qubits.
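To make the learning pipeline concrete, here is a toy sketch (our illustration under simplifying assumptions, not the paper's algorithm) of low-degree learning on the hypergrid: estimate each low-degree Fourier coefficient by the empirical mean $\hat f(\alpha) \approx \frac{1}{m}\sum_i y_i\,\overline{z_i^\alpha}$ over uniform grid samples, then keep only the large coefficients. The brute-force enumeration over exponents below is exponential in $n$ and is purely illustrative; the role of the Bohnenblust--Hille-type estimates is precisely to show that thresholding a small number of large coefficients suffices, which is what yields the $\mathcal{O}(\log n)$ sample complexity.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, d, K = 6, 2, 3
omega = np.exp(2j * np.pi / K)

# Unknown target: a sparse degree-d polynomial on the grid Omega_K^n
# (coefficients chosen arbitrarily for the demo).
true_terms = {(0, 1, 0, 0, 0, 0): 1.0,
              (1, 0, 0, 1, 0, 0): -0.5 + 0.5j}

def target(z):
    return sum(c * np.prod(z ** np.array(a), axis=-1) for a, c in true_terms.items())

# Draw uniform samples from the grid and query the target.
m = 4000
z = omega ** rng.integers(0, K, size=(m, n))
y = target(z)

# Characters z^alpha are orthonormal under the uniform grid distribution,
# so the empirical mean below is an unbiased estimate of f_hat(alpha).
learned = {}
for alpha in itertools.product(range(K), repeat=n):
    if sum(alpha) > d:
        continue
    est = np.mean(y * np.conj(np.prod(z ** np.array(alpha), axis=-1)))
    if abs(est) > 0.1:      # threshold away small (noise-level) estimates
        learned[alpha] = est

print(learned)              # recovers the two true terms up to sampling noise
```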
An additional product of these efforts is a new class of distributions over
which arbitrary quantum observables are well-approximated by their low-degree
truncations -- a phenomenon that greatly extends the reach of low-degree
learning in quantum science \cite{CHP}.
Related papers
- Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*(\langle \boldsymbol{x}, \boldsymbol{\theta} \rangle)$ under isotropic Gaussian data.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with an arbitrary link function, with a sample and runtime complexity of $n \asymp T \asymp C(q) \cdot d\,\mathrm{polylog}\,d$.
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - On the Sample Complexity of Two-Layer Networks: Lipschitz vs.
Element-Wise Lipschitz Activation [20.70453775428433]
We investigate the sample complexity of bounded two-layer neural networks using different activation functions.
We prove that if $\sigma$ is element-wise, then the sample complexity of $\mathcal{H}$ depends only logarithmically on the width.
arXiv Detail & Related papers (2022-11-17T16:27:15Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study the learning of a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C \cdot \mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Small Covers for Near-Zero Sets of Polynomials and Learning Latent
Variable Models [56.98280399449707]
We show that there exists an $\epsilon$-cover for $S$ of cardinality $M = (k/\epsilon)^{O_d(k^{1/d})}$.
Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables.
arXiv Detail & Related papers (2020-12-14T18:14:08Z) - Emergent universality in critical quantum spin chains: entanglement
Virasoro algebra [1.9336815376402714]
Entanglement entropy and entanglement spectrum have been widely used to characterize quantum entanglement in extended many-body systems.
We show that the Schmidt vectors $|v_\alpha\rangle$ display an emergent universal structure, corresponding to a realization of the Virasoro algebra of a boundary CFT.
arXiv Detail & Related papers (2020-09-23T21:22:51Z) - Classical Dynamics from Self-Consistency Equations in Quantum Mechanics
-- Extended Version [0.0]
We propose a new mathematical approach to Bona's non-linear generalization of quantum mechanics.
It highlights the central role of self-consistency.
Some new mathematical concepts are introduced, which are possibly interesting by themselves.
arXiv Detail & Related papers (2020-09-10T16:20:25Z) - Q-learning with Uniformly Bounded Variance: Large Discounting is Not a
Barrier to Fast Learning [7.6146285961466]
We introduce a new class of algorithms that have sample complexity uniformly bounded for all $\gamma < 1$.
We show that the covariance of the Q-learning algorithm with an optimized step-size sequence is a quadratic function of $1/(1-\gamma)$; an expected, and essentially known, result (a generic Q-learning loop is sketched after this list for context).
arXiv Detail & Related papers (2020-02-24T15:12:41Z)
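For context on this last entry, the following is a generic tabular Q-learning loop with a polynomial step-size sequence. This is a standard textbook construction, not the optimized algorithm of the paper; it is only meant to fix what the discount factor $\gamma$ and a "step-size sequence" refer to.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy 2-state, 2-action MDP (transition kernel P[s, a, s'] and rewards R[s, a]).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0], [0.0, 2.0]])
gamma = 0.99                               # discount factor, gamma < 1

Q = np.zeros((2, 2))
s = 0
for t in range(1, 200_000):
    a = rng.integers(2)                    # behavior policy: uniform exploration
    s_next = rng.choice(2, p=P[s, a])
    alpha = 1.0 / t ** 0.85                # a polynomial step-size sequence
    td_error = R[s, a] + gamma * Q[s_next].max() - Q[s, a]
    Q[s, a] += alpha * td_error
    s = s_next

print(np.round(Q, 3))                      # approximate optimal Q-values
```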