Learning Polynomial Transformations
- URL: http://arxiv.org/abs/2204.04209v1
- Date: Fri, 8 Apr 2022 17:59:31 GMT
- Title: Learning Polynomial Transformations
- Authors: Sitan Chen, Jerry Li, Yuanzhi Li, Anru R. Zhang
- Abstract summary: We consider the problem of learning high dimensional quadratic transformations of Gaussians.
Our results extend to any rotation-invariant input distribution, not just Gaussian.
We also give the first polynomial-time algorithms with provable guarantees for tensor ring decomposition.
- Score: 41.30404402331856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of learning high dimensional polynomial
transformations of Gaussians. Given samples of the form $p(x)$, where $x\sim
N(0, \mathrm{Id}_r)$ is hidden and $p: \mathbb{R}^r \to \mathbb{R}^d$ is a
function where every output coordinate is a low-degree polynomial, the goal is
to learn the distribution over $p(x)$. This problem is natural in its own
right, but is also an important special case of learning deep generative
models, namely pushforwards of Gaussians under two-layer neural networks with
polynomial activations. Understanding the learnability of such generative
models is crucial to understanding why they perform so well in practice.
Our first main result is a polynomial-time algorithm for learning quadratic
transformations of Gaussians in a smoothed setting. Our second main result is a
polynomial-time algorithm for learning constant-degree polynomial
transformations of Gaussians in a smoothed setting, when the rank of the
associated tensors is small. In fact our results extend to any
rotation-invariant input distribution, not just Gaussian. These are the first
end-to-end guarantees for learning a pushforward under a neural network with
more than one layer.
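To make the sampling model concrete, here is a minimal sketch (Python/NumPy, with illustrative dimensions and randomly drawn coefficients, not the paper's algorithm) of drawing samples from a quadratic transformation of a Gaussian; the learner observes only the outputs $y = p(x)$, never the latent $x$.

```python
import numpy as np

rng = np.random.default_rng(0)
r, d = 5, 20          # latent and observed dimensions (illustrative values)

# A quadratic transformation p: R^r -> R^d. Each output coordinate is
# x^T A_i x + b_i^T x + c_i, i.e. a degree-2 polynomial of the latent x.
A = rng.normal(size=(d, r, r))          # quadratic coefficients
A = (A + A.transpose(0, 2, 1)) / 2      # symmetrize each A_i
b = rng.normal(size=(d, r))             # linear coefficients
c = rng.normal(size=d)                  # constant terms

def sample_pushforward(n):
    """Draw n samples of p(x) with hidden x ~ N(0, Id_r)."""
    x = rng.normal(size=(n, r))                   # latent Gaussians (never observed)
    quad = np.einsum('ni,dij,nj->nd', x, A, x)    # x^T A_i x for each output coordinate i
    lin = x @ b.T                                 # b_i^T x
    return quad + lin + c                         # observed samples y = p(x)

y = sample_pushforward(1000)   # the learner's input: samples of p(x) only
```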
Along the way, we also give the first polynomial-time algorithms with
provable guarantees for tensor ring decomposition, a popular generalization of
tensor decomposition that is used in practice to implicitly store large
tensors.
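For context, a tensor ring decomposition stores an order-$N$ tensor as a cyclic chain of third-order cores, with each entry given by the trace of a product of core slices: $T[i_1,\dots,i_N] = \mathrm{Tr}(G_1[:,i_1,:]\,G_2[:,i_2,:]\cdots G_N[:,i_N,:])$. The sketch below (illustrative shapes and random cores, not the paper's decomposition algorithm) reconstructs a small tensor from ring cores to show how the format implicitly stores a large tensor with few parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
dims = (4, 5, 6)        # mode sizes of an order-3 tensor (illustrative)
ranks = (3, 3, 3)       # TR-ranks r_1, ..., r_N (cyclic: core k is r_k x n_k x r_{k+1})

# Random TR cores G_k of shape (r_k, n_k, r_{k+1 mod N}).
cores = [rng.normal(size=(ranks[k], dims[k], ranks[(k + 1) % len(dims)]))
         for k in range(len(dims))]

def tr_reconstruct(cores):
    """Rebuild the full tensor: T[i1,...,iN] = Tr(G_1[:,i1,:] @ ... @ G_N[:,iN,:])."""
    shape = tuple(g.shape[1] for g in cores)
    T = np.empty(shape)
    for idx in np.ndindex(*shape):
        M = np.eye(cores[0].shape[0])
        for k, i in enumerate(idx):
            M = M @ cores[k][:, i, :]
        T[idx] = np.trace(M)
    return T

T = tr_reconstruct(cores)   # a (4, 5, 6) tensor stored implicitly by small cores
```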
Related papers
- Learning Mixtures of Gaussians Using Diffusion Models [9.118706387430883]
We give a new algorithm for learning mixtures of $k$ Gaussians to TV error.
Our approach is analytic and relies on the framework of diffusion models.
arXiv Detail & Related papers (2024-04-29T17:00:20Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Learning Narrow One-Hidden-Layer ReLU Networks [30.63086568341548]
We give the first polynomial-time algorithm for this problem, which succeeds whenever $k$ is a constant.
We use a multi-scale analysis to argue that sufficiently close neurons can be collapsed together.
arXiv Detail & Related papers (2023-04-20T17:53:09Z) - Average-Case Complexity of Tensor Decomposition for Low-Degree
Polynomials [93.59919600451487]
"Statistical-computational gaps" occur in many statistical inference tasks.
We consider a model for random order-3 tensor decomposition where one component is slightly larger in norm than the rest.
We show that low-degree polynomials of the tensor entries can accurately estimate the largest component when $r \ll n^{3/2}$, but fail to do so when $r \gg n^{3/2}$.
arXiv Detail & Related papers (2022-11-10T00:40:37Z) - Efficient Algorithms for Learning Depth-2 Neural Networks with General
ReLU Activations [27.244958998196623]
We present time and sample efficient algorithms for learning an unknown depth-2 feedforward neural network with general ReLU activations.
In particular, we consider learning an unknown network of the form $f(x) = a^{\mathsf{T}}\sigma(W^{\mathsf{T}}x+b)$, where $x$ is drawn from the Gaussian distribution and $\sigma(t) := \max(t,0)$ is the ReLU activation (see the sketch after this list).
arXiv Detail & Related papers (2021-07-21T17:06:03Z) - Reconstruction Algorithms for Low-Rank Tensors and Depth-3 Multilinear
Circuits [4.129484350382891]
We give new and efficient black-box reconstruction algorithms for some classes of depth-$3$ arithmetic circuits.
Our algorithm works over all fields of characteristic $0$ or large enough characteristic.
arXiv Detail & Related papers (2021-05-04T20:45:07Z) - Small Covers for Near-Zero Sets of Polynomials and Learning Latent
Variable Models [56.98280399449707]
We show that there exists an $\epsilon$-cover for $S$ of cardinality $M = (k/\epsilon)^{O_d(k^{1/d})}$.
Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables.
arXiv Detail & Related papers (2020-12-14T18:14:08Z) - Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK [58.5766737343951]
We consider the dynamics of gradient descent for learning a two-layer neural network.
We show that an over-parametrized two-layer neural network trained with gradient descent can provably achieve loss close to that of the ground truth, going beyond the Neural Tangent Kernel regime.
arXiv Detail & Related papers (2020-07-09T07:09:28Z) - Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$-Nets are a new class of function approximators based on polynomial expansions.
$\Pi$-Nets produce state-of-the-art results in three challenging tasks, i.e. image generation, face verification and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z)
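As referenced in the depth-2 ReLU entry above, here is a minimal sketch of the network model being learned there, $f(x) = a^{\mathsf{T}}\sigma(W^{\mathsf{T}}x+b)$ with Gaussian inputs. The dimensions and parameter values below are illustrative assumptions, and this is only the forward model, not a learning algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)
d, k = 10, 4                      # input dimension and number of hidden units (illustrative)

W = rng.normal(size=(d, k))       # hidden-layer weights
b = rng.normal(size=k)            # hidden-layer biases
a = rng.normal(size=k)            # output-layer weights

def f(x):
    """Depth-2 ReLU network f(x) = a^T sigma(W^T x + b), with sigma(t) = max(t, 0)."""
    return a @ np.maximum(W.T @ x + b, 0.0)

x = rng.normal(size=d)            # inputs are drawn from the standard Gaussian
y = f(x)
```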