Beyond Moments: Robustly Learning Affine Transformations with
Asymptotically Optimal Error
- URL: http://arxiv.org/abs/2302.12289v1
- Date: Thu, 23 Feb 2023 19:13:30 GMT
- Title: Beyond Moments: Robustly Learning Affine Transformations with
Asymptotically Optimal Error
- Authors: He Jia, Pravesh K. Kothari, Santosh S. Vempala
- Abstract summary: We present a polynomial-time algorithm for learning an unknown affine transformation of the standard hypercube from samples.
Our algorithm is based on a new method that iteratively improves an estimate of the unknown affine transformation whenever the requirements of the certificate are not met.
- Score: 8.615625517708324
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a polynomial-time algorithm for robustly learning an unknown
affine transformation of the standard hypercube from samples, an important and
well-studied setting for independent component analysis (ICA). Specifically,
given an $\epsilon$-corrupted sample from a distribution $D$ obtained by
applying an unknown affine transformation $x \rightarrow Ax+s$ to the uniform
distribution on a $d$-dimensional hypercube $[-1,1]^d$, our algorithm
constructs $\hat{A}, \hat{s}$ such that the total variation distance of the
distribution $\hat{D}$ from $D$ is $O(\epsilon)$ using poly$(d)$ time and
samples. Total variation distance is the information-theoretically strongest
possible notion of distance in our setting and our recovery guarantees in this
distance are optimal up to the absolute constant factor multiplying $\epsilon$.
In particular, if the columns of $A$ are normalized to be unit length, our
total variation distance guarantee implies a bound on the sum of the $\ell_2$
distances between the column vectors of $A$ and $\hat{A}$: $\sum_{i=1}^d
\|a_i-\hat{a}_i\|_2 = O(\epsilon)$. In contrast, the strongest known prior
results only yield an $\epsilon^{O(1)}$ (relative) bound on the distance between
individual $a_i$'s and their estimates and translate into an $O(d\epsilon)$
bound on the total variation distance. Our key innovation is a new approach to
ICA (even to outlier-free ICA) that circumvents the difficulties in the
classical method of moments and instead relies on a new geometric certificate
of correctness of an affine transformation. Our algorithm is based on a new
method that iteratively improves an estimate of the unknown affine
transformation whenever the requirements of the certificate are not met.
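The abstract does not spell out the algorithm itself, so the Python sketch below (all names and constants in it are illustrative assumptions, not from the paper) only sets up the problem: it draws samples from an unknown affine image of the hypercube, lets an adversary replace an $\epsilon$-fraction, and scores a candidate $(\hat{A}, \hat{s})$ by the column-wise error $\sum_{i=1}^d \|a_i-\hat{a}_i\|_2$ that the total variation guarantee controls. The naive second-moment baseline at the end illustrates why the classical route is hard: the covariance determines only $AA^\top$, i.e., $A$ up to an orthogonal rotation.

```python
# Illustrative sketch only -- NOT the paper's algorithm. Names and constants
# (d, n, eps, the scale-10 outliers) are assumptions made for the demo.
import numpy as np

rng = np.random.default_rng(0)
d, n, eps = 5, 20_000, 0.05

# Unknown affine map x = A u + s applied to u ~ Uniform([-1, 1]^d).
A = rng.normal(size=(d, d))
A /= np.linalg.norm(A, axis=0)        # columns a_i normalized to unit length
s = rng.normal(size=d)

X = rng.uniform(-1.0, 1.0, size=(n, d)) @ A.T + s
k = int(eps * n)                      # adversary replaces an eps-fraction
X[:k] = rng.normal(scale=10.0, size=(k, d))

def column_error(A, A_hat):
    """Sum of l2 distances between corresponding columns; the paper's O(eps)
    TV guarantee bounds this after the best matching of columns up to signs
    and permutations (ignored in this toy check)."""
    return float(np.sum(np.linalg.norm(A - A_hat, axis=0)))

# Naive second-moment baseline: Cov(A u + s) = A A^T / 3, so a Cholesky factor
# of 3*Cov reproduces A A^T but pins down A only up to an orthogonal rotation.
A_hat = np.linalg.cholesky(3.0 * np.cov(X.T))
s_hat = np.median(X, axis=0)          # crude coordinate-wise shift estimate
print("column error:", column_error(A, A_hat))   # typically far from O(eps)
print("shift error :", float(np.linalg.norm(s - s_hat)))
```

An algorithm achieving the paper's guarantee would drive both printed errors to $O(\epsilon)$ after the appropriate column matching; the Cholesky baseline cannot, since it fixes an arbitrary rotation, and that identifiability gap is what the geometric certificate is designed to close.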
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators for distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - List-Decodable Covariance Estimation [1.9290392443571387]
We give the first polynomial-time algorithm for list-decodable covariance estimation.
Our result implies the first polynomial-time exact algorithm for list-decodable linear regression and subspace recovery.
arXiv Detail & Related papers (2022-06-22T09:38:06Z) - Adversarially Robust Learning with Tolerance [8.658596218544774]
We study the problem of tolerant adversarial PAC learning with respect to metric perturbation sets.
We show that a variant of the natural perturb-and-smooth algorithm PAC learns any hypothesis class $\mathcal{H}$ with VC dimension $v$ in the $\gamma$-tolerant adversarial setting.
We additionally propose an alternative learning method which yields sample bounds with only linear dependence on the doubling dimension.
arXiv Detail & Related papers (2022-03-02T03:50:16Z) - TURF: A Two-factor, Universal, Robust, Fast Distribution Learning
Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z) - Sampling from Log-Concave Distributions with Infinity-Distance
Guarantees and Applications to Differentially Private Optimization [33.38289436686841]
We present an algorithm that outputs a point from a distribution $O(\varepsilon)$-close to $\pi$ in infinity-distance.
We also present a "soft-threshold" version of the Dikin walk which may be of independent interest.
arXiv Detail & Related papers (2021-11-07T13:44:50Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Improved Sample Complexity for Incremental Autonomous Exploration in
MDPs [132.88757893161699]
We learn the set of $\epsilon$-optimal goal-conditioned policies attaining all states that are incrementally reachable within $L$ steps.
DisCo is the first algorithm that can return an $\epsilon/c_{\min}$-optimal policy for any cost-sensitive shortest-path problem.
arXiv Detail & Related papers (2020-12-29T14:06:09Z) - Small Covers for Near-Zero Sets of Polynomials and Learning Latent
Variable Models [56.98280399449707]
We show that there exists an $\epsilon$-cover for $S$ of cardinality $M = (k/\epsilon)^{O_d(k^{1/d})}$.
Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables.
arXiv Detail & Related papers (2020-12-14T18:14:08Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression, where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Efficient Distance Approximation for Structured High-Dimensional
Distributions via Learning [31.552581550603005]
We design efficient distance approximation algorithms for several classes of structured high-dimensional distributions.
Our results are the first efficient distance approximation algorithms for these well-studied problems.
arXiv Detail & Related papers (2020-02-13T07:42:06Z)