Reducibility and Statistical-Computational Gaps from Secret Leakage
- URL: http://arxiv.org/abs/2005.08099v2
- Date: Sun, 28 Jun 2020 21:59:41 GMT
- Title: Reducibility and Statistical-Computational Gaps from Secret Leakage
- Authors: Matthew Brennan, Guy Bresler
- Abstract summary: We show that a slight generalization of the planted clique conjecture -- secret leakage planted clique -- gives rise to a variety of new average-case reduction techniques.
Using variants of the planted clique conjecture for specific forms of secret leakage planted clique, we deduce tight statistical-computational tradeoffs.
- Score: 19.25775832101647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inference problems with conjectured statistical-computational gaps are
ubiquitous throughout modern statistics, computer science and statistical
physics. While there has been success evidencing these gaps from the failure of
restricted classes of algorithms, progress towards a more traditional
reduction-based approach to computational complexity in statistical inference
has been limited. Existing reductions have largely been limited to inference
problems with similar structure -- primarily mapping among problems
representable as a sparse submatrix signal plus a noise matrix, which are
similar to the common hardness assumption of planted clique.
The insight in this work is that a slight generalization of the planted
clique conjecture -- secret leakage planted clique -- gives rise to a variety
of new average-case reduction techniques, yielding a web of reductions among
problems with very different structure. Using variants of the planted clique
conjecture for specific forms of secret leakage planted clique, we deduce tight
statistical-computational tradeoffs for a diverse range of problems including
robust sparse mean estimation, mixtures of sparse linear regressions, robust
sparse linear regression, tensor PCA, variants of dense $k$-block stochastic
block models, negatively correlated sparse PCA, semirandom planted dense
subgraph, detection in hidden partition models and a universality principle for
learning sparse mixtures. In particular, a $k$-partite hypergraph variant of
the planted clique conjecture is sufficient to establish all of our
computational lower bounds. Our techniques also reveal novel connections to
combinatorial designs and to random matrix theory. This work gives the first
evidence that an expanded set of hardness assumptions, such as for secret
leakage planted clique, may be a key first step towards a more complete theory
of reductions among statistical problems.
Related papers
- Tensor cumulants for statistical inference on invariant distributions [49.80012009682584]
We show that PCA becomes computationally hard at a critical value of the signal's magnitude.
We define a new set of objects, which provide an explicit, near-orthogonal basis for invariants of a given degree.
It also lets us analyze a new problem of distinguishing between different ensembles.
arXiv Detail & Related papers (2024-04-29T14:33:24Z) - Fine-grained analysis of non-parametric estimation for pairwise learning [9.676007573960383]
We are concerned with the generalization performance of non-parametric estimation for pairwise learning.
Our results can be used to handle a wide range of pairwise learning problems including ranking, AUC, pairwise regression and metric and similarity learning.
arXiv Detail & Related papers (2023-05-31T08:13:14Z) - Privacy Induces Robustness: Information-Computation Gaps and Sparse Mean
Estimation [8.9598796481325]
We investigate the consequences of this observation for both algorithms and computational complexity across different statistical problems.
We establish an information-computation gap for private sparse mean estimation.
We also give evidence for privacy-induced information-computation gaps for several other statistics and learning problems.
arXiv Detail & Related papers (2022-11-01T20:03:41Z) - Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model textitdoes not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z) - Average-Case Communication Complexity of Statistical Problems [48.03761288397421]
Communication complexity is the main tool for proving lower bounds in streaming, sketching, and query-based models.
We provide a general reduction method that preserves the input distribution for problems involving a random graph or matrix with planted structure.
We derive two-party and multi-party communication lower bounds for detecting or finding planted cliques.
arXiv Detail & Related papers (2021-07-03T03:31:37Z) - Probabilistic Simplex Component Analysis [66.30587591100566]
PRISM is a probabilistic simplex component analysis approach to identifying the vertices of a data-circumscribing simplex from data.
The problem has a rich variety of applications, the most notable being hyperspectral unmixing in remote sensing and non-negative matrix factorization in machine learning.
arXiv Detail & Related papers (2021-03-18T05:39:00Z) - Statistical and computational thresholds for the planted $k$-densest
sub-hypergraph problem [13.808053718325635]
We consider the problem of recovery a planted $k$-densest sub-hypergraph on $d$-uniform hypergraphs.
This fundamental problem appears in different contexts, e.g., community detection, average-case complexity, and neuroscience applications.
arXiv Detail & Related papers (2020-11-23T16:02:12Z) - Learning, compression, and leakage: Minimising classification error via
meta-universal compression principles [87.054014983402]
A promising group of compression techniques for learning scenarios is normalised maximum likelihood (NML) coding.
Here we consider a NML-based decision strategy for supervised classification problems, and show that it attains PAC learning when applied to a wide variety of models.
We show that the misclassification rate of our method is upper bounded by the maximal leakage, a recently proposed metric to quantify the potential of data leakage in privacy-sensitive scenarios.
arXiv Detail & Related papers (2020-10-14T20:03:58Z) - Computational Barriers to Estimation from Low-Degree Polynomials [81.67886161671379]
We study the power of low-degrees for the task of detecting the presence of hidden structures.
For a large class of "signal plus noise" problems, we give a user-friendly lower bound for the best possible mean squared error achievable by any degree.
As applications, we give a tight characterization of the low-degree minimum mean squared error for the planted submatrix and planted dense subgraph problems.
arXiv Detail & Related papers (2020-08-05T17:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.