Statistical and computational thresholds for the planted $k$-densest
sub-hypergraph problem
- URL: http://arxiv.org/abs/2011.11500v2
- Date: Thu, 28 Jan 2021 13:21:05 GMT
- Title: Statistical and computational thresholds for the planted $k$-densest
sub-hypergraph problem
- Authors: Luca Corinzia and Paolo Penna and Wojciech Szpankowski and Joachim M.
Buhmann
- Abstract summary: We consider the problem of recovery a planted $k$-densest sub-hypergraph on $d$-uniform hypergraphs.
This fundamental problem appears in different contexts, e.g., community detection, average-case complexity, and neuroscience applications.
- Score: 13.808053718325635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we consider the problem of recovery a planted $k$-densest
sub-hypergraph on $d$-uniform hypergraphs. This fundamental problem appears in
different contexts, e.g., community detection, average-case complexity, and
neuroscience applications as a structural variant of tensor-PCA problem. We
provide tight \emph{information-theoretic} upper and lower bounds for the exact
recovery threshold by the maximum-likelihood estimator, as well as
\emph{algorithmic} bounds based on approximate message passing algorithms. The
problem exhibits a typical statistical-to-computational gap observed in
analogous sparse settings that widen with increasing sparsity of the problem.
The bounds show that the signal structure impacts the location of the
statistical and computational phase transition that the known existing bounds
for the tensor-PCA model do not capture. This effect is due to the generic
planted signal prior that this latter model addresses.
Related papers
- High-dimensional optimization for multi-spiked tensor PCA [8.435118770300999]
We study the dynamics of two local optimization algorithms, online gradient descent (SGD) and gradient flow.
For gradient flow, we show that the algorithmic threshold to efficiently recover the first spike is also of order $Np-2$.
Our results are obtained through a detailed analysis of a low-dimensional system that describes the evolution of the correlations between the estimators and the spikes.
arXiv Detail & Related papers (2024-08-12T12:09:25Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $Omega(dkstar/2)$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Information-Theoretic Thresholds for Planted Dense Cycles [52.076657911275525]
We study a random graph model for small-world networks which are ubiquitous in social and biological sciences.
For both detection and recovery of the planted dense cycle, we characterize the information-theoretic thresholds in terms of $n$, $tau$, and an edge-wise signal-to-noise ratio $lambda$.
arXiv Detail & Related papers (2024-02-01T03:39:01Z) - Optimal Algorithms for the Inhomogeneous Spiked Wigner Model [89.1371983413931]
We derive an approximate message-passing algorithm (AMP) for the inhomogeneous problem.
We identify in particular the existence of a statistical-to-computational gap where known algorithms require a signal-to-noise ratio bigger than the information-theoretic threshold to perform better than random.
arXiv Detail & Related papers (2023-02-13T19:57:17Z) - Subspace clustering in high-dimensions: Phase transitions \&
Statistical-to-Computational gap [24.073221004661427]
A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model.
We provide an exact characterization of the statistically optimal reconstruction error in this model in the high-dimensional regime with extensive sparsity.
arXiv Detail & Related papers (2022-05-26T17:47:35Z) - Statistical-Computational Trade-offs in Tensor PCA and Related Problems
via Communication Complexity [19.939448881052453]
This paper derives computational lower bounds on the run-time of memory bounded algorithms for PCA using communication complexity.
While the lower bounds do not rule out iteration-time algorithms, they do imply that many commonly-used algorithms, such as gradient descent and power method, must have a higher count when the sample size is not large enough.
arXiv Detail & Related papers (2022-04-15T15:59:43Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the emphStochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - Lattice partition recovery with dyadic CART [79.96359947166592]
We study piece-wise constant signals corrupted by additive Gaussian noise over a $d$-dimensional lattice.
Data of this form naturally arise in a host of applications, and the tasks of signal detection or testing, de-noising and estimation have been studied extensively in the statistical and signal processing literature.
In this paper we consider instead the problem of partition recovery, i.e.of estimating the partition of the lattice induced by the constancy regions of the unknown signal.
We prove that a DCART-based procedure consistently estimates the underlying partition at a rate of order $sigma2 k*
arXiv Detail & Related papers (2021-05-27T23:41:01Z) - Computational Barriers to Estimation from Low-Degree Polynomials [81.67886161671379]
We study the power of low-degrees for the task of detecting the presence of hidden structures.
For a large class of "signal plus noise" problems, we give a user-friendly lower bound for the best possible mean squared error achievable by any degree.
As applications, we give a tight characterization of the low-degree minimum mean squared error for the planted submatrix and planted dense subgraph problems.
arXiv Detail & Related papers (2020-08-05T17:52:10Z) - Tensor Clustering with Planted Structures: Statistical Optimality and
Computational Limits [9.427635404752936]
We focus on two clustering models, constant high-order clustering (CHC) and rank-one higher-order clustering (ROHC)
We identify the boundaries of signal-to-noise ratio for which CHC and ROHC detection/recovery are statistically possible.
We propose algorithms that achieve reliable detection and recovery when the signalto-noise ratio is above these thresholds.
arXiv Detail & Related papers (2020-05-21T15:53:44Z) - An Optimal Statistical and Computational Framework for Generalized
Tensor Estimation [10.899518267165666]
This paper describes a flexible framework for low-rank tensor estimation problems.
It includes many important instances from applications in computational imaging, genomics, and network analysis.
arXiv Detail & Related papers (2020-02-26T01:54:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.