Statistical limits of correlation detection in trees
- URL: http://arxiv.org/abs/2209.13723v2
- Date: Wed, 8 Nov 2023 13:53:05 GMT
- Title: Statistical limits of correlation detection in trees
- Authors: Luca Ganassali, Laurent Massouli\'e, Guilhem Semerjian
- Abstract summary: This paper addresses the problem of testing whether two observed trees $(t,t')$ are sampled independently or from a joint distribution under which they are correlated.
Motivated by graph alignment, we investigate the conditions of existence of one-sided tests.
We find that no such test exists for $s leq sqrtalpha$, and that such a test exists whenever $s > sqrtalpha$, for $lambda$ large enough.
- Score: 0.7826806223782055
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we address the problem of testing whether two observed trees
$(t,t')$ are sampled either independently or from a joint distribution under
which they are correlated. This problem, which we refer to as correlation
detection in trees, plays a key role in the study of graph alignment for two
correlated random graphs. Motivated by graph alignment, we investigate the
conditions of existence of one-sided tests, i.e. tests which have vanishing
type I error and non-vanishing power in the limit of large tree depth. For the
correlated Galton-Watson model with Poisson offspring of mean $\lambda>0$ and
correlation parameter $s \in (0,1)$, we identify a phase transition in the
limit of large degrees at $s = \sqrt{\alpha}$, where $\alpha \sim 0.3383$ is
Otter's constant. Namely, we prove that no such test exists for $s \leq
\sqrt{\alpha}$, and that such a test exists whenever $s > \sqrt{\alpha}$, for
$\lambda$ large enough. This result sheds new light on the graph alignment
problem in the sparse regime (with $O(1)$ average node degrees) and on the
performance of the MPAlign method studied in Ganassali et al. (2021), Piccioli
et al. (2021), proving in particular the conjecture of Piccioli et al. (2021)
that MPAlign succeeds in the partial recovery task for correlation parameter
$s>\sqrt{\alpha}$ provided the average node degree $\lambda$ is large enough.
As a byproduct, we identify a new family of orthogonal polynomials for the
Poisson-Galton-Watson measure which enjoy remarkable properties. These
polynomials may be of independent interest for a variety of problems involving
graphs, trees or branching processes, beyond the scope of graph alignment.
Related papers
- A computational transition for detecting correlated stochastic block models by low-degree polynomials [13.396246336911842]
Detection of correlation in a pair of random graphs is a fundamental statistical and computational problem that has been extensively studied in recent years.
We consider a pair of correlated block models $mathcalS(n,tfraclambdan;k,epsilon;s)$ that are subsampled from a common parent block model $mathcal S(n,tfraclambdan;k,epsilon;s)$
We focus on tests based on emphlow-degrees of the entries of the adjacency
arXiv Detail & Related papers (2024-09-02T06:14:05Z) - Kernelized Normalizing Constant Estimation: Bridging Bayesian Quadrature
and Bayesian Optimization [51.533164528799084]
We show that to estimate the normalizing constant within a small relative error, the level of difficulty depends on the value of $lambda$.
We find that this pattern holds true even when the function evaluations are noisy.
arXiv Detail & Related papers (2024-01-11T07:45:09Z) - Detection of Dense Subhypergraphs by Low-Degree Polynomials [72.4451045270967]
Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem.
We consider detecting the presence of a planted $Gr(ngamma, n-alpha)$ subhypergraph in a $Gr(n, n-beta) hypergraph.
Our results are already new in the graph case $r=2$, as we consider the subtle log-density regime where hardness based on average-case reductions is not known.
arXiv Detail & Related papers (2023-04-17T10:38:08Z) - Detection-Recovery Gap for Planted Dense Cycles [72.4451045270967]
We consider a model where a dense cycle with expected bandwidth $n tau$ and edge density $p$ is planted in an ErdHos-R'enyi graph $G(n,q)$.
We characterize the computational thresholds for the associated detection and recovery problems for the class of low-degree algorithms.
arXiv Detail & Related papers (2023-02-13T22:51:07Z) - Planted Bipartite Graph Detection [13.95780443241133]
We consider the task of detecting a hidden bipartite subgraph in a given random graph.
Under the null hypothesis, the graph is a realization of an ErdHosR'enyi random graph over $n$ with edge density $q$.
Under the alternative, there exists a planted $k_mathsfR times k_mathsfL$ bipartite subgraph with edge density $p>q$.
arXiv Detail & Related papers (2023-02-07T18:18:17Z) - Causal Bandits for Linear Structural Equation Models [58.2875460517691]
This paper studies the problem of designing an optimal sequence of interventions in a causal graphical model.
It is assumed that the graph's structure is known and has $N$ nodes.
Two algorithms are proposed for the frequentist (UCB-based) and Bayesian settings.
arXiv Detail & Related papers (2022-08-26T16:21:31Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound $Omega(sqrtn/p)$ of robustness of interpolating neural network parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n=mathrmpoly(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=exp(omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - Testing network correlation efficiently via counting trees [19.165942326142538]
We propose a new procedure for testing whether two networks are edge-correlated through some latent correspondence.
The test statistic is based on counting the co-occurrences of signed trees for a family of non-isomorphic trees.
arXiv Detail & Related papers (2021-10-22T14:47:20Z) - Inferring Hidden Structures in Random Graphs [13.031167737538881]
We study the two inference problems of detecting and recovering an isolated community of emphgeneral structure planted in a random graph.
We derive lower bounds for detecting/recovering the structure $Gamma_k$ in terms of the parameters $(n,k,q)$, as well as certain properties of $Gamma_k$, and exhibit computationally optimal algorithms that achieve these lower bounds.
arXiv Detail & Related papers (2021-10-05T09:39:51Z) - Correlation detection in trees for partial graph alignment [3.5880535198436156]
We consider alignment of graphs, which consists in finding a mapping between the nodes of two graphs which preserves most of the edges.
Our approach is to compare local structures in the two graphs, matching two nodes if their neighborhoods are 'close enough' for correlated graphs.
This problem can be rephrased in terms of testing whether a pair of branching trees is drawn from either a product distribution, or a correlated distribution.
arXiv Detail & Related papers (2021-07-15T22:02:27Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $mathcalQ$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.