Testing network correlation efficiently via counting trees
- URL: http://arxiv.org/abs/2110.11816v1
- Date: Fri, 22 Oct 2021 14:47:20 GMT
- Title: Testing network correlation efficiently via counting trees
- Authors: Cheng Mao, Yihong Wu, Jiaming Xu, Sophie H. Yu
- Abstract summary: We propose a new procedure for testing whether two networks are edge-correlated through some latent correspondence.
The test statistic is based on counting the co-occurrences of signed trees for a family of non-isomorphic trees.
- Score: 19.165942326142538
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new procedure for testing whether two networks are
edge-correlated through some latent vertex correspondence. The test statistic
is based on counting the co-occurrences of signed trees for a family of
non-isomorphic trees. When the two networks are Erd\H{o}s-R\'enyi random graphs
$\mathcal{G}(n,q)$ that are either independent or correlated with correlation
coefficient $\rho$, our test runs in $n^{2+o(1)}$ time and succeeds with high
probability as $n\to\infty$, provided that $n\min\{q,1-q\} \ge n^{-o(1)}$ and
$\rho^2>\alpha \approx 0.338$, where $\alpha$ is Otter's constant so that the
number of unlabeled trees with $K$ edges grows as $(1/\alpha)^K$. This
significantly improves the prior work in terms of statistical accuracy, running
time, and graph sparsity.
Related papers
- Collaborative non-parametric two-sample testing [55.98760097296213]
The goal is to identify nodes where the null hypothesis $p_v = q_v$ should be rejected.
We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure.
Our methodology integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning.
arXiv Detail & Related papers (2024-02-08T14:43:56Z) - Detection of Dense Subhypergraphs by Low-Degree Polynomials [72.4451045270967]
Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem.
We consider detecting the presence of a planted $Gr(ngamma, n-alpha)$ subhypergraph in a $Gr(n, n-beta) hypergraph.
Our results are already new in the graph case $r=2$, as we consider the subtle log-density regime where hardness based on average-case reductions is not known.
arXiv Detail & Related papers (2023-04-17T10:38:08Z) - On the Unlikelihood of D-Separation [69.62839677485087]
We provide analytic evidence that on large graphs, d-separation is a rare phenomenon, even when guaranteed to exist.
For the PC Algorithm, while it is known that its worst-case guarantees fail on non-sparse graphs, we show that the same is true for the average case.
For UniformSGS, while it is known that the running time is exponential for existing edges, we show that in the average case, that is the expected running time for most non-existing edges as well.
arXiv Detail & Related papers (2023-03-10T00:11:18Z) - Detection-Recovery Gap for Planted Dense Cycles [72.4451045270967]
We consider a model where a dense cycle with expected bandwidth $n tau$ and edge density $p$ is planted in an ErdHos-R'enyi graph $G(n,q)$.
We characterize the computational thresholds for the associated detection and recovery problems for the class of low-degree algorithms.
arXiv Detail & Related papers (2023-02-13T22:51:07Z) - Phase Transitions in the Detection of Correlated Databases [12.010807505655238]
We study the problem of detecting the correlation between two Gaussian databases $mathsfXinmathbbRntimes d$ and $mathsfYntimes d$, each composed of $n$ users with $d$ features.
This problem is relevant in the analysis of social media, computational biology, etc.
arXiv Detail & Related papers (2023-02-07T10:39:44Z) - Statistical limits of correlation detection in trees [0.7826806223782055]
This paper addresses the problem of testing whether two observed trees $(t,t')$ are sampled independently or from a joint distribution under which they are correlated.
Motivated by graph alignment, we investigate the conditions of existence of one-sided tests.
We find that no such test exists for $s leq sqrtalpha$, and that such a test exists whenever $s > sqrtalpha$, for $lambda$ large enough.
arXiv Detail & Related papers (2022-09-27T22:26:53Z) - Random graph matching at Otter's threshold via counting chandeliers [16.512416293014493]
We propose an efficient algorithm for graph matching based on similarity scores constructed from counting a certain family of weighted trees rooted at each.
This is the first graph matching algorithm that succeeds at an explicit constant correlation and applies to both sparse and dense graphs.
arXiv Detail & Related papers (2022-09-25T20:00:28Z) - Fast Graph Sampling for Short Video Summarization using Gershgorin Disc
Alignment [52.577757919003844]
We study the problem of efficiently summarizing a short video into several paragraphs, leveraging recent progress in fast graph sampling.
Experimental results show that our algorithm achieves comparable video summarization as state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2021-10-21T18:43:00Z) - Coresets for Decision Trees of Signals [19.537354146654845]
We provide the first algorithm that outputs such a $(k,varepsilon)$-coreset for emphevery such matrix $D$.
This is by forging a link between decision trees from machine learning -- to partition trees in computational geometry.
arXiv Detail & Related papers (2021-10-07T05:49:55Z) - Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu [14.298220510927695]
We provide finite sample guarantees for the classical ChowLiu algorithm (IEEE Trans.Inform.Theory, 1968)
We show that for a specific tree $T$, with $widetildeO (|Sigma|2nvarepsilon-1)$ samples from a distribution $P$ over $Sigman$, one can efficiently learn the closest KL divergence.
arXiv Detail & Related papers (2020-11-09T02:08:56Z) - A Randomized Algorithm to Reduce the Support of Discrete Measures [79.55586575988292]
Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms.
We give a simple geometric characterization of barycenters via negative cones and derive a randomized algorithm that computes this new measure by "greedy geometric sampling"
We then study its properties, and benchmark it on synthetic and real-world data to show that it can be very beneficial in the $Ngg n$ regime.
arXiv Detail & Related papers (2020-06-02T16:38:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.