Network two-sample test for block models
- URL: http://arxiv.org/abs/2406.06014v1
- Date: Mon, 10 Jun 2024 04:28:37 GMT
- Title: Network two-sample test for block models
- Authors: Chung Kyong Nguen, Oscar Hernan Madrid Padilla, Arash A. Amini,
- Abstract summary: We consider the two-sample testing problem for networks, where the goal is to determine whether two sets of networks originated from the same model.
We adopt the block model (SBM) for network distributions, due to their interpretability and the potential to approximate more general models.
We introduce an efficient algorithm to match estimated network parameters, allowing us to properly combine and contrast information within and across samples, leading to a powerful test.
- Score: 16.597465729143813
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the two-sample testing problem for networks, where the goal is to determine whether two sets of networks originated from the same stochastic model. Assuming no vertex correspondence and allowing for different numbers of nodes, we address a fundamental network testing problem that goes beyond simple adjacency matrix comparisons. We adopt the stochastic block model (SBM) for network distributions, due to their interpretability and the potential to approximate more general models. The lack of meaningful node labels and vertex correspondence translate to a graph matching challenge when developing a test for SBMs. We introduce an efficient algorithm to match estimated network parameters, allowing us to properly combine and contrast information within and across samples, leading to a powerful test. We show that the matching algorithm, and the overall test are consistent, under mild conditions on the sparsity of the networks and the sample sizes, and derive a chi-squared asymptotic null distribution for the test. Through a mixture of theoretical insights and empirical validations, including experiments with both synthetic and real-world data, this study advances robust statistical inference for complex network data.
Related papers
- Valid Bootstraps for Networks with Applications to Network Visualisation [0.0]
Quantifying uncertainty in networks is an important step in modelling relationships and interactions between entities.
We consider the challenge of bootstrapping an inhomogeneous random graph when only a single observation of the network is made.
We propose a principled, novel, distribution-free network bootstrap using k-nearest neighbour smoothing.
arXiv Detail & Related papers (2024-10-28T10:22:22Z) - Training Guarantees of Neural Network Classification Two-Sample Tests by Kernel Analysis [58.435336033383145]
We construct and analyze a neural network two-sample test to determine whether two datasets came from the same distribution.
We derive the theoretical minimum training time needed to ensure the NTK two-sample test detects a deviation-level between the datasets.
We show that the statistical power associated with the neural network two-sample test goes to 1 as the neural network training samples and test evaluation samples go to infinity.
arXiv Detail & Related papers (2024-07-05T18:41:16Z) - Collaborative non-parametric two-sample testing [55.98760097296213]
The goal is to identify nodes where the null hypothesis $p_v = q_v$ should be rejected.
We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure.
Our methodology integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning.
arXiv Detail & Related papers (2024-02-08T14:43:56Z) - Nested stochastic block model for simultaneously clustering networks and
nodes [9.860884833526407]
We introduce the nested block model (NSBM) to cluster a collection of networks while simultaneously detecting communities within each network.
NSBM has several appealing features including the ability to work on unlabeled networks with potentially different node sets.
arXiv Detail & Related papers (2023-07-18T12:46:34Z) - Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z) - Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural
Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs)
It proves analytically that both sampling important nodes and pruning neurons with the lowest-magnitude can reduce the sample complexity and improve convergence without compromising the test accuracy.
arXiv Detail & Related papers (2023-02-06T16:54:20Z) - Lost in the Shuffle: Testing Power in the Presence of Errorful Network Vertex Labels [2.8406702588667807]
Two-sample network hypothesis testing is an important inference task with applications across diverse fields such as medicine, neuroscience, and sociology.
Many of these testing methodologies operate under the implicit assumption that the correspondence across networks is a priori known.
This power loss due to shuffling is theoretically explored in the context of random dot product and block model networks.
arXiv Detail & Related papers (2022-08-18T05:27:18Z) - Semiparametric Bayesian Networks [5.205440005969871]
We introduce semiparametric Bayesian networks that combine parametric and nonparametric conditional probability distributions.
Their aim is to incorporate the bounded complexity of parametric models and the flexibility of nonparametric ones.
arXiv Detail & Related papers (2021-09-07T11:47:32Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - Group Testing with a Graph Infection Spread Model [61.48558770435175]
Infection spreads via connections between individuals and this results in a probabilistic cluster formation structure as well as a non-i.i.d. infection status for individuals.
We propose a class of two-step sampled group testing algorithms where we exploit the known probabilistic infection spread model.
Our results imply that, by exploiting information on the connections of individuals, group testing can be used to reduce the number of required tests significantly even when infection rate is high.
arXiv Detail & Related papers (2021-01-14T18:51:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.