SIMPLE-RC: Group Network Inference with Non-Sharp Nulls and Weak Signals
- URL: http://arxiv.org/abs/2211.00128v1
- Date: Mon, 31 Oct 2022 20:36:24 GMT
- Title: SIMPLE-RC: Group Network Inference with Non-Sharp Nulls and Weak Signals
- Authors: Jianqing Fan, Yingying Fan, Jinchi Lv, Fan Yang
- Abstract summary: We propose a SIMPLE method with random coupling (SIMPLE-RC) for testing the non-sharp null hypothesis.
We construct our test as the maximum of the SIMPLE tests for subsampled node pairs from the group.
New theoretical developments are empowered by a second-order expansion of spiked eigenvectors under the $\ell_\infty$-norm.
- Score: 8.948321043168455
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale network inference with uncertainty quantification has important
applications in natural, social, and medical sciences. The recent work of Fan,
Fan, Han and Lv (2022) introduced a general framework of statistical inference
on membership profiles in large networks (SIMPLE) for testing the sharp null
hypothesis that a pair of given nodes share the same membership profiles. In
real applications, there are often groups of nodes under investigation that may
share similar membership profiles in the presence of relatively weaker signals
than the setting considered in SIMPLE. To address these practical challenges,
in this paper we propose a SIMPLE method with random coupling (SIMPLE-RC) for
testing the non-sharp null hypothesis that a group of given nodes share similar
(not necessarily identical) membership profiles under weaker signals. Utilizing
the idea of random coupling, we construct our test as the maximum of the SIMPLE
tests for subsampled node pairs from the group. This technique significantly
reduces the correlation among individual SIMPLE tests while largely
maintaining their power, enabling a delicate analysis of the asymptotic
distributions of the SIMPLE-RC test. Our method and theory cover both the cases
with and without node degree heterogeneity. These new theoretical developments
are empowered by a second-order expansion of spiked eigenvectors under the
$\ell_\infty$-norm, built upon our work for random matrices with weak spikes.
Our theoretical results and the practical advantages of the newly suggested
method are demonstrated through several simulation and real data examples.
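The random-coupling construction described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `pair_stat(i, j)` below is a hypothetical placeholder standing in for the eigenvector-based SIMPLE pairwise test statistic.

```python
import numpy as np

def random_coupling_max(group, pair_stat, rng):
    """Group-level statistic via random coupling: randomly couple the nodes
    in `group` into disjoint pairs and take the maximum pairwise statistic.

    `pair_stat(i, j)` is a stand-in for a pairwise SIMPLE test statistic;
    any symmetric function of a node pair works for illustration.
    """
    nodes = rng.permutation(np.asarray(group))
    # Couple consecutive nodes of the shuffled order into disjoint pairs.
    pairs = [(nodes[2 * t], nodes[2 * t + 1]) for t in range(len(nodes) // 2)]
    return max(pair_stat(int(i), int(j)) for i, j in pairs)
```

Because the subsampled pairs are disjoint, the individual statistics are far less correlated than the full set of all pairwise tests over the group, which is what makes the asymptotic distribution of the maximum tractable.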
Related papers
- Network two-sample test for block models [16.597465729143813]
We consider the two-sample testing problem for networks, where the goal is to determine whether two sets of networks originated from the same model.
We adopt the stochastic block model (SBM) for network distributions, due to its interpretability and potential to approximate more general models.
We introduce an efficient algorithm to match estimated network parameters, allowing us to properly combine and contrast information within and across samples, leading to a powerful test.
arXiv Detail & Related papers (2024-06-10T04:28:37Z)
- Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples [53.95282502030541]
Neural Network-based active learning (NAL) is a cost-effective data selection technique that utilizes neural networks to select and train on a small subset of samples.
We take one step forward by offering a unified explanation, from a feature-learning view, for the success of query criteria-based NAL.
arXiv Detail & Related papers (2024-06-06T10:38:01Z)
- An Efficient Quasi-Random Sampling for Copulas [3.400056739248712]
This paper proposes the use of generative models, such as Generative Adversarial Networks (GANs), to generate quasi-random samples for any copula.
GANs are a type of implicit generative model used to learn the distribution of complex data, thus facilitating easy sampling.
arXiv Detail & Related papers (2024-03-08T13:01:09Z)
- Rethinking Clustered Federated Learning in NOMA Enhanced Wireless Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to address the challenges posed by non-IID conditions are proposed with the analysis of the properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z)
- Collaborative non-parametric two-sample testing [55.98760097296213]
The goal is to identify nodes where the null hypothesis $p_v = q_v$ should be rejected.
We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure.
Our methodology integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning.
arXiv Detail & Related papers (2024-02-08T14:43:56Z)
- Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
- Instability and Local Minima in GAN Training with Kernel Discriminators [20.362912591032636]
Generative Adversarial Networks (GANs) are a widely-used tool for generative modeling of complex data.
Despite their empirical success, the training of GANs is not fully understood due to the min-max optimization of the generator and discriminator.
This paper analyzes these joint dynamics when the true samples, as well as the generated samples, are discrete, finite sets, and the discriminator is kernel-based.
arXiv Detail & Related papers (2022-08-21T18:03:06Z)
- AdaPT-GMM: Powerful and robust covariate-assisted multiple testing [0.7614628596146599]
We propose a new empirical Bayes method for covariate-assisted multiple testing with false discovery rate (FDR) control.
Our method refines the adaptive p-value thresholding (AdaPT) procedure by generalizing its masking scheme.
We show in extensive simulations and real data examples that our new method, which we call AdaPT-GMM, consistently delivers high power.
arXiv Detail & Related papers (2021-06-30T05:06:18Z)
- Double Generative Adversarial Networks for Conditional Independence Testing [8.359770027722275]
High-dimensional conditional independence testing is a key building block in statistics and machine learning.
We propose an inferential procedure based on double generative adversarial networks (GANs)
arXiv Detail & Related papers (2020-06-03T16:14:15Z)
- Lower bounds in multiple testing: A framework based on derandomized proxies [107.69746750639584]
This paper introduces an analysis strategy based on derandomization, illustrated by applications to various concrete models.
We provide numerical simulations of some of these lower bounds, and show a close relation to the actual performance of the Benjamini-Hochberg (BH) algorithm.
arXiv Detail & Related papers (2020-05-07T19:59:51Z)
- Noisy Adaptive Group Testing using Bayesian Sequential Experimental Design [63.48989885374238]
When the infection prevalence of a disease is low, Dorfman showed 80 years ago that testing groups of people can prove more efficient than testing people individually.
Our goal in this paper is to propose new group testing algorithms that can operate in a noisy setting.
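Dorfman's efficiency gain in the noiseless case follows from a simple expected-count formula. A minimal sketch, assuming independent infections with prevalence p and pool size k (both free parameters chosen for illustration, not values from the paper):

```python
def dorfman_tests_per_person(p, k):
    """Expected tests per person under two-stage Dorfman pooling.

    One pooled test covers k people (1/k tests per person); the pool is
    positive with probability 1 - (1 - p)**k, in which case all k members
    are retested individually (adding one test per person).
    """
    return 1.0 / k + 1.0 - (1.0 - p) ** k
```

At 1% prevalence with pools of 10 this gives about 0.196 tests per person, roughly a five-fold saving over individual testing; the pooling advantage disappears at high prevalence. The paper's contribution lies in extending such designs to the noisy setting via Bayesian sequential experimental design.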
arXiv Detail & Related papers (2020-04-26T23:41:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.