Inferences on Mixing Probabilities and Ranking in Mixed-Membership
Models
- URL: http://arxiv.org/abs/2308.14988v1
- Date: Tue, 29 Aug 2023 02:35:45 GMT
- Title: Inferences on Mixing Probabilities and Ranking in Mixed-Membership
Models
- Authors: Sohom Bhattacharya, Jianqing Fan, Jikai Hou
- Abstract summary: Network data is prevalent in numerous big data applications including economics and health networks.
In this paper, we model the network using the Degree-Corrected Mixed Membership (DCMM) model.
We derive a novel finite-sample expansion for the $\boldsymbol{\pi}_i(k)$'s, which allows us to obtain asymptotic distributions and confidence intervals for the membership mixing probabilities and other related population quantities.
- Score: 5.992878098797828
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Network data is prevalent in numerous big data applications,
including economics and health networks, where it is of prime importance to
understand the latent structure of the network. In this paper, we model the
network using the Degree-Corrected Mixed Membership (DCMM) model. In the DCMM
model, each node $i$ has a membership vector $\boldsymbol{\pi}_i =
(\boldsymbol{\pi}_i(1), \boldsymbol{\pi}_i(2), \ldots, \boldsymbol{\pi}_i(K))$,
where $\boldsymbol{\pi}_i(k)$ denotes the weight that node $i$ puts in
community $k$. We derive a novel finite-sample expansion for the
$\boldsymbol{\pi}_i(k)$'s, which allows us to obtain asymptotic distributions
and confidence intervals for the membership mixing probabilities and other
related population quantities. This fills an important gap in uncertainty
quantification for membership profiles. We further develop a ranking scheme
for the vertices based on their membership mixing probabilities in certain
communities and perform the relevant statistical inferences. A multiplier
bootstrap method is proposed for ranking inference on an individual member's
profile with respect to a given community. The validity of our theoretical
results is further demonstrated via numerical experiments on both real and
synthetic data.
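To make the DCMM model concrete, the following minimal Python sketch simulates a network from it, assuming the standard parameterization in which the expected adjacency matrix is $\Theta \Pi P \Pi^\top \Theta$ for degree parameters $\theta_i$, membership matrix $\Pi$, and a community connectivity matrix $P$. All names and parameter values below are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 200, 3  # number of nodes and communities (illustrative sizes)

# Membership vectors pi_i: each row lies on the probability simplex.
Pi = rng.dirichlet(alpha=np.ones(K), size=n)   # shape (n, K)

# Degree-heterogeneity parameters theta_i.
theta = rng.uniform(0.3, 1.0, size=n)          # shape (n,)

# Symmetric community connectivity matrix P with entries in [0, 1].
P = 0.1 * np.ones((K, K)) + 0.4 * np.eye(K)

# Edge probabilities: Omega_ij = theta_i * theta_j * pi_i' P pi_j.
Omega = np.outer(theta, theta) * (Pi @ P @ Pi.T)

# Sample an undirected adjacency matrix without self-loops.
A = np.triu((rng.random((n, n)) < Omega).astype(int), k=1)
A = A + A.T
```

A node whose $\boldsymbol{\pi}_i$ is a coordinate vector is a pure member of one community; mixed nodes spread weight across several communities, and it is the uncertainty in these weights that the paper's confidence intervals quantify.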
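The multiplier bootstrap used for ranking inference can be illustrated on a much simpler plug-in statistic. The sketch below is not the authors' procedure (which perturbs their finite-sample expansion of the estimated memberships); it only shows the generic mechanics: Gaussian multipliers reweight the centered contributions to a statistic, and quantiles of the perturbed copies yield a confidence interval.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins for the linearized contributions in an expansion.
x = rng.normal(loc=0.4, scale=1.0, size=500)
est = x.mean()  # plug-in estimate

B = 2000
boot = np.empty(B)
for b in range(B):
    g = rng.standard_normal(x.size)          # i.i.d. Gaussian multipliers
    boot[b] = est + np.mean(g * (x - est))   # multiplier-perturbed copy

# Basic bootstrap 95% confidence interval for the estimand.
delta = boot - est
ci = (est - np.quantile(delta, 0.975), est - np.quantile(delta, 0.025))
print(f"estimate = {est:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

For ranking, the same multiplier draw is applied simultaneously to every node's statistic, so the bootstrap preserves the cross-node dependence needed to compare membership weights within a community.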
Related papers
- Biology-inspired joint distribution neurons based on Hierarchical Correlation Reconstruction allowing for multidirectional neural networks [0.49728186750345144]
Novel artificial neurons based on HCR (Hierarchical Correlation Reconstruction) are introduced.
Such networks can also propagate probability distributions (also joint) like $\rho(y,z|x)$.
arXiv Detail & Related papers (2024-05-08T14:49:27Z)
- Collaborative non-parametric two-sample testing [55.98760097296213]
The goal is to identify nodes where the null hypothesis $p_v = q_v$ should be rejected.
We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure.
Our methodology integrates elements from f-divergence estimation, kernel methods, and multitask learning.
arXiv Detail & Related papers (2024-02-08T14:43:56Z)
- Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and that the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion -- the effective Minkowski dimension.
arXiv Detail & Related papers (2023-06-26T17:13:31Z)
- CARD: Classification and Regression Diffusion Models [51.0421331214229]
We introduce classification and regression diffusion (CARD) models, which combine a conditional generative model and a pre-trained conditional mean estimator.
We demonstrate the outstanding ability of CARD in conditional distribution prediction with both toy examples and real-world datasets.
arXiv Detail & Related papers (2022-06-15T03:30:38Z)
- Learning the Structure of Large Networked Systems Obeying Conservation Laws [5.86054250638667]
Conservation laws in networked systems may be modeled as balance equations of the form $X = B^{*} Y$.
In several practical systems, the network structure is often unknown and needs to be estimated from data.
We propose a new $\ell_1$-regularized maximum likelihood estimator for this problem in the high-dimensional regime.
arXiv Detail & Related papers (2022-06-14T18:16:52Z)
- Causal Inference Despite Limited Global Confounding via Mixture Models [4.721845865189578]
A finite $k$-mixture of such Bayesian network models is graphically represented by a larger graph.
We give the first algorithm to learn mixtures of non-empty DAGs.
arXiv Detail & Related papers (2021-12-22T01:04:50Z)
- GFlowNet Foundations [66.69854262276391]
Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates in an active learning context.
We show a number of additional theoretical properties of GFlowNets.
arXiv Detail & Related papers (2021-11-17T17:59:54Z)
- Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation [92.99933928528797]
We study model-based reward-free reinforcement learning with linear function approximation for episodic Markov decision processes (MDPs).
In the planning phase, the agent is given a specific reward function and uses samples collected from the exploration phase to learn a good policy.
We show that to obtain an $\epsilon$-optimal policy for an arbitrary reward function, UCRL-RFE needs to sample at most $\tilde{O}(H^4 d(H + d)\epsilon^{-2})$ episodes.
arXiv Detail & Related papers (2021-10-12T23:03:58Z)
- Robust Model Selection and Nearly-Proper Learning for GMMs [26.388358539260473]
In learning theory, a standard assumption is that the data is generated from a finite mixture model. But what happens when the number of components is not known in advance?
We are able to approximately determine the minimum number of components needed to fit the distribution within a logarithmic factor.
arXiv Detail & Related papers (2021-06-05T01:58:40Z)
- Adjusted chi-square test for degree-corrected block models [13.122543280692641]
We propose a goodness-of-fit test for degree-corrected stochastic block models (DCSBM).
We show that a simple adjustment allows the statistic to converge in distribution, under the null, as long as the harmonic mean of the degrees $d_i$ grows to infinity.
Our distributional results are nonasymptotic, with explicit constants, providing finite-sample bounds on the Kolmogorov-Smirnov distance to the target distribution.
arXiv Detail & Related papers (2020-12-30T05:20:59Z)
- Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning [175.34232468746245]
We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
arXiv Detail & Related papers (2020-02-20T22:28:53Z)