Gaussian Graphical Model Selection for Huge Data via Minipatch Learning
- URL: http://arxiv.org/abs/2110.12067v1
- Date: Fri, 22 Oct 2021 21:06:48 GMT
- Title: Gaussian Graphical Model Selection for Huge Data via Minipatch Learning
- Authors: Tianyi Yao and Minjie Wang and Genevera I. Allen
- Abstract summary: We propose the Minipatch Graph (MPGraph) estimator to solve the problem of graphical model selection.
MPGraph is a generalization of thresholded graph estimators fit to tiny, random subsets of both the observations and the nodes.
We prove that our algorithm achieves finite-sample graph selection consistency.
- Score: 1.2891210250935146
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Gaussian graphical models are essential unsupervised learning techniques to
estimate conditional dependence relationships between sets of nodes. While
graphical model selection is a well-studied problem with many popular
techniques, there are typically three key practical challenges: i) many
existing methods become computationally intractable in huge-data settings with
tens of thousands of nodes; ii) the need for separate data-driven tuning
hyperparameter selection procedures considerably adds to the computational
burden; iii) the statistical accuracy of selected edges often deteriorates as
the dimension and/or the complexity of the underlying graph structures
increase. We tackle these problems by proposing the Minipatch Graph (MPGraph)
estimator. Our approach builds upon insights from the latent variable graphical
model problem and utilizes ensembles of thresholded graph estimators fit to
tiny, random subsets of both the observations and the nodes, termed
minipatches. As estimates are fit on small problems, our approach is
computationally fast with integrated stability-based hyperparameter tuning.
Additionally, we prove that under certain conditions our MPGraph algorithm
achieves finite-sample graph selection consistency. We compare our approach to
state-of-the-art computational approaches to Gaussian graphical model selection
including the BigQUIC algorithm, and empirically demonstrate that our approach
is not only more accurate but also extensively faster for huge graph selection
problems.
Related papers
- Symmetry-preserving graph attention network to solve routing problems at
multiple resolutions [1.9304772860080408]
We introduce the first-ever completely equivariant model and training to solve problems.
It is essential to capture the multiscale structure of the input graph.
We propose a Multiresolution scheme in combination with Equi Graph Attention network (mEGAT) architecture.
arXiv Detail & Related papers (2023-10-24T06:22:20Z) - Large-scale Bayesian Structure Learning for Gaussian Graphical Models using Marginal Pseudo-likelihood [0.26249027950824516]
We introduce two novel Markov chain Monte Carlo (MCMC) search algorithms with a significantly lower computational cost than leading Bayesian approaches.
These algorithms can deliver reliable results in mere minutes on standard computers, even for large-scale problems with one thousand variables.
We also illustrate the practical utility of our methods on medium and large-scale applications from human and mice gene expression studies.
arXiv Detail & Related papers (2023-06-30T20:37:40Z) - Joint Graph Learning and Model Fitting in Laplacian Regularized
Stratified Models [5.933030735757292]
Laplacian regularized stratified models (LRSM) are models that utilize the explicit or implicit network structure of the sub-problems.
This paper shows the importance and sensitivity of graph weights in LRSM, and provably show that the sensitivity can be arbitrarily large.
We propose a generic approach to jointly learn the graph while fitting the model parameters by solving a single optimization problem.
arXiv Detail & Related papers (2023-05-04T06:06:29Z) - Semi-Supervised Clustering of Sparse Graphs: Crossing the
Information-Theoretic Threshold [3.6052935394000234]
Block model is a canonical random graph model for clustering and community detection on network-structured data.
No estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below a certain threshold.
We prove that with an arbitrary fraction of the labels feasible throughout the parameter domain.
arXiv Detail & Related papers (2022-05-24T00:03:25Z) - Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimize the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z) - Score-based Generative Modeling of Graphs via the System of Stochastic
Differential Equations [57.15855198512551]
We propose a novel score-based generative model for graphs with a continuous-time framework.
We show that our method is able to generate molecules that lie close to the training distribution yet do not violate the chemical valency rule.
arXiv Detail & Related papers (2022-02-05T08:21:04Z) - Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective of graph contrastive learning methods showing random augmentations leads to encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z) - Distributionally Robust Semi-Supervised Learning Over Graphs [68.29280230284712]
Semi-supervised learning (SSL) over graph-structured data emerges in many network science applications.
To efficiently manage learning over graphs, variants of graph neural networks (GNNs) have been developed recently.
Despite their success in practice, most of existing methods are unable to handle graphs with uncertain nodal attributes.
Challenges also arise due to distributional uncertainties associated with data acquired by noisy measurements.
A distributionally robust learning framework is developed, where the objective is to train models that exhibit quantifiable robustness against perturbations.
arXiv Detail & Related papers (2021-10-20T14:23:54Z) - T-LoHo: A Bayesian Regularization Model for Structured Sparsity and
Smoothness on Graphs [0.0]
In graph-structured data, structured sparsity and smoothness tend to cluster together.
We propose a new prior for high dimensional parameters with graphical relations.
We use it to detect structured sparsity and smoothness simultaneously.
arXiv Detail & Related papers (2021-07-06T10:10:03Z) - A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE.
AGE generates the fake neighbor nodes as the enhanced negative samples from the implicit distribution.
Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z) - Multilayer Clustered Graph Learning [66.94201299553336]
We use contrastive loss as a data fidelity term, in order to properly aggregate the observed layers into a representative graph.
Experiments show that our method leads to a clustered clusters w.r.t.
We learn a clustering algorithm for solving clustering problems.
arXiv Detail & Related papers (2020-10-29T09:58:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.