Inconsistency of cross-validation for structure learning in Gaussian
graphical models
- URL: http://arxiv.org/abs/2312.17047v1
- Date: Thu, 28 Dec 2023 14:47:28 GMT
- Authors: Zhao Lyu, Wai Ming Tai, Mladen Kolar, Bryon Aragam
- Abstract summary: Cross-validation to discern the structure of a Gaussian graphical model is a challenging endeavor.
We provide finite-sample bounds on the probability that the Lasso estimator for the neighborhood of a node misidentifies the neighborhood.
We conduct an empirical investigation of this inconsistency by contrasting our outcomes with other commonly used information criteria.
- Score: 20.332261273013913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite numerous years of research into the merits and trade-offs of various
model selection criteria, obtaining robust results that elucidate the behavior
of cross-validation remains a challenging endeavor. In this paper, we highlight
the inherent limitations of cross-validation when employed to discern the
structure of a Gaussian graphical model. We provide finite-sample bounds on the
probability that the Lasso estimator for the neighborhood of a node within a
Gaussian graphical model, optimized using a prediction oracle, misidentifies
the neighborhood. Our results pertain to both undirected and directed acyclic
graphs, encompassing general, sparse covariance structures. To support our
theoretical findings, we conduct an empirical investigation of this
inconsistency by contrasting our outcomes with other commonly used information
criteria through an extensive simulation study. Given that many algorithms
designed to learn the structure of graphical models require hyperparameter
selection, the precise calibration of this hyperparameter is paramount for
accurately estimating the inherent structure. Consequently, our observations
shed light on this widely recognized practical challenge.
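The procedure the paper analyzes, neighborhood selection via the Lasso with the penalty tuned for predictive performance, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the chain-graph precision matrix, sample size, and coefficient threshold are all assumptions, and scikit-learn's `LassoCV` stands in for the prediction oracle.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Sketch (assumed setup, not the paper's code): estimate the neighborhood of
# each node in a Gaussian graphical model by regressing that node on all
# others with the Lasso, choosing the penalty by cross-validated prediction
# error -- the regime in which the paper shows support recovery can fail.
rng = np.random.default_rng(0)

# Toy chain graph on 5 nodes: tridiagonal precision matrix.
p = 5
Theta = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
Sigma = np.linalg.inv(Theta)
X = rng.multivariate_normal(np.zeros(p), Sigma, size=500)

neighborhoods = {}
for j in range(p):
    others = [k for k in range(p) if k != j]
    # Penalty selected purely for prediction; nonzero coefficients
    # define the estimated neighborhood of node j.
    fit = LassoCV(cv=5).fit(X[:, others], X[:, j])
    neighborhoods[j] = [others[i] for i, c in enumerate(fit.coef_)
                        if abs(c) > 1e-6]

print(neighborhoods)
```

The paper's point is that the cross-validated penalty tends to be too small for support recovery, so neighborhoods estimated this way typically include spurious nodes even when prediction error is near-optimal.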
Related papers
- Statistical ranking with dynamic covariates [6.729750785106628]
We introduce an efficient alternating algorithm to compute the maximum likelihood estimator (MLE).
A comprehensive numerical study is conducted to corroborate our theoretical findings and demonstrate the application of the proposed model to real-world datasets, including horse racing and tennis competitions.
arXiv Detail & Related papers (2024-06-24T10:26:05Z) - Generalized Criterion for Identifiability of Additive Noise Models Using Majorization [7.448620208767376]
We introduce a novel identifiability criterion for directed acyclic graph (DAG) models.
We demonstrate that this criterion extends and generalizes existing identifiability criteria.
We present a new algorithm for learning a topological ordering of variables.
arXiv Detail & Related papers (2024-04-08T02:18:57Z) - A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z) - Semi-Supervised Clustering of Sparse Graphs: Crossing the
Information-Theoretic Threshold [3.6052935394000234]
The block model is a canonical random graph model for clustering and community detection on network-structured data.
No estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below a certain threshold.
We prove that, given an arbitrary fraction of revealed labels, recovery is feasible throughout the parameter domain.
arXiv Detail & Related papers (2022-05-24T00:03:25Z) - Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z) - Crime Prediction with Graph Neural Networks and Multivariate Normal
Distributions [18.640610803366876]
We tackle the sparsity problem in high resolution by leveraging the flexible structure of graph convolutional networks (GCNs)
We build our model with Graph Convolutional Gated Recurrent Units (Graph-ConvGRU) to learn spatial, temporal, and categorical relations.
We show that our model is not only generative but also precise.
arXiv Detail & Related papers (2021-11-29T17:37:01Z) - Partial Counterfactual Identification from Observational and
Experimental Data [83.798237968683]
We develop effective Monte Carlo algorithms to approximate the optimal bounds from an arbitrary combination of observational and experimental data.
Our algorithms are validated extensively on synthetic and real-world datasets.
arXiv Detail & Related papers (2021-10-12T02:21:30Z) - Identification of Latent Variables From Graphical Model Residuals [0.0]
We present a novel method to control for the latent space when estimating a DAG by iteratively deriving proxies for the latent space from the residuals of the inferred model.
We show that any improvement of prediction of an outcome is intrinsically capped and cannot rise beyond a certain limit as compared to the confounded model.
arXiv Detail & Related papers (2021-01-07T02:28:49Z) - Goal-directed Generation of Discrete Structures with Conditional
Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.