Sparse Gaussian Graphical Models with Discrete Optimization:
Computational and Statistical Perspectives
- URL: http://arxiv.org/abs/2307.09366v1
- Date: Tue, 18 Jul 2023 15:49:02 GMT
- Title: Sparse Gaussian Graphical Models with Discrete Optimization:
Computational and Statistical Perspectives
- Authors: Kayhan Behdin, Wenyu Chen, Rahul Mazumder
- Abstract summary: We consider the problem of learning a sparse graph underlying an undirected Gaussian graphical model.
We propose GraphL0BnB, a new estimator based on an $\ell_0$-penalized version of the pseudolikelihood function.
Our numerical experiments on real/synthetic datasets suggest that our method can solve, to near-optimality, problem instances with $p = 10^4$.
- Score: 8.403841349300103
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of learning a sparse graph underlying an undirected
Gaussian graphical model, a key problem in statistical machine learning. Given
$n$ samples from a multivariate Gaussian distribution with $p$ variables, the
goal is to estimate the $p \times p$ inverse covariance matrix (aka precision
matrix), assuming it is sparse (i.e., has a few nonzero entries). We propose
GraphL0BnB, a new estimator based on an $\ell_0$-penalized version of the
pseudolikelihood function, while most earlier approaches are based on the
$\ell_1$-relaxation. Our estimator can be formulated as a convex mixed integer
program (MIP) which can be difficult to compute at scale using off-the-shelf
commercial solvers. To solve the MIP, we propose a custom nonlinear
branch-and-bound (BnB) framework that solves node relaxations with tailored
first-order methods. As a by-product of our BnB framework, we propose
large-scale solvers for obtaining good primal solutions that are of independent
interest. We derive novel statistical guarantees (estimation and variable
selection) for our estimator and discuss how our approach improves upon
existing estimators. Our numerical experiments on real/synthetic datasets
suggest that our method can solve, to near-optimality, problem instances with
$p = 10^4$ -- corresponding to a symmetric matrix of size $p \times p$ with
$p^2/2$ binary variables. We demonstrate the usefulness of GraphL0BnB versus
various state-of-the-art approaches on a range of datasets.
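For intuition, the abstract's count of roughly $p^2/2$ binary variables suggests a big-M mixed integer formulation with one indicator per off-diagonal entry of the precision matrix. A schematic version (a generic sketch of this formulation class, not necessarily the paper's exact program; $\mathcal{L}_{\mathrm{PL}}$, $\lambda$, and $M$ are placeholder symbols) is
$$\min_{\Theta,\, z} \;\; \mathcal{L}_{\mathrm{PL}}(\Theta; X) \;+\; \lambda \sum_{i<j} z_{ij} \quad \text{s.t.} \quad |\theta_{ij}| \le M\, z_{ij}, \;\; z_{ij} \in \{0,1\}, \;\; i < j,$$
where $\mathcal{L}_{\mathrm{PL}}$ is the convex pseudolikelihood loss, each binary $z_{ij}$ switches entry $\theta_{ij}$ on or off, and $\lambda > 0$ prices each nonzero entry. Relaxing $z_{ij} \in [0,1]$ yields the convex node relaxations that the BnB framework solves with tailored first-order methods.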
Related papers
- Computational-Statistical Gaps for Improper Learning in Sparse Linear Regression [4.396860522241307]
We show that an efficient learning algorithm for sparse linear regression can be used to solve sparse PCA problems with a negative spike.
We complement our reduction with low-degree and statistical query lower bounds for the sparse problems from which we reduce.
arXiv Detail & Related papers (2024-02-21T19:55:01Z)
- Compressive Recovery of Sparse Precision Matrices [5.557600489035657]
We consider the problem of learning a graph modeling the statistical relations of the $d$ variables from a dataset with $n$ samples $X \in \mathbb{R}^{n \times d}$.
We show that it is possible to estimate it from a sketch of size $m = \Omega\left((d+2k)\log(d)\right)$, where $k$ is the maximal number of edges of the underlying graph.
We investigate the possibility of achieving practical recovery with an iterative algorithm based on the graphical lasso, viewed as a specific denoiser (see the illustrative sketch after this list).
arXiv Detail & Related papers (2023-11-08T13:29:08Z)
- An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large-scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z)
- Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization [49.58290066287418]
We propose a novel method named Multi-block-Single-probe Variance Reduced (MSVR) estimator to alleviate the complexity of compositional problems.
Our results improve upon prior ones in several aspects, including the order of sample complexities and the dependence on strong convexity.
arXiv Detail & Related papers (2022-07-18T12:03:26Z)
- Learning Sparse Graph with Minimax Concave Penalty under Gaussian Markov Random Fields [51.07460861448716]
This paper presents a convex-analytic framework to learn sparse graphs from data.
We show that overall convexity is guaranteed by representing the graph via a linear transform of the vector corresponding to its upper-triangular part.
arXiv Detail & Related papers (2021-09-17T17:46:12Z)
- Estimating Stochastic Linear Combination of Non-linear Regressions Efficiently and Scalably [23.372021234032363]
We show that when the sub-sample sizes are large enough, the estimation errors will not be sacrificed by too much.
To the best of our knowledge, this is the first work that provides theoretical guarantees for this model.
arXiv Detail & Related papers (2020-10-19T07:15:38Z)
- Convergence of Sparse Variational Inference in Gaussian Processes Regression [29.636483122130027]
We show that a method with an overall computational cost of $\mathcal{O}(N (\log N)^{2D} (\log\log N)^2)$ can be used to perform inference.
arXiv Detail & Related papers (2020-08-01T19:23:34Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- FANOK: Knockoffs in Linear Time [73.5154025911318]
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems.
We test our methods on problems with $p$ as large as $500,000$.
arXiv Detail & Related papers (2020-06-15T21:55:34Z)
- Maximizing Determinants under Matroid Constraints [69.25768526213689]
We study the problem of finding a basis $S$ of a matroid $M$ such that $\det\left(\sum_{i \in S} v_i v_i^\top\right)$ is maximized.
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z)
- Sparse Regression at Scale: Branch-and-Bound rooted in First-Order Optimization [6.037383467521294]
We present a new exact MIP framework for $\ell_0$-regularized regression.
Our framework can scale to $p \sim 10^7$, achieving speedups of at least $5000\times$.
We open source the implementation through our toolkit L0BnB.
arXiv Detail & Related papers (2020-04-13T18:45:29Z)
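To ground the $\ell_1$-relaxation baseline that GraphL0BnB is compared against (and the graphical-lasso denoiser referenced in the compressive-recovery entry above), here is a minimal Python sketch using scikit-learn's GraphicalLasso. The synthetic data, penalty weight alpha, and support threshold are illustrative choices, not settings taken from any of the papers.

import numpy as np
from sklearn.covariance import GraphicalLasso

# Build a sparse ground-truth precision matrix: tridiagonal, i.e., a chain graph.
rng = np.random.default_rng(0)
p = 30
theta = (np.eye(p)
         + np.diag(0.4 * np.ones(p - 1), k=1)
         + np.diag(0.4 * np.ones(p - 1), k=-1))
cov = np.linalg.inv(theta)  # ground-truth covariance

# Draw n = 500 samples from N(0, cov).
X = rng.multivariate_normal(np.zeros(p), cov, size=500)

# l1-penalized precision estimation (graphical lasso); alpha is the penalty weight.
model = GraphicalLasso(alpha=0.1).fit(X)
theta_hat = model.precision_

# Recover the edge set by thresholding small entries.
support = np.abs(theta_hat) > 1e-4
n_edges = (support.sum() - p) // 2  # off-diagonal nonzeros, counted once per edge
print(f"estimated edges: {n_edges} (truth: {p - 1})")

An $\ell_0$-based method such as GraphL0BnB targets the same support-recovery task but selects entries combinatorially rather than through the $\ell_1$ shrinkage used here.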
This list is automatically generated from the titles and abstracts of the papers on this site.