Conditional Matrix Flows for Gaussian Graphical Models
- URL: http://arxiv.org/abs/2306.07255v2
- Date: Thu, 16 Nov 2023 13:54:59 GMT
- Title: Conditional Matrix Flows for Gaussian Graphical Models
- Authors: Marcello Massimo Negri, F. Arend Torres and Volker Roth
- Abstract summary: We propose a general framework for variational inference with matrix-variate Normalizing Flows in GGMs, which unifies the benefits of the frequentist and Bayesian frameworks.
With a single flow we train a continuum of sparse regression models jointly for all regularization parameters $\lambda$ and all $l_q$ (pseudo-) norms, giving access to the posterior for any $\lambda$ and any $l_q$ norm, the marginal log-likelihood for model selection, and the frequentist solution paths in the MAP limit.
- Score: 1.6435014180036467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Studying conditional independence among many variables with few observations
is a challenging task. Gaussian Graphical Models (GGMs) tackle this problem by
encouraging sparsity in the precision matrix through $l_q$ regularization with
$q\leq1$. However, most GGMs rely on the $l_1$ norm because the objective is
highly non-convex for sub-$l_1$ pseudo-norms. In the frequentist formulation,
the $l_1$ norm relaxation provides the solution path as a function of the
shrinkage parameter $\lambda$. In the Bayesian formulation, sparsity is instead
encouraged through a Laplace prior, but posterior inference for different
$\lambda$ requires repeated runs of expensive Gibbs samplers. Here we propose a
general framework for variational inference with matrix-variate Normalizing
Flow in GGMs, which unifies the benefits of frequentist and Bayesian
frameworks. As a key improvement on previous work, we train with one flow a
continuum of sparse regression models jointly for all regularization parameters
$\lambda$ and all $l_q$ norms, including non-convex sub-$l_1$ pseudo-norms.
Within one model we thus have access to (i) the evolution of the posterior for
any $\lambda$ and any $l_q$ (pseudo-) norm, (ii) the marginal log-likelihood
for model selection, and (iii) the frequentist solution paths through simulated
annealing in the MAP limit.
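Below is a minimal, self-contained PyTorch sketch of the general idea described in the abstract: a variational density over precision matrices conditioned on the shrinkage parameter $\lambda$ and the norm exponent $q$, trained against the $l_q$-penalized Gaussian log-likelihood. It is an illustration under stated assumptions, not the authors' architecture: the single conditional affine layer, the Cholesky parameterization, the network sizes, the toy data, and the sampling ranges for $(\lambda, q)$ are all assumptions made for the example.
```python
# Minimal sketch (NOT the paper's architecture): a variational density over
# precision matrices Omega, conditioned on (lambda, q), trained by maximizing a
# single-sample ELBO against the l_q-penalized Gaussian log-likelihood.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
d, n = 5, 50                                    # toy problem sizes (assumption)
X = torch.randn(n, d)                           # toy data (assumption)
S = (X.T @ X) / n                               # empirical covariance
m = d * (d + 1) // 2                            # free parameters of a Cholesky factor
tril = torch.tril_indices(d, d)

class ConditionalAffineFlow(nn.Module):
    """One conditional affine layer: theta = z * exp(s(c)) + t(c), c = (log lambda, q)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2 * m))

    def forward(self, z, cond):
        s, t = self.net(cond).chunk(2, dim=-1)
        return z * torch.exp(s) + t, s.sum()    # transformed sample, log|det Jacobian|

def precision_from(theta):
    """Unconstrained vector -> positive-definite precision matrix via Cholesky."""
    L = torch.zeros(d, d)
    L[tril[0], tril[1]] = theta
    diag = F.softplus(torch.diagonal(L))        # enforce a positive diagonal
    L = torch.tril(L, diagonal=-1) + torch.diag(diag)
    return L @ L.T

def log_target(Omega, lam, q):
    """Unnormalized log-posterior: Gaussian log-likelihood minus l_q penalty on off-diagonals."""
    loglik = 0.5 * n * (torch.logdet(Omega) - torch.trace(S @ Omega))
    off_diag = Omega - torch.diag(torch.diagonal(Omega))
    return loglik - lam * off_diag.abs().pow(q).sum()

flow = ConditionalAffineFlow()
opt = torch.optim.Adam(flow.parameters(), lr=1e-2)
for step in range(200):
    lam = 10 ** (2 * torch.rand(1) - 1)         # lambda ~ log-uniform on [0.1, 10]
    q = 0.5 + 0.5 * torch.rand(1)               # q uniform on [0.5, 1] (incl. sub-l_1)
    cond = torch.cat([lam.log(), q])
    z = torch.randn(m)                          # base sample
    theta, log_det = flow(z, cond)
    Omega = precision_from(theta)
    log_q = -0.5 * (z ** 2).sum() - log_det     # flow density up to a constant
                                                # (theta -> Omega Jacobian omitted in this sketch)
    loss = -(log_target(Omega, lam, q) - log_q) # negative single-sample ELBO
    opt.zero_grad(); loss.backward(); opt.step()

# After training, the single flow yields approximate posteriors for any (lambda, q),
# e.g. lambda = 2.0, q = 0.7:
with torch.no_grad():
    lam, q = torch.tensor(2.0), torch.tensor(0.7)
    Omega_sample = precision_from(flow(torch.randn(m), torch.stack([lam.log(), q]))[0])
```
In the paper the flow itself is matrix-variate and far more expressive, and the frequentist solution paths (iii) are recovered by simulated annealing in the MAP limit; this sketch only illustrates the joint conditioning on $(\lambda, q)$ and the single-sample ELBO objective.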
Related papers
- Iterative Reweighted Framework Based Algorithms for Sparse Linear Regression with Generalized Elastic Net Penalty [0.3124884279860061]
The elastic net penalty is frequently employed in high-dimensional statistics for parameter regression and variable selection.
Empirical evidence has shown that the $\ell_q$-norm penalty often provides better regression than the $\ell_r$-norm penalty.
We develop two efficient algorithms based on the locally Lipschitz continuous $\epsilon$-approximation to the $\ell_q$-norm.
arXiv Detail & Related papers (2024-11-22T11:55:37Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Dynamical System Identification, Model Selection and Model Uncertainty Quantification by Bayesian Inference [0.8388591755871735]
This study presents a Bayesian maximum a posteriori (MAP) framework for dynamical system identification from time-series data.
arXiv Detail & Related papers (2024-01-30T12:16:52Z) - Kernelized Normalizing Constant Estimation: Bridging Bayesian Quadrature
and Bayesian Optimization [51.533164528799084]
We show that to estimate the normalizing constant within a small relative error, the level of difficulty depends on the value of $\lambda$.
We find that this pattern holds true even when the function evaluations are noisy.
arXiv Detail & Related papers (2024-01-11T07:45:09Z) - Optimal Query Complexities for Dynamic Trace Estimation [59.032228008383484]
We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly.
We provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $\epsilon$ error with $\delta$ failure probability.
Our lower bounds (1) give the first tight bounds for Hutchinson's estimator in the matrix-vector product model with Frobenius norm error even in the static setting, and (2) are the first unconditional lower bounds for dynamic trace estimation.
arXiv Detail & Related papers (2022-09-30T04:15:44Z) - Variational Inference for Bayesian Bridge Regression [0.0]
We study the implementation of Automatic Differentiation Variational Inference (ADVI) for Bayesian inference on regression models with bridge penalization.
The bridge approach uses the $\ell_\alpha$ norm, with $\alpha \in (0, +\infty)$, to define a penalization on large values of the regression coefficients.
We illustrate the approach on non-parametric regression models with B-splines, although the method works seamlessly for other choices of basis functions.
arXiv Detail & Related papers (2022-05-19T12:29:09Z) - $p$-Generalized Probit Regression and Scalable Maximum Likelihood
Estimation via Sketching and Coresets [74.37849422071206]
We study the $p$-generalized probit regression model, which is a generalized linear model for binary responses.
We show how the maximum likelihood estimator for $p$-generalized probit regression can be approximated efficiently up to a factor of $(1+\varepsilon)$ on large data.
arXiv Detail & Related papers (2022-03-25T10:54:41Z) - Optimal Online Generalized Linear Regression with Stochastic Noise and
Its Application to Heteroscedastic Bandits [88.6139446295537]
We study the problem of online generalized linear regression in the setting of a generalized linear model with possibly unbounded additive noise.
We provide a sharp analysis of the classical follow-the-regularized-leader (FTRL) algorithm to cope with the label noise.
We propose an algorithm based on FTRL to achieve the first variance-aware regret bound.
arXiv Detail & Related papers (2022-02-28T08:25:26Z) - Last iterate convergence of SGD for Least-Squares in the Interpolation
regime [19.05750582096579]
We study the noiseless model in the fundamental least-squares setup.
We assume that an optimum predictor fits the inputs and outputs perfectly, $\langle \theta_*, \phi(X) \rangle = Y$, where $\phi(X)$ stands for a possibly infinite-dimensional non-linear feature map.
arXiv Detail & Related papers (2021-02-05T14:02:20Z) - Estimating Stochastic Linear Combination of Non-linear Regressions
Efficiently and Scalably [23.372021234032363]
We show that when the sub-sample sizes are large, the estimation errors will not be sacrificed by too much.
To the best of our knowledge, this is the first work that provides such guarantees for the stochastic linear combination of non-linear regressions model.
arXiv Detail & Related papers (2020-10-19T07:15:38Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.