ProDAG: Projection-Induced Variational Inference for Directed Acyclic Graphs
- URL: http://arxiv.org/abs/2405.15167v3
- Date: Mon, 14 Oct 2024 01:35:34 GMT
- Title: ProDAG: Projection-Induced Variational Inference for Directed Acyclic Graphs
- Authors: Ryan Thompson, Edwin V. Bonilla, Robert Kohn
- Abstract summary: Directed acyclic graph (DAG) learning is a rapidly expanding field of research.
It remains statistically and computationally challenging to learn a single (point estimate) DAG from data, let alone provide uncertainty quantification.
Our article addresses the difficult task of quantifying graph uncertainty by developing a Bayesian variational inference framework based on novel distributions that have support directly on the space of DAGs.
- Score: 8.556906995059324
- Abstract: Directed acyclic graph (DAG) learning is a rapidly expanding field of research. Though the field has witnessed remarkable advances over the past few years, it remains statistically and computationally challenging to learn a single (point estimate) DAG from data, let alone provide uncertainty quantification. Our article addresses the difficult task of quantifying graph uncertainty by developing a Bayesian variational inference framework based on novel distributions that have support directly on the space of DAGs. The distributions, which we use to form our prior and variational posterior, are induced by a projection operation, whereby an arbitrary continuous distribution is projected onto the space of sparse weighted acyclic adjacency matrices (matrix representations of DAGs) with probability mass on exact zeros. Though the projection constitutes a combinatorial optimization problem, it is solvable at scale via recently developed techniques that reformulate acyclicity as a continuous constraint. We empirically demonstrate that our method, ProDAG, can deliver accurate inference and often outperforms existing state-of-the-art alternatives.
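As a rough, self-contained illustration of the projection idea (a sketch, not the paper's exact algorithm), the snippet below draws a matrix from a continuous base distribution and approximately projects it onto sparse weighted acyclic matrices by penalizing the NOTEARS-style acyclicity function h(W) = tr(e^{W∘W}) − d; the function names, step size, penalty weight, and iteration count are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import expm

def h(W):
    # NOTEARS-style acyclicity measure: zero iff the weighted graph W is acyclic
    return np.trace(expm(W * W)) - W.shape[0]

def grad_h(W):
    # Gradient of the acyclicity measure (all products are elementwise)
    return expm(W * W).T * 2.0 * W

def project_to_dag(Z, lam=0.1, rho=10.0, lr=1e-2, steps=2000):
    # Approximate projection of Z onto sparse acyclic matrices via
    # penalized proximal gradient descent (illustrative fixed penalty;
    # a fixed rho only drives h(W) near, not exactly to, zero)
    W = Z.copy()
    np.fill_diagonal(W, 0.0)
    for _ in range(steps):
        grad = (W - Z) + rho * h(W) * grad_h(W)
        W -= lr * grad
        # Soft-thresholding: the l1 proximal step puts mass on exact zeros
        W = np.sign(W) * np.maximum(np.abs(W) - lr * lam, 0.0)
        np.fill_diagonal(W, 0.0)
    return W

rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 5))  # a draw from the continuous base distribution
W = project_to_dag(Z)
print(h(W))  # near zero if the penalty has driven out the cycles
```

The soft-thresholding step is what yields probability mass on exact zeros once the continuous base distribution is pushed through the projection.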
Related papers
- Scalable Variational Causal Discovery Unconstrained by Acyclicity [6.954510776782872]
We propose a scalable Bayesian approach to learn the posterior distribution over causal graphs given observational data.
We introduce a novel differentiable DAG sampling method that can generate a valid acyclic causal graph.
We are able to model the posterior distribution over causal graphs using a simple variational distribution over a continuous domain.
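One standard way to guarantee a valid acyclic sample by construction, shown here as a generic sketch rather than this paper's specific sampler, is to draw a node ordering and mask the weights to be strictly triangular under it:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5

# Draw a node ordering and strictly lower-triangular weights; conjugating
# a triangular matrix by a permutation always yields an acyclic graph.
perm = rng.permutation(d)
P = np.eye(d)[perm]                       # permutation matrix
L = np.tril(rng.normal(size=(d, d)), -1)  # strictly lower triangular
W = P.T @ L @ P                           # weighted adjacency of a valid DAG
```

Relaxing the permutation (e.g., with Gumbel-Sinkhorn-style techniques) is one common route to making such samplers differentiable.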
arXiv Detail & Related papers (2024-07-06T07:56:23Z)
- Variational DAG Estimation via State Augmentation With Stochastic Permutations [16.57658783816741]
Estimating the structure of a Bayesian network from observational data is a statistically and computationally hard problem.
From a probabilistic inference perspective, the main challenges are (i) representing distributions over graphs that satisfy the DAG constraint and (ii) estimating a posterior over the underlying space.
We propose an approach that addresses these challenges by formulating a joint distribution on an augmented space of DAGs and permutations.
arXiv Detail & Related papers (2024-02-04T23:51:04Z)
- BayesDAG: Gradient-Based Posterior Inference for Causal Discovery [30.027520859604955]
We introduce a scalable causal discovery framework based on a combination of Markov Chain Monte Carlo and Variational Inference.
Our approach directly samples DAGs from the posterior without requiring any DAG regularization.
We derive a novel equivalence to permutation-based DAG learning, which opens up the possibility of using any relaxed estimator defined over permutations.
arXiv Detail & Related papers (2023-07-26T02:34:13Z)
- Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z)
- Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability [69.01076284478151]
In machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS).
This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime.
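For concreteness, here is a minimal sketch of the regime being analyzed, constant-stepsize full-batch GD on logistic loss over linearly separable data; the dimensions, step size, and iteration budget below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = np.sign(X @ rng.normal(size=d))  # linearly separable labels

# Full-batch GD on the mean logistic loss with a deliberately large,
# constant step size -- the setting where edge-of-stability behaviour arises.
w = np.zeros(d)
eta = 10.0
for _ in range(1000):
    margins = y * (X @ w)
    # sigmoid(-margins) written via tanh for numerical stability
    grad = -(X.T @ (y * 0.5 * (1.0 - np.tanh(margins / 2.0)))) / n
    w -= eta * grad
```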
arXiv Detail & Related papers (2023-05-19T16:24:47Z)
- Causal Graph Discovery from Self and Mutually Exciting Time Series [10.410454851418548]
We develop a non-asymptotic recovery guarantee and quantifiable uncertainty by solving a linear program.
We demonstrate the effectiveness of our approach in recovering highly interpretable causal DAGs over Sepsis Associated Derangements (SADs).
arXiv Detail & Related papers (2023-01-26T16:15:27Z)
- BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery [97.79015388276483]
A structural equation model (SEM) is an effective framework to reason over causal relationships represented via a directed acyclic graph (DAG).
Recent advances enabled effective maximum-likelihood point estimation of DAGs from observational data.
We propose BCD Nets, a variational framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.
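A linear-Gaussian SEM of the kind BCD Nets targets can be simulated in a few lines (a generic sketch with assumed dimensions and sparsity, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 4, 1000

# Weighted adjacency of a DAG (strictly upper triangular => acyclic)
W = np.triu(rng.normal(size=(d, d)) * (rng.random((d, d)) < 0.5), 1)

# Linear-Gaussian SEM: each variable is a linear function of its parents
# plus Gaussian noise, i.e. x = W^T x + eps, so X = eps (I - W)^{-1} in rows.
eps = rng.normal(size=(n, d))
X = eps @ np.linalg.inv(np.eye(d) - W)  # solves X = X @ W + eps
```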
arXiv Detail & Related papers (2021-12-06T03:35:21Z)
- Variational Causal Networks: Approximate Bayesian Inference over Causal Structures [132.74509389517203]
We introduce a parametric variational family modelled by an autoregressive distribution over the space of discrete DAGs.
In experiments, we demonstrate that the proposed variational posterior is able to provide a good approximation of the true posterior.
arXiv Detail & Related papers (2021-06-14T17:52:49Z)
- Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central to preventing overfitting in practice.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
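A minimal sketch of the algorithm under study, constant-stepsize SGD on least squares (the problem sizes and step size are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 500, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

# SGD with a constant (never decayed) step size, one sample per iteration
w = np.zeros(d)
eta = 0.01
for _ in range(5000):
    i = rng.integers(n)
    grad = (X[i] @ w - y[i]) * X[i]
    w -= eta * grad
```

Unlike ordinary least squares, the iterates never converge exactly; the interest is in the implicit regularization they exhibit.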
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.