Related papers: On the Role of Sparsity and DAG Constraints for Learning Linear DAGs

On the Role of Sparsity and DAG Constraints for Learning Linear DAGs

URL: http://arxiv.org/abs/2006.10201v3
Date: Fri, 8 Jan 2021 20:05:34 GMT
Title: On the Role of Sparsity and DAG Constraints for Learning Linear DAGs
Authors: Ignavier Ng, AmirEmad Ghassami, Kun Zhang
Abstract summary: We study the role of sparsity and DAG constraints for learning DAG models in the linear Gaussian and non-Gaussian cases. We propose a likelihood-based score function, and show that one only has to apply soft sparsity and DAG constraints to learn a DAG equivalent to the ground truth DAG.
Score: 16.97675762810828
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning graphical structures based on Directed Acyclic Graphs (DAGs) is a challenging problem, partly owing to the large search space of possible graphs. A recent line of work formulates the structure learning problem as a continuous constrained optimization task using the least squares objective and an algebraic characterization of DAGs. However, the formulation requires a hard DAG constraint and may lead to optimization difficulties. In this paper, we study the asymptotic role of the sparsity and DAG constraints for learning DAG models in the linear Gaussian and non-Gaussian cases, and investigate their usefulness in the finite sample regime. Based on the theoretical results, we formulate a likelihood-based score function, and show that one only has to apply soft sparsity and DAG constraints to learn a DAG equivalent to the ground truth DAG. This leads to an unconstrained optimization problem that is much easier to solve. Using gradient-based optimization and GPU acceleration, our procedure can easily handle thousands of nodes while retaining a high accuracy. Extensive experiments validate the effectiveness of our proposed method and show that the DAG-penalized likelihood objective is indeed favorable over the least squares one with the hard DAG constraint.

Related papers

Learning Causal Graphs at Scale: A Foundation Model Approach [28.966180222166766]
We propose Attention-DAG (ADAG), a novel attention-mechanism-based architecture for learning multiple linear Structural Equation Models (SEMs)<n>ADAG learns the mapping from observed data to both graph structure and parameters via a nonlinear attention-based kernel.<n>We evaluate our proposed approach on benchmark synthetic datasets and find that ADAG achieves substantial improvements in both DAG learning accuracy and zero-shot inference efficiency.
arXiv Detail & Related papers (2025-06-23T04:41:02Z)
Analytic DAG Constraints for Differentiable DAG Learning [83.93320658222717]
We develop a theory to establish a connection between analytic functions and DAG constraints. We show that analytic functions from the set $f(x) = c_0 + sum_i=1inftyc_ixi | forall i > 0, c_i > 0; r = lim_irightarrow inftyc_i/c_i+1 > 0$ can be employed to formulate effective DAG constraints.
arXiv Detail & Related papers (2025-03-24T23:51:35Z)
$ψ$DAG: Projected Stochastic Approximation Iteration for DAG Structure Learning [6.612096312467342]
Learning the structure of Directed A Graphs (DAGs) presents a significant challenge due to the vast search space of possible graphs, which scales with the number of nodes. Recent advancements have redefined this problem as a continuous optimization task by incorporating differentiable a exponentiallyity constraints. We present a novel framework for learning DAGs, employing a Approximation approach integrated with Gradient Descent (SGD)-based optimization techniques.
arXiv Detail & Related papers (2024-10-31T12:13:11Z)
Non-negative Weighted DAG Structure Learning [12.139158398361868]
We address the problem of learning the true DAGs from nodal observations. We propose a DAG recovery algorithm based on the method that is guaranteed to return ar.
arXiv Detail & Related papers (2024-09-12T09:41:29Z)
On the Generalization Capability of Temporal Graph Learning Algorithms: Theoretical Insights and a Simpler Method [59.52204415829695]
Temporal Graph Learning (TGL) has become a prevalent technique across diverse real-world applications. This paper investigates the generalization ability of different TGL algorithms. We propose a simplified TGL network, which enjoys a small generalization error, improved overall performance, and lower model complexity.
arXiv Detail & Related papers (2024-02-26T08:22:22Z)
Discovering Dynamic Causal Space for DAG Structure Learning [64.763763417533]
We propose a dynamic causal space for DAG structure learning, coined CASPER. It integrates the graph structure into the score function as a new measure in the causal space to faithfully reflect the causal distance between estimated and ground truth DAG.
arXiv Detail & Related papers (2023-06-05T12:20:40Z)
Truncated Matrix Power Iteration for Differentiable DAG Learning [47.69479930501961]
We propose a novel DAG learning method with efficient truncated matrix power to approximate series-based DAG constraints. Empirically, our DAG learning method outperforms the previous state-of-the-arts in various settings.
arXiv Detail & Related papers (2022-08-30T23:56:12Z)
BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery [97.79015388276483]
A structural equation model (SEM) is an effective framework to reason over causal relationships represented via a directed acyclic graph (DAG) Recent advances enabled effective maximum-likelihood point estimation of DAGs from observational data. We propose BCD Nets, a variational framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.
arXiv Detail & Related papers (2021-12-06T03:35:21Z)
Multi-task Learning of Order-Consistent Causal Graphs [59.9575145128345]
We consider the problem of discovering $K related Gaussian acyclic graphs (DAGs) Under multi-task learning setting, we propose a $l_1/l$-regularized maximum likelihood estimator (MLE) for learning $K$ linear structural equation models. We theoretically show that the joint estimator, by leveraging data across related tasks, can achieve a better sample complexity for recovering the causal order.
arXiv Detail & Related papers (2021-11-03T22:10:18Z)
Learning Large DAGs by Combining Continuous Optimization and Feedback Arc Set Heuristics [0.3553493344868413]
We propose two scalable NPs for learning DAGs in a linear structural equation case. Our methods learn the DAG by alternating between unconstrained gradient descent-based step to optimize an objective function. Thanks to this decoupling, our methods scale up beyond thousands of variables.
arXiv Detail & Related papers (2021-07-01T16:10:21Z)
DAGs with No Curl: An Efficient DAG Structure Learning Approach [62.885572432958504]
Recently directed acyclic graph (DAG) structure learning is formulated as a constrained continuous optimization problem with continuous acyclicity constraints. We propose a novel learning framework to model and learn the weighted adjacency matrices in the DAG space directly. We show that our method provides comparable accuracy but better efficiency than baseline DAG structure learning methods on both linear and generalized structural equation models.
arXiv Detail & Related papers (2021-06-14T07:11:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.