Integer Programming for Learning Directed Acyclic Graphs from Non-identifiable Gaussian Models
- URL: http://arxiv.org/abs/2404.12592v2
- Date: Mon, 5 Aug 2024 03:52:54 GMT
- Title: Integer Programming for Learning Directed Acyclic Graphs from Non-identifiable Gaussian Models
- Authors: Tong Xu, Armeen Taeb, Simge Küçükyavuz, Ali Shojaie,
- Abstract summary: We study the problem of learning directed acyclic graphs from continuous observational data.
We develop a mixed-integer programming framework for learning medium-sized problems.
Our method outperforms state-of-the-art algorithms and is robust to noise heteroscedasticity.
- Score: 6.54203362045253
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of learning directed acyclic graphs from continuous observational data, generated according to a linear Gaussian structural equation model. State-of-the-art structure learning methods for this setting have at least one of the following shortcomings: i) they cannot provide optimality guarantees and can suffer from learning sub-optimal models; ii) they rely on the stringent assumption that the noise is homoscedastic, and hence the underlying model is fully identifiable. We overcome these shortcomings and develop a computationally efficient mixed-integer programming framework for learning medium-sized problems that accounts for arbitrary heteroscedastic noise. We present an early stopping criterion under which we can terminate the branch-and-bound procedure to achieve an asymptotically optimal solution and establish the consistency of this approximate solution. In addition, we show via numerical experiments that our method outperforms state-of-the-art algorithms and is robust to noise heteroscedasticity, whereas the performance of some competing methods deteriorates under strong violations of the identifiability assumption. The software implementation of our method is available as the Python package \emph{micodag}.
Related papers
- Towards Learning Stochastic Population Models by Gradient Descent [0.0]
We show that simultaneous estimation of parameters and structure poses major challenges for optimization procedures.
We demonstrate accurate estimation of models but find that enforcing the inference of parsimonious, interpretable models drastically increases the difficulty.
arXiv Detail & Related papers (2024-04-10T14:38:58Z) - Geometry-Aware Approaches for Balancing Performance and Theoretical
Guarantees in Linear Bandits [6.907555940790131]
Thompson sampling and Greedy demonstrate promising empirical performance, yet this contrasts with their pessimistic theoretical regret bounds.
We propose a new data-driven technique that tracks the geometric properties of the uncertainty ellipsoid.
We identify and course-correct" problem instances in which the base algorithms perform poorly.
arXiv Detail & Related papers (2023-06-26T17:38:45Z) - Robust Model Selection of Gaussian Graphical Models [16.933125281564163]
Noise-corrupted samples present significant challenges in graphical model selection.
We propose an algorithm which provably recovers the underlying graph up to the identified ambiguity.
This information is useful in a range of real-world problems, including power grids, social networks, protein-protein interactions, and neural structures.
arXiv Detail & Related papers (2022-11-10T16:50:50Z) - Deep Active Learning with Noise Stability [24.54974925491753]
Uncertainty estimation for unlabeled data is crucial to active learning.
We propose a novel algorithm that leverages noise stability to estimate data uncertainty.
Our method is generally applicable in various tasks, including computer vision, natural language processing, and structural data analysis.
arXiv Detail & Related papers (2022-05-26T13:21:01Z) - Score matching enables causal discovery of nonlinear additive noise
models [63.93669924730725]
We show how to design a new generation of scalable causal discovery methods.
We propose a new efficient method for approximating the score's Jacobian, enabling to recover the causal graph.
arXiv Detail & Related papers (2022-03-08T21:34:46Z) - A Priori Denoising Strategies for Sparse Identification of Nonlinear
Dynamical Systems: A Comparative Study [68.8204255655161]
We investigate and compare the performance of several local and global smoothing techniques to a priori denoise the state measurements.
We show that, in general, global methods, which use the entire measurement data set, outperform local methods, which employ a neighboring data subset around a local point.
arXiv Detail & Related papers (2022-01-29T23:31:25Z) - Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective of graph contrastive learning methods showing random augmentations leads to encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Outlier-Robust Learning of Ising Models Under Dobrushin's Condition [57.89518300699042]
We study the problem of learning Ising models satisfying Dobrushin's condition in the outlier-robust setting where a constant fraction of the samples are adversarially corrupted.
Our main result is to provide the first computationally efficient robust learning algorithm for this problem with near-optimal error guarantees.
arXiv Detail & Related papers (2021-02-03T18:00:57Z) - Differentiable Causal Discovery from Interventional Data [141.41931444927184]
We propose a theoretically-grounded method based on neural networks that can leverage interventional data.
We show that our approach compares favorably to the state of the art in a variety of settings.
arXiv Detail & Related papers (2020-07-03T15:19:17Z) - Consistent Second-Order Conic Integer Programming for Learning Bayesian
Networks [2.7473982588529653]
We study the problem of learning the sparse DAG structure of a BN from continuous observational data.
The optimal solution to this mathematical program is known to have desirable statistical properties under certain conditions.
We propose a concrete early stopping criterion to terminate the branch-and-bound process in order to obtain a near-optimal solution.
arXiv Detail & Related papers (2020-05-29T00:13:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.