Related papers: Fast Scalable and Accurate Discovery of DAGs Using the Best Order Score Search and Grow-Shrink Trees

Fast Scalable and Accurate Discovery of DAGs Using the Best Order Score Search and Grow-Shrink Trees

URL: http://arxiv.org/abs/2310.17679v1
Date: Thu, 26 Oct 2023 10:03:12 GMT
Title: Fast Scalable and Accurate Discovery of DAGs Using the Best Order Score Search and Grow-Shrink Trees
Authors: Bryan Andrews, Joseph Ramsey, Ruben Sanchez-Romero, Jazmin Camchong, Erich Kummerfeld
Abstract summary: Best order score search (BOSS) and grow-shrink trees ( GSTs) for learning directed acyclic graphs (DAGs) We introduce the best order score search (BOSS) and grow-shrink trees ( GSTs) for learning directed acyclic graphs (DAGs)
Score: 2.667401221288548
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning graphical conditional independence structures is an important machine learning problem and a cornerstone of causal discovery. However, the accuracy and execution time of learning algorithms generally struggle to scale to problems with hundreds of highly connected variables -- for instance, recovering brain networks from fMRI data. We introduce the best order score search (BOSS) and grow-shrink trees (GSTs) for learning directed acyclic graphs (DAGs) in this paradigm. BOSS greedily searches over permutations of variables, using GSTs to construct and score DAGs from permutations. GSTs efficiently cache scores to eliminate redundant calculations. BOSS achieves state-of-the-art performance in accuracy and execution time, comparing favorably to a variety of combinatorial and gradient-based learning algorithms under a broad range of conditions. To demonstrate its practicality, we apply BOSS to two sets of resting-state fMRI data: simulated data with pseudo-empirical noise distributions derived from randomized empirical fMRI cortical signals and clinical data from 3T fMRI scans processed into cortical parcels. BOSS is available for use within the TETRAD project which includes Python and R wrappers.

Related papers

Score-matching-based Structure Learning for Temporal Data on Networks [17.166362605356074]
Causal discovery is a crucial initial step in establishing causality from empirical data and background knowledge. Current score-matching-based algorithms are primarily designed to analyze independent and identically distributed (i.i.d.) data. We have developed a new parent-finding subroutine for leaf nodes in DAGs, significantly accelerating the most time-consuming part of the process: the pruning step.
arXiv Detail & Related papers (2024-12-10T12:36:35Z)
SpaRG: Sparsely Reconstructed Graphs for Generalizable fMRI Analysis [8.489318619991534]
Deep learning can help uncover patterns in resting-state functional Magnetic Resonance Imaging (rsfMRI) associated with psychiatric disorders and personal traits. Yet the problem of interpreting deep learning findings is rarely more evident than in fMRI analyses. We propose a simple approach to mitigate these challenges grounded on sparsification and self-supervision.
arXiv Detail & Related papers (2024-09-24T18:35:57Z)
Large-Scale Targeted Cause Discovery with Data-Driven Learning [66.86881771339145]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations. By employing a local-inference strategy, our approach scales with linear complexity in the number of variables, efficiently scaling up to thousands of variables. Empirical results demonstrate superior performance in identifying causal relationships within large-scale gene regulatory networks.
arXiv Detail & Related papers (2024-08-29T02:21:11Z)
Coordinated Multi-Neighborhood Learning on a Directed Acyclic Graph [6.727984016678534]
Learning the structure of causal directed acyclic graphs (DAGs) is useful in many areas of machine learning and artificial intelligence. It is challenging to obtain good empirical and theoretical results without strong and often restrictive assumptions. This paper develops a new constraint-based method for estimating the local structure around multiple user-specified target nodes.
arXiv Detail & Related papers (2024-05-24T08:49:43Z)
Combating Bilateral Edge Noise for Robust Link Prediction [56.43882298843564]
We propose an information-theory-guided principle, Robust Graph Information Bottleneck (RGIB), to extract reliable supervision signals and avoid representation collapse. Two instantiations, RGIB-SSL and RGIB-REP, are explored to leverage the merits of different methodologies. Experiments on six datasets and three GNNs with diverse noisy scenarios verify the effectiveness of our RGIB instantiations.
arXiv Detail & Related papers (2023-11-02T12:47:49Z)
CUTS: Neural Causal Discovery from Irregular Time-Series Data [27.06531262632836]
Causal discovery from time-series data has been a central task in machine learning. We present CUTS, a neural Granger causal discovery algorithm to jointly impute unobserved data points and build causal graphs. Our approach constitutes a promising step towards applying causal discovery to real applications with non-ideal observations.
arXiv Detail & Related papers (2023-02-15T04:16:34Z)
Personalized Decentralized Multi-Task Learning Over Dynamic Communication Graphs [59.96266198512243]
We propose a decentralized and federated learning algorithm for tasks that are positively and negatively correlated. Our algorithm uses gradients to calculate the correlations among tasks automatically, and dynamically adjusts the communication graph to connect mutually beneficial tasks and isolate those that may negatively impact each other. We conduct experiments on a synthetic Gaussian dataset and a large-scale celebrity attributes (CelebA) dataset.
arXiv Detail & Related papers (2022-12-21T18:58:24Z)
Decision Forest Based EMG Signal Classification with Low Volume Dataset Augmented with Random Variance Gaussian Noise [51.76329821186873]
We produce a model that can classify six different hand gestures with a limited number of samples that generalizes well to a wider audience. We appeal to a set of more elementary methods such as the use of random bounds on a signal, but desire to show the power these methods can carry in an online setting.
arXiv Detail & Related papers (2022-06-29T23:22:18Z)
BCDAG: An R package for Bayesian structure and Causal learning of Gaussian DAGs [77.34726150561087]
We introduce the R package for causal discovery and causal effect estimation from observational data. Our implementation scales efficiently with the number of observations and, whenever the DAGs are sufficiently sparse, the number of variables in the dataset. We then illustrate the main functions and algorithms on both real and simulated datasets.
arXiv Detail & Related papers (2022-01-28T09:30:32Z)
Causal Discovery from Sparse Time-Series Data Using Echo State Network [0.0]
Causal discovery between collections of time-series data can help diagnose causes of symptoms and hopefully prevent faults before they occur. We propose a new system comprised of two parts, the first part fills missing data with a Gaussian Process Regression, and the second part leverages an Echo State Network. We report on their corresponding Matthews Correlation Coefficient(MCC) and Receiver Operating Characteristic curves (ROC) and show that the proposed system outperforms existing algorithms.
arXiv Detail & Related papers (2022-01-09T05:55:47Z)
Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GANs) and GAN-based techniques have attracted attention as unsupervised machine learning methods. We name our proposed method as Con Conval Generative Adversarial Imputation Nets (Conv-GAIN)
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
CoarSAS2hvec: Heterogeneous Information Network Embedding with Balanced Network Sampling [0.0]
Heterogeneous information network (HIN) embedding aims to find the representations of nodes that preserve the proximity between entities of different nature. A family of approaches that are wildly adopted applies random walk to generate a sequence of heterogeneous context. Due to the multipartite graph structure of HIN, hub nodes tend to be over-represented in the sampled sequence, giving rise to imbalanced samples of the network.
arXiv Detail & Related papers (2021-10-12T08:34:39Z)
Efficient and Stable Graph Scattering Transforms via Pruning [86.76336979318681]
Graph scattering transforms ( GSTs) offer training-free deep GCN models that extract features from graph data. The price paid by GSTs is exponential complexity in space and time that increases with the number of layers. The present work addresses the complexity limitation of GSTs by introducing an efficient so-termed pruned (p) GST approach.
arXiv Detail & Related papers (2020-01-27T16:05:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.