Efficient Differentiable Discovery of Causal Order
- URL: http://arxiv.org/abs/2410.08787v1
- Date: Fri, 11 Oct 2024 13:11:55 GMT
- Title: Efficient Differentiable Discovery of Causal Order
- Authors: Mathieu Chevalley, Arash Mehrjou, Patrick Schwab
- Abstract summary: Intersort is a score-based method for discovering the causal order of variables.
We reformulate Intersort using differentiable sorting and ranking techniques.
Our work opens the door to efficiently incorporating regularization for causal order into the training of differentiable models.
- Score: 14.980926991441342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the algorithm Intersort, Chevalley et al. (2024) proposed a score-based method to discover the causal order of variables in a Directed Acyclic Graph (DAG) model, leveraging interventional data to outperform existing methods. However, as a score-based method over the permutahedron, Intersort is computationally expensive and non-differentiable, limiting its ability to be utilised in problems involving large-scale datasets, such as those in genomics and climate models, or to be integrated into end-to-end gradient-based learning frameworks. We address this limitation by reformulating Intersort using differentiable sorting and ranking techniques. Our approach enables scalable and differentiable optimization of causal orderings, allowing the continuous score function to be incorporated as a regularizer in downstream tasks. Empirical results demonstrate that causal discovery algorithms benefit significantly from regularizing on the causal order, underscoring the effectiveness of our method. Our work opens the door to efficiently incorporating regularization for causal order into the training of differentiable models and thereby addresses a long-standing limitation of purely associational supervised learning.
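For intuition, the sketch below shows one way a differentiable rank relaxation can turn an ordering score into a regularizer on a weighted adjacency matrix. It is a minimal sketch only: the sigmoid-based soft rank and the names `soft_rank` and `causal_order_regularizer` are illustrative assumptions, and the paper builds on dedicated differentiable sorting and ranking operators rather than this simplification.

```python
import torch

def soft_rank(scores, tau=0.1):
    # Sigmoid-relaxed rank: r_i is approximately the number of variables
    # whose score lies below score_i; smooth in `scores` for tau > 0.
    diff = scores.unsqueeze(-1) - scores.unsqueeze(-2)  # diff[i, j] = s_i - s_j
    return torch.sigmoid(diff / tau).sum(-1)

def causal_order_regularizer(adj, scores, tau=0.1):
    # Penalize the weight of each edge i -> j whenever the soft rank of i
    # exceeds that of j, i.e. whenever the edge contradicts the order.
    r = soft_rank(scores, tau)
    violation = torch.sigmoid((r.unsqueeze(-1) - r.unsqueeze(-2)) / tau)
    return (adj.abs() * violation).sum()
```

Because every operation here is smooth, the penalty can be added to any gradient-based structure-learning loss, which is the sense in which a continuous score over orderings can regularize downstream tasks.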
Related papers
- $ψ$DAG: Projected Stochastic Approximation Iteration for DAG Structure Learning [6.612096312467342]
Learning the structure of Directed Acyclic Graphs (DAGs) presents a significant challenge due to the vast search space of possible graphs, which scales exponentially with the number of nodes.
Recent advancements have redefined this problem as a continuous optimization task by incorporating differentiable acyclicity constraints.
We present a novel framework for learning DAGs, employing a Stochastic Approximation approach integrated with Stochastic Gradient Descent (SGD)-based optimization techniques.
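For reference, the differentiable acyclicity constraints mentioned above are commonly instantiated as the NOTEARS trace-exponential penalty (Zheng et al., 2018); the following is a minimal PyTorch sketch with an illustrative function name, not code from this paper:

```python
import torch

def notears_acyclicity(adj):
    # h(A) = tr(exp(A ∘ A)) - d, which equals zero exactly when the
    # weighted adjacency matrix A describes an acyclic graph.
    d = adj.shape[0]
    return torch.matrix_exp(adj * adj).trace() - d
```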
arXiv Detail & Related papers (2024-10-31T12:13:11Z)
- Causal Temporal Regime Structure Learning [49.77103348208835]
We introduce a new optimization-based method that concurrently learns the Directed Acyclic Graph (DAG) for each regime.
We conduct extensive experiments and show that our method consistently outperforms causal discovery models across various settings.
arXiv Detail & Related papers (2023-11-02T17:26:49Z)
- Provably Efficient Learning in Partially Observable Contextual Bandit [4.910658441596583]
We show how causal bounds can be applied to improving classical bandit algorithms.
This research has the potential to enhance the performance of contextual bandit agents in real-world applications.
arXiv Detail & Related papers (2023-08-07T13:24:50Z)
- Active Learning for Optimal Intervention Design in Causal Models [11.294389953686945]
We develop a causal active learning strategy to identify interventions that are optimal, as measured by the discrepancy between the post-interventional mean of the distribution and a desired target mean.
We apply our approach to both synthetic data and single-cell transcriptomic data from Perturb-CITE-seq experiments to identify optimal perturbations that induce a specific cell state transition.
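As a hedged illustration of this objective, the sketch below assumes a linear SEM with hard interventions (the model class, intervention form, and function names are simplifying assumptions for illustration, not the paper's exact setup) and computes the squared discrepancy between the post-interventional mean and a target mean.

```python
import numpy as np

def post_interventional_mean(B, b, do_idx, do_val):
    # Linear SEM x = B x + b + noise (zero-mean noise). A hard intervention
    # do(x_k = v) replaces the k-th structural equation by the constant v,
    # so the mean m solves (I - B') m = b' with row k of B zeroed out.
    B, b = B.copy(), b.copy()
    B[do_idx, :] = 0.0
    b[do_idx] = do_val
    return np.linalg.solve(np.eye(B.shape[0]) - B, b)

def intervention_discrepancy(B, b, do_idx, do_val, target_mean):
    # Squared distance between the post-interventional mean and the target.
    m = post_interventional_mean(B, b, do_idx, do_val)
    return float(np.sum((m - target_mean) ** 2))
```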
arXiv Detail & Related papers (2022-09-10T20:40:30Z)
- An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification [97.28167655721766]
We propose a novel accelerated doubly stochastic gradient descent (ADSGD) method for sparsity regularized loss minimization problems.
We first prove that ADSGD can achieve a linear convergence rate and lower overall computational complexity.
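As background on the problem class (not the ADSGD algorithm itself, which is more involved), sparsity-regularized minimization is typically handled with a proximal step; the soft-thresholding operator below is the proximal operator of the l1 penalty:

```python
import numpy as np

def prox_l1(x, lam):
    # Soft-thresholding: proximal operator of lam * ||x||_1, which shrinks
    # each coordinate toward zero and sets small coordinates exactly to zero.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)
```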
arXiv Detail & Related papers (2022-08-11T22:27:22Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Efficient Neural Causal Discovery without Acyclicity Constraints [30.08586535981525]
We present ENCO, an efficient structure learning method for directed, acyclic causal graphs.
In experiments, we show that ENCO can efficiently recover graphs with hundreds of nodes, an order of magnitude larger than what was previously possible.
arXiv Detail & Related papers (2021-07-22T07:01:41Z)
- Efficient Learning of Generative Models via Finite-Difference Score Matching [111.55998083406134]
We present a generic strategy to efficiently approximate any-order directional derivative with finite difference.
Our approximation involves only function evaluations, which can be executed in parallel, and requires no gradient computations.
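As a minimal sketch of the idea (the helper names are hypothetical; the paper's estimators are more general), first- and second-order directional derivatives can be approximated from a few function evaluations that are independent of one another:

```python
import numpy as np

def directional_derivative(f, x, v, eps=1e-4):
    # Central difference for <grad f(x), v>: two function evaluations,
    # no gradient computations, and both calls can run in parallel.
    return (f(x + eps * v) - f(x - eps * v)) / (2.0 * eps)

def second_directional_derivative(f, x, v, eps=1e-4):
    # Second-order analogue, approximating v^T (Hessian of f at x) v.
    return (f(x + eps * v) - 2.0 * f(x) + f(x - eps * v)) / eps ** 2

# Example: for f(x) = ||x||^2 the directional derivative along v is 2<x, v>.
x, v = np.array([1.0, 2.0]), np.array([0.0, 1.0])
print(directional_derivative(lambda y: float(np.dot(y, y)), x, v))  # ~ 4.0
```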
arXiv Detail & Related papers (2020-07-07T10:05:01Z)
- Differentiable Causal Discovery from Interventional Data [141.41931444927184]
We propose a theoretically-grounded method based on neural networks that can leverage interventional data.
We show that our approach compares favorably to the state of the art in a variety of settings.
arXiv Detail & Related papers (2020-07-03T15:19:17Z)
- Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers.
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
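A minimal Monte Carlo sketch of the perturbation idea, applied to the argmax operator (the Gaussian noise choice and the function name are illustrative assumptions):

```python
import numpy as np

def perturbed_argmax(theta, sigma=1.0, n_samples=1000, seed=0):
    # Average one-hot argmax solutions over Gaussian perturbations of the
    # input. The expectation is smooth in theta and never locally constant,
    # unlike the piecewise-constant argmax itself.
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples, theta.shape[0]))
    idx = np.argmax(theta + sigma * z, axis=1)
    return np.eye(theta.shape[0])[idx].mean(axis=0)
```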
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
- Learning High Order Feature Interactions with Fine Control Kernels [12.5433010409486]
We provide a methodology for learning sparse statistical models that use all possible multiplicative interactions as features.
We also introduce an algorithmic paradigm, the Fine Control Kernel framework, so named because it is based on Fenchel Duality.
arXiv Detail & Related papers (2020-02-09T06:29:15Z)