DoWhy: An End-to-End Library for Causal Inference
- URL: http://arxiv.org/abs/2011.04216v1
- Date: Mon, 9 Nov 2020 06:22:11 GMT
- Title: DoWhy: An End-to-End Library for Causal Inference
- Authors: Amit Sharma, Emre Kiciman
- Abstract summary: We describe DoWhy, an open-source Python library that is built with causal assumptions as its first-class citizens.
DoWhy presents an API for the four steps common to any causal analysis.
In particular, DoWhy implements a number of checks including placebo tests, bootstrap tests, and tests for unobserved confounding.
- Score: 16.764873959182765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In addition to efficient statistical estimators of a treatment's effect,
successful application of causal inference requires specifying assumptions
about the mechanisms underlying observed data and testing whether they are
valid, and to what extent. However, most libraries for causal inference focus
only on the task of providing powerful statistical estimators. We describe
DoWhy, an open-source Python library that is built with causal assumptions as
its first-class citizens, based on the formal framework of causal graphs to
specify and test causal assumptions. DoWhy presents an API for the four steps
common to any causal analysis---1) modeling the data using a causal graph and
structural assumptions, 2) identifying whether the desired effect is estimable
under the causal model, 3) estimating the effect using statistical estimators,
and finally 4) refuting the obtained estimate through robustness checks and
sensitivity analyses. In particular, DoWhy implements a number of robustness
checks including placebo tests, bootstrap tests, and tests for unobserved
confounding. DoWhy is an extensible library that supports interoperability with
other implementations, such as EconML and CausalML for the estimation step.
The library is available at https://github.com/microsoft/dowhy
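To make the four-step workflow concrete, here is a minimal sketch using DoWhy's `CausalModel` API on synthetic data. The variable names, the toy data-generating process, and the specific estimator and refuter choices are illustrative assumptions, not the paper's own example, and the exact graph syntax and method names may vary across DoWhy versions.

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Toy data (hypothetical): a single confounder w affects both treatment and outcome.
rng = np.random.default_rng(0)
n = 1000
w = rng.normal(size=n)
treatment = (w + rng.normal(size=n) > 0).astype(int)
outcome = 2.0 * treatment + w + rng.normal(size=n)
df = pd.DataFrame({"treatment": treatment, "outcome": outcome, "w": w})

# 1) Model: encode structural assumptions as a causal graph.
#    (A DOT string is shown here; DoWhy also accepts other graph formats,
#     and the accepted syntax differs between library versions.)
model = CausalModel(
    data=df,
    treatment="treatment",
    outcome="outcome",
    graph="digraph {w -> treatment; w -> outcome; treatment -> outcome;}",
)

# 2) Identify: check whether the target effect is estimable under the model.
identified_estimand = model.identify_effect()

# 3) Estimate: apply a statistical estimator to the identified estimand.
estimate = model.estimate_effect(
    identified_estimand,
    method_name="backdoor.propensity_score_matching",
)

# 4) Refute: stress-test the estimate, e.g. with a placebo-treatment check.
refutation = model.refute_estimate(
    identified_estimand,
    estimate,
    method_name="placebo_treatment_refuter",
)

print(estimate.value)
print(refutation)
```

In this sketch, step 2 should identify a backdoor adjustment through w, and the placebo refuter should report an effect near zero once the treatment is replaced with an independent random variable, which is the behavior a valid estimate is expected to survive.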
Related papers
- DeCaFlow: A Deconfounding Causal Generative Model [58.411886466157185]
We introduce DeCaFlow, a deconfounding causal generative model.
We extend previous results on causal estimation under hidden confounding to show that a single instance of DeCaFlow provides correct estimates for all causal queries identifiable with do-calculus.
Our empirical results on diverse settings show that DeCaFlow outperforms existing approaches, while demonstrating its out-of-the-box applicability to any given causal graph.
arXiv Detail & Related papers (2025-03-19T11:14:16Z) - Black Box Causal Inference: Effect Estimation via Meta Prediction [56.277798874118425]
We frame causal inference as a dataset-level prediction problem, offloading algorithm design to the learning process.
We introduce a method called black box causal inference (BBCI), which builds estimators in a black-box manner by learning to predict causal effects from sampled dataset-effect pairs.
We demonstrate accurate estimation of average treatment effects (ATEs) and conditional average treatment effects (CATEs) with BBCI across several causal inference problems.
arXiv Detail & Related papers (2025-03-07T23:43:19Z) - Counterfactual Causal Inference in Natural Language with Large Language Models [9.153187514369849]
We propose an end-to-end causal structure discovery and causal inference method from natural language.
We first use an LLM to extract the instantiated causal variables from text data and build a causal graph.
We then conduct counterfactual inference on the estimated graph.
arXiv Detail & Related papers (2024-10-08T21:53:07Z) - CAnDOIT: Causal Discovery with Observational and Interventional Data from Time-Series [4.008958683836471]
CAnDOIT is a causal discovery method to reconstruct causal models using both observational and interventional data.
The use of interventional data in the causal analysis is crucial for real-world applications, such as robotics.
A Python implementation of CAnDOIT has also been developed and is publicly available on GitHub.
arXiv Detail & Related papers (2024-10-03T13:57:08Z) - Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data [89.2410799619405]
We introduce the Quantitative Reasoning with Data benchmark to evaluate Large Language Models' capability in statistical and causal reasoning with real-world data.
The benchmark comprises a dataset of 411 questions accompanied by data sheets from textbooks, online learning materials, and academic papers.
To compare models' quantitative reasoning abilities on data and text, we enrich the benchmark with an auxiliary set of 290 text-only questions, namely QRText.
arXiv Detail & Related papers (2024-02-27T16:15:03Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data [76.85310770921876]
We introduce the Salesforce CausalAI Library, an open-source library for causal analysis using observational data.
The goal of this library is to provide a fast and flexible solution for a variety of problems in the domain of causality.
arXiv Detail & Related papers (2023-01-25T22:42:48Z) - On the Identifiability and Estimation of Causal Location-Scale Noise Models [122.65417012597754]
We study the class of location-scale or heteroscedastic noise models (LSNMs).
We show the causal direction is identifiable up to some pathological cases.
We propose two estimators for LSNMs: an estimator based on (non-linear) feature maps, and one based on neural networks.
arXiv Detail & Related papers (2022-10-13T17:18:59Z) - Active Bayesian Causal Inference [72.70593653185078]
We propose Active Bayesian Causal Inference (ABCI), a fully-Bayesian active learning framework for integrated causal discovery and reasoning.
ABCI jointly infers a posterior over causal models and queries of interest.
We show that our approach is more data-efficient than several baselines that only focus on learning the full causal graph.
arXiv Detail & Related papers (2022-06-04T22:38:57Z) - DoWhy: Addressing Challenges in Expressing and Validating Causal Assumptions [40.70930937915354]
DoWhy is a framework that allows explicit declaration of assumptions through a causal graph.
It provides multiple validation tests to check a subset of these assumptions.
Our experience with DoWhy highlights a number of open questions for future research.
arXiv Detail & Related papers (2021-08-27T11:07:30Z) - Algorithmic Causal Effect Identification with causaleffect [0.0]
This report reviews and implements in Python algorithms to compute conditional and non-conditional causal queries from observational data.
We first present some basic background knowledge on probability and graph theory, before introducing important results on causal theory.
We then thoroughly study the identification algorithms presented by Shpitser and Pearl in 2006, explaining our Python implementation alongside.
arXiv Detail & Related papers (2021-07-09T19:00:33Z) - A Causal Direction Test for Heterogeneous Populations [10.653162005300608]
Most causal models assume a single homogeneous population, an assumption that may fail to hold in many applications.
We show that when the homogeneity assumption is violated, causal models developed based on such assumption can fail to identify the correct causal direction.
We propose an adjustment to a commonly used causal direction test statistic by using a $k$-means type clustering algorithm.
arXiv Detail & Related papers (2020-06-08T18:59:14Z) - Showing Your Work Doesn't Always Work [73.63200097493576]
"Show Your Work: Improved Reporting of Experimental Results" advocates for reporting the expected validation effectiveness of the best-tuned model.
We analytically show that their estimator is biased and uses error-prone assumptions.
We derive an unbiased alternative and bolster our claims with empirical evidence from statistical simulation.
arXiv Detail & Related papers (2020-04-28T17:59:01Z)