Causal Inference from Small High-dimensional Datasets
- URL: http://arxiv.org/abs/2205.09281v1
- Date: Thu, 19 May 2022 02:04:01 GMT
- Title: Causal Inference from Small High-dimensional Datasets
- Authors: Raquel Aoki and Martin Ester
- Abstract summary: Causal-Batle is a methodology to estimate treatment effects in small high-dimensional datasets.
We adopt an approach that brings transfer learning techniques into causal inference.
- Score: 7.1894784995284144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many methods have been proposed to estimate treatment effects with
observational data. Often, the choice of the method considers the application's
characteristics, such as type of treatment and outcome, confounding effect, and
the complexity of the data. These methods implicitly assume that the sample
size is large enough to train such models, especially the neural network-based
estimators. What if this is not the case? In this work, we propose
Causal-Batle, a methodology to estimate treatment effects in small
high-dimensional datasets in the presence of another high-dimensional dataset
in the same feature space. We adopt an approach that brings transfer learning
techniques into causal inference. Our experiments show that such an approach
helps to bring stability to neural network-based methods and improve the
treatment effect estimates in small high-dimensional datasets.
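As a rough illustration of the general idea in the abstract (reusing a larger dataset in the same feature space to help a small one), the sketch below learns a low-dimensional representation from a large source dataset and then estimates an average treatment effect on a small target dataset. It is not the Causal-Batle method: PCA and ridge regression stand in for the paper's neural networks and transfer learning, the data are synthetic, and all function names (`fit_representation`, `estimate_ate`, `simulate`) are hypothetical. It also assumes a linear outcome model with no unobserved confounding.

```python
import numpy as np


def fit_representation(X_source, k):
    """Learn a k-dimensional linear map (PCA via SVD) from the large source data."""
    mean = X_source.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_source - mean, full_matrices=False)
    return mean, Vt[:k].T  # components has shape (d, k)


def estimate_ate(X, t, y, mean, components, ridge=1e-3):
    """Ridge-regress y on [1, t, Z]; the coefficient on t is the ATE estimate
    under an assumed linear outcome model with no unobserved confounding."""
    Z = (X - mean) @ components          # transferred low-dimensional features
    design = np.column_stack([np.ones(len(y)), t, Z])
    penalty = ridge * np.eye(design.shape[1])
    penalty[0, 0] = 0.0                  # leave the intercept unpenalized
    beta = np.linalg.solve(design.T @ design + penalty, design.T @ y)
    return float(beta[1])


def simulate(rng, n, W, true_ate=2.0):
    """Hypothetical data generator: latent factors u drive X, t, and y."""
    k, d = W.shape
    u = rng.normal(size=(n, k))
    X = u @ W + 0.1 * rng.normal(size=(n, d))
    t = (u[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(float)  # confounded treatment
    y = true_ate * t + u.sum(axis=1) + 0.1 * rng.normal(size=n)
    return X, t, y


rng = np.random.default_rng(0)
W = rng.normal(size=(5, 200))                     # 5 latent factors, 200 features
X_source, _, _ = simulate(rng, 2000, W)           # large dataset, covariates only
X_small, t_small, y_small = simulate(rng, 60, W)  # small target dataset

mean, components = fit_representation(X_source, k=5)
ate_hat = estimate_ate(X_small, t_small, y_small, mean, components)
naive = y_small[t_small == 1].mean() - y_small[t_small == 0].mean()
print(f"adjusted ATE estimate: {ate_hat:.2f}, naive difference in means: {naive:.2f}")
```

In this toy setup the target dataset has 60 rows and 200 features, so regressing directly on the raw features would be underdetermined. Borrowing the representation from the larger dataset reduces the problem to a handful of coefficients, which is the kind of stabilizing effect the abstract attributes to transfer learning.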
Related papers
- C-HDNet: A Fast Hyperdimensional Computing Based Method for Causal Effect Estimation from Networked Observational Data [2.048226951354646]
We consider the problem of estimating causal effects from observational data in the presence of network confounding.
We propose a novel matching technique which leverages hyperdimensional computing to model network information and improve predictive performance.
arXiv Detail & Related papers (2025-01-27T23:12:18Z)
- Capturing the Temporal Dependence of Training Data Influence [100.91355498124527]
We formalize the concept of trajectory-specific leave-one-out influence, which quantifies the impact of removing a data point during training.
We propose data value embedding, a novel technique enabling efficient approximation of trajectory-specific LOO.
As data value embedding captures training data ordering, it offers valuable insights into model training dynamics.
arXiv Detail & Related papers (2024-12-12T18:28:55Z)
- Estimating Conditional Average Treatment Effects via Sufficient Representation Learning [31.822980052107496]
This paper proposes a novel neural network approach named CrossNet to learn a sufficient representation for the features, based on which we then estimate the conditional average treatment effects (CATE).
Numerical simulations and empirical results demonstrate that our method outperforms the competitive approaches.
arXiv Detail & Related papers (2024-08-30T07:23:59Z)
- Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation [137.3520153445413]
A notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference.
We evaluate seven established baseline causal discovery methods including a newly proposed method based on GFlowNets.
The results of our study demonstrate that some of the algorithms studied are able to effectively capture a wide range of useful and diverse ATE modes.
arXiv Detail & Related papers (2023-07-11T02:58:10Z)
- Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that, with regularization towards a flat trajectory, the weights trained on synthetic data are robust against perturbations from accumulated errors.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Neuroevolutionary Feature Representations for Causal Inference [0.0]
We propose a novel approach for learning feature representations to aid the estimation of the conditional average treatment effect (CATE).
Our method focuses on an intermediate layer in a neural network trained to predict the outcome from the features.
arXiv Detail & Related papers (2022-05-21T09:13:04Z)
- Combining Observational and Randomized Data for Estimating Heterogeneous Treatment Effects [82.20189909620899]
Estimating heterogeneous treatment effects is an important problem across many domains.
Currently, most existing works rely exclusively on observational data.
We propose to estimate heterogeneous treatment effects by combining large amounts of observational data and small amounts of randomized data.
arXiv Detail & Related papers (2022-02-25T18:59:54Z)
- Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data [37.15330590319357]
Existing approaches rely on fitting deep models on outcomes observed for treated and control populations.
Deep Bayesian active learning provides a framework for efficient data acquisition by selecting points with high uncertainty.
We introduce causal, Bayesian acquisition functions grounded in information theory that bias data acquisition towards regions with overlapping support.
arXiv Detail & Related papers (2021-11-03T15:11:39Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Efficient Multidimensional Functional Data Analysis Using Marginal Product Basis Systems [2.4554686192257424]
We propose a framework for learning continuous representations from a sample of multidimensional functional data.
We show that the resulting estimation problem can be solved efficiently by tensor decomposition.
We conclude with a real data application in neuroimaging.
arXiv Detail & Related papers (2021-07-30T16:02:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.