Learning Infomax and Domain-Independent Representations for Causal
Effect Inference with Real-World Data
- URL: http://arxiv.org/abs/2202.10885v1
- Date: Tue, 22 Feb 2022 13:35:15 GMT
- Authors: Zhixuan Chu, Stephen Rathbun, Sheng Li
- Abstract summary: We propose learning Infomax and Domain-Independent Representations to address the weaknesses of domain-invariant representation learning.
We show that our method achieves state-of-the-art performance on causal effect inference.
- Score: 9.601837205635686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The foremost challenge to causal inference with real-world data is to handle
the imbalance in the covariates with respect to different treatment options,
caused by treatment selection bias. To address this issue, recent literature
has explored domain-invariant representation learning based on different domain
divergence metrics (e.g., Wasserstein distance, maximum mean discrepancy,
position-dependent metric, and domain overlap). In this paper, we reveal two
weaknesses of these strategies: they discard predictive information when
enforcing domain invariance, and their treatment effect estimation performance
is unstable, depending heavily on the characteristics of the domain
distributions and the choice of domain divergence metric. Motivated by
information theory, we propose to learn Infomax and Domain-Independent
Representations to address these issues. Our method
utilizes the mutual information between the global feature representations and
individual feature representations, and the mutual information between feature
representations and treatment assignment predictions, in order to maximally
capture the common predictive information for both treatment and control
groups. Moreover, our method filters out the influence of instrumental and
irrelevant variables, thereby improving the prediction of potential outcomes.
Experimental results on both synthetic and real-world datasets show that our
method achieves state-of-the-art performance on causal effect inference, and
its predictions remain reliable across data with different distribution
characteristics, complicated variable types, and severe covariate imbalance.
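To make the objective concrete, below is a minimal PyTorch-style sketch of the two ingredients described above: an InfoMax term that ties a pooled (global) feature to per-sample (local) features through a Jensen-Shannon mutual-information bound (as in Deep InfoMax), and a term that discourages the representation from encoding treatment assignment. The module names, architectures, loss weights, and the exact form and sign of the treatment-independence term are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps covariates x to per-sample local features and a pooled global feature."""
    def __init__(self, d_in, d_rep):
        super().__init__()
        self.local = nn.Sequential(nn.Linear(d_in, d_rep), nn.ReLU(),
                                   nn.Linear(d_rep, d_rep))
        self.pool = nn.Linear(d_rep, d_rep)  # global summary of the local features

    def forward(self, x):
        h_local = self.local(x)
        return h_local, self.pool(h_local)

class MIDiscriminator(nn.Module):
    """Scores (global, local) pairs for a Jensen-Shannon MI lower bound."""
    def __init__(self, d_rep):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * d_rep, d_rep), nn.ReLU(),
                                 nn.Linear(d_rep, 1))

    def forward(self, g, l):
        return self.net(torch.cat([g, l], dim=-1))

def jsd_mi(disc, g, l):
    """Jensen-Shannon lower bound on I(global; local): matched pairs are
    positives, locals shuffled across the batch are negatives."""
    l_neg = l[torch.randperm(l.size(0))]
    return (-F.softplus(-disc(g, l))).mean() - F.softplus(disc(g, l_neg)).mean()

def total_loss(enc, disc, treat_clf, outcome_head, x, t, y, lam=1.0, beta=1.0):
    h_local, h_global = enc(x)
    infomax = jsd_mi(disc, h_global, h_local)        # keep predictive information
    y_hat = outcome_head(torch.cat([h_global, t], dim=-1)).squeeze(-1)
    factual = F.mse_loss(y_hat, y)                   # fit factual outcomes
    # Hypothetical independence term: push the treatment classifier toward
    # chance level so the representation carries no treatment-assignment signal.
    t_logit = treat_clf(h_global).squeeze(-1)
    indep = F.binary_cross_entropy_with_logits(t_logit, torch.full_like(t_logit, 0.5))
    return factual - lam * infomax + beta * indep

# Usage sketch with random data:
enc, disc = Encoder(25, 64), MIDiscriminator(64)
treat_clf, outcome_head = nn.Linear(64, 1), nn.Linear(65, 1)
x = torch.randn(32, 25)
t = torch.randint(0, 2, (32, 1)).float()
y = torch.randn(32)
total_loss(enc, disc, treat_clf, outcome_head, x, t, y).backward()
```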
Related papers
- Deriving Causal Order from Single-Variable Interventions: Guarantees & Algorithm [14.980926991441345]
We show that the causal order can be effectively extracted from datasets containing interventional data under realistic assumptions about the data distribution.
We introduce interventional faithfulness, which relies on comparisons between the marginal distributions of each variable across observational and interventional settings.
We also introduce Intersort, an algorithm designed to infer the causal order from datasets containing large numbers of single-variable interventions.
arXiv Detail & Related papers (2024-05-28T16:07:17Z)
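As a toy illustration of the marginal-comparison step behind interventional faithfulness (a sketch only, not the Intersort algorithm), the snippet below flags variables whose marginal distribution shifts between observational and interventional samples; shifted variables are candidate descendants of the intervened variable, which constrains the causal order. The Wasserstein distance and the threshold are illustrative choices.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def shifted_variables(obs, intv, threshold=0.1):
    """Flag variables whose marginal moves between an observational sample
    `obs` and an interventional sample `intv`, both (n_samples, n_vars)."""
    dists = np.array([wasserstein_distance(obs[:, j], intv[:, j])
                      for j in range(obs.shape[1])])
    return dists > threshold

# Toy chain X0 -> X1: setting X0 := 2 also shifts X1's marginal.
rng = np.random.default_rng(0)
x0 = rng.normal(size=5000)
obs = np.column_stack([x0, x0 + rng.normal(size=5000)])
z0 = np.full(5000, 2.0)
intv = np.column_stack([z0, z0 + rng.normal(size=5000)])
print(shifted_variables(obs, intv))  # [ True  True ]
```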
- Optimal Aggregation of Prediction Intervals under Unsupervised Domain Shift [9.387706860375461]
A distribution shift occurs when the underlying data-generating process changes, degrading the model's performance.
Prediction intervals are a crucial tool for characterizing the uncertainty induced by the underlying data distribution.
We propose methodologies for aggregating prediction intervals to obtain one with minimal width and adequate coverage on the target domain.
arXiv Detail & Related papers (2024-05-16T17:55:42Z)
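For intuition about the aggregation goal, minimal width subject to adequate coverage, here is a sketch that simply selects the narrowest candidate interval meeting the nominal coverage on a calibration sample. It assumes labeled calibration data for illustration, whereas the paper's setting is unsupervised domain shift, so its actual aggregation machinery is necessarily more involved.

```python
import numpy as np

def select_interval(lowers, uppers, y_cal, alpha=0.1):
    """Among K candidate prediction intervals, return the index of the
    narrowest one whose empirical coverage reaches 1 - alpha.
    lowers, uppers: arrays of shape (K, n); y_cal: labels of shape (n,)."""
    cover = ((lowers <= y_cal) & (y_cal <= uppers)).mean(axis=1)
    width = (uppers - lowers).mean(axis=1)
    ok = np.where(cover >= 1 - alpha)[0]
    if ok.size == 0:                      # nothing attains coverage:
        return int(cover.argmax())        # fall back to the best-covering one
    return int(ok[width[ok].argmin()])
```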
- Disentangle Estimation of Causal Effects from Cross-Silo Data [14.684584362172666]
We introduce an innovative disentangled architecture designed to facilitate the seamless cross-silo transmission of model parameters.
We introduce global constraints into the objective to mitigate bias across the various missing domains.
Our method outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2024-01-04T09:05:37Z) - SALUDA: Surface-based Automotive Lidar Unsupervised Domain Adaptation [62.889835139583965]
We introduce an unsupervised auxiliary task of learning an implicit underlying surface representation simultaneously on source and target data.
As both domains share the same latent representation, the model is forced to accommodate discrepancies between the two sources of data.
Our experiments demonstrate that our method achieves a better performance than the current state of the art, both in real-to-real and synthetic-to-real scenarios.
arXiv Detail & Related papers (2023-04-06T17:36:23Z) - Selecting the suitable resampling strategy for imbalanced data
classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z) - Self-balanced Learning For Domain Generalization [64.99791119112503]
- Self-balanced Learning For Domain Generalization [64.99791119112503]
Domain generalization aims to learn a prediction model on multi-domain source data such that the model can generalize to a target domain with unknown statistics.
Most existing approaches have been developed under the assumption that the source data is well-balanced in terms of both domain and class.
We propose a self-balanced domain generalization framework that adaptively learns the weights of losses to alleviate the bias caused by different distributions of the multi-domain source data.
arXiv Detail & Related papers (2021-08-31T03:17:54Z)
- Robust Bayesian Inference for Discrete Outcomes with the Total Variation Distance [5.139874302398955]
Models of discrete-valued outcomes are easily misspecified if the data exhibit zero-inflation, overdispersion or contamination.
Here, we introduce a robust discrepancy-based Bayesian approach using the Total Variation Distance (TVD).
We empirically demonstrate that our approach is robust and significantly improves predictive performance on a range of simulated and real world data.
arXiv Detail & Related papers (2020-10-26T09:53:06Z)
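The distance itself is elementary. As a sketch, the snippet below computes the TVD between the empirical distribution of observed counts and a fitted Poisson pmf on a truncated support; the truncation and the plain Poisson fit are illustrative choices, and the paper's contribution, embedding the TVD in a discrepancy-based Bayesian posterior, is not shown.

```python
import numpy as np
from scipy.stats import poisson

def tvd_discrete(counts_observed, model_pmf, support):
    """TVD(p, q) = 0.5 * sum_k |p(k) - q(k)| between the empirical
    distribution of observed counts and a model pmf, truncated to
    a finite support for illustration."""
    emp = np.bincount(counts_observed, minlength=support.max() + 1)[support]
    emp = emp / emp.sum()
    return 0.5 * np.abs(emp - model_pmf(support)).sum()

# Zero-inflated data vs. a plain Poisson fit: the TVD flags the mismatch.
rng = np.random.default_rng(1)
y = np.where(rng.random(10000) < 0.3, 0, rng.poisson(2.0, 10000))
support = np.arange(0, 15)
print(tvd_discrete(y, lambda k: poisson.pmf(k, y.mean()), support))
```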
- Matching in Selective and Balanced Representation Space for Treatment Effects Estimation [10.913802831701082]
We propose a feature selection representation matching (FSRM) method based on deep representation learning and matching.
We evaluate the performance of our FSRM method on three datasets, and the results demonstrate superiority over the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-15T02:07:34Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
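The setup lends itself to a short sketch: a FedAvg-style loop in which a random subset of agents takes one local gradient step from the current global model and the server averages the returned iterates. The least-squares objective, single local step, and fixed subset size are simplifications; the paper's analysis additionally tracks a drifting (random-walk) minimizer, which this sketch does not model.

```python
import numpy as np

def dynamic_fedavg(agents_X, agents_y, rounds=200, k=3, lr=0.1, seed=0):
    """FedAvg-style sketch for least squares: each round, k randomly chosen
    agents take one local gradient step from the current global model w,
    and the server averages the returned models."""
    rng = np.random.default_rng(seed)
    w = np.zeros(agents_X[0].shape[1])
    for _ in range(rounds):
        active = rng.choice(len(agents_X), size=k, replace=False)
        updates = []
        for a in active:
            X, y = agents_X[a], agents_y[a]
            grad = X.T @ (X @ w - y) / len(y)   # local squared-error gradient
            updates.append(w - lr * grad)       # one local step
        w = np.mean(updates, axis=0)            # server aggregation
    return w

# Five agents sharing one linear model; three participate per round.
rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0])
Xs = [rng.normal(size=(100, 2)) for _ in range(5)]
ys = [X @ w_true + 0.1 * rng.normal(size=100) for X in Xs]
print(dynamic_fedavg(Xs, ys))  # close to w_true
```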
- Learning Overlapping Representations for the Estimation of Individualized Treatment Effects [97.42686600929211]
Estimating the likely outcome of alternatives from observational data is a challenging problem.
We show that algorithms that learn domain-invariant representations of inputs are often inappropriate.
We develop a deep kernel regression algorithm and posterior regularization framework that substantially outperforms the state-of-the-art on a variety of benchmark data sets.
arXiv Detail & Related papers (2020-01-14T12:56:29Z)