CETransformer: Causal Effect Estimation via Transformer-Based
Representation Learning
- URL: http://arxiv.org/abs/2107.08714v1
- Date: Mon, 19 Jul 2021 09:39:57 GMT
- Title: CETransformer: Causal Effect Estimation via Transformer-Based
Representation Learning
- Authors: Zhenyu Guo, Shuai Zheng, Zhizhe Liu, Kun Yan, Zhenfeng Zhu
- Abstract summary: Data-driven causal effect estimation faces two main challenges: selection bias and missing counterfactuals.
To address these two issues, most existing approaches reduce the selection bias by learning a balanced representation.
We propose the CETransformer model for causal effect estimation via transformer-based representation learning.
- Score: 17.622007687796756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Treatment effect estimation, which refers to the estimation of causal
effects and aims to measure the strength of a causal relationship, is of great
importance in many fields but is a challenging problem in practice. At present,
data-driven causal effect estimation faces two main challenges: selection bias
and missing counterfactuals. To address these two issues, most existing
approaches reduce the selection bias by learning a balanced representation and
then estimate the counterfactual through that representation. However, they
rely heavily on finely hand-crafted metric functions when learning balanced
representations, which generally do not work well when the original
distribution is complicated. In this paper, we propose the CETransformer model
for causal effect estimation via transformer-based representation learning. To
learn the representation of covariates (features) robustly, a self-supervised
transformer is proposed, through which the correlations between covariates can
be exploited by the self-attention mechanism. In addition, an adversarial
network is adopted to balance the distributions of the treated and control
groups in the representation space. Experimental results on three real-world
datasets demonstrate the advantages of the proposed CETransformer over
state-of-the-art treatment effect estimation methods.
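The representation-learning step described above can be illustrated with a minimal single-head self-attention pass over a covariate matrix. This is a hedged sketch only: the function name, random weights, and shapes are illustrative assumptions, not the paper's actual architecture, which additionally uses self-supervised training and an adversarial balancing network.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over covariates (sketch)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # pairwise covariate interactions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # attention-weighted representation

rng = np.random.default_rng(0)
n, d = 5, 4                                         # 5 units, 4-dimensional covariates
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Z = self_attention(X, Wq, Wk, Wv)
print(Z.shape)  # prints (5, 4)
```

In the full model, a representation like `Z` would then be fed both to an outcome estimator and to an adversarial discriminator that pushes the treated and control distributions toward each other in the representation space.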
Related papers
- Moderately-Balanced Representation Learning for Treatment Effects with
Orthogonality Information [14.040918087553177]
Estimating the average treatment effect (ATE) from observational data is challenging due to selection bias.
We propose a moderately-balanced representation learning framework.
This framework protects the representation from being over-balanced via multi-task learning.
arXiv Detail & Related papers (2022-09-05T13:20:12Z)
- Exploring the Trade-off between Plausibility, Change Intensity and
Adversarial Power in Counterfactual Explanations using Multi-objective
Optimization [73.89239820192894]
We argue that automated counterfactual generation should take into account several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z) - Generalizable Information Theoretic Causal Representation [37.54158138447033]
We propose to learn causal representation from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, based on which we deduce a theoretical guarantee that the causality-inspired learning is with reduced sample complexity and better generalization ability.
arXiv Detail & Related papers (2022-02-17T00:38:35Z) - Can Transformers be Strong Treatment Effect Estimators? [86.32484218657166]
We develop a general framework based on the Transformer architecture to address a variety of treatment effect estimation problems.
Our methods are applied to discrete, continuous, structured, or dosage-associated treatments.
Our experiments with Transformers as Treatment Effect Estimators (TransTEE) demonstrate that these inductive biases are also effective on the sorts of estimation problems and datasets that arise in research aimed at estimating causal effects.
arXiv Detail & Related papers (2022-02-02T23:56:42Z) - Transformer Uncertainty Estimation with Hierarchical Stochastic
Attention [8.95459272947319]
We propose a novel way to enable transformers to have the capability of uncertainty estimation.
This is achieved by learning a hierarchical self-attention that attends to values and a set of learnable centroids.
We empirically evaluate our model on two text classification tasks with both in-domain (ID) and out-of-domain (OOD) datasets.
arXiv Detail & Related papers (2021-12-27T16:43:31Z) - Towards Robust and Adaptive Motion Forecasting: A Causal Representation
Perspective [72.55093886515824]
We introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables.
We devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph.
Experiment results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations.
arXiv Detail & Related papers (2021-11-29T18:59:09Z) - Deconfounding Scores: Feature Representations for Causal Effect
Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z) - Matching in Selective and Balanced Representation Space for Treatment
Effects Estimation [10.913802831701082]
We propose a feature selection representation matching (FSRM) method based on deep representation learning and matching.
We evaluate the performance of our FSRM method on three datasets, and the results demonstrate superiority over the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-15T02:07:34Z) - Learning Disentangled Representations with Latent Variation
Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z) - Understanding Adversarial Examples from the Mutual Influence of Images
and Perturbations [83.60161052867534]
We analyze adversarial examples by disentangling the clean images and adversarial perturbations, and analyze their influence on each other.
Our results suggest a new perspective towards the relationship between images and universal perturbations.
We are the first to achieve the challenging task of a targeted universal attack without utilizing original training data.
arXiv Detail & Related papers (2020-07-13T05:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.