Can Transformers be Strong Treatment Effect Estimators?
- URL: http://arxiv.org/abs/2202.01336v2
- Date: Fri, 4 Feb 2022 06:31:01 GMT
- Title: Can Transformers be Strong Treatment Effect Estimators?
- Authors: Yi-Fan Zhang, Hanlin Zhang, Zachary C. Lipton, Li Erran Li, Eric P. Xing
- Abstract summary: We develop a general framework based on the Transformer architecture to address a variety of treatment effect estimation problems.
Our methods are applicable to discrete, continuous, structured, or dosage-associated treatments.
Our experiments with Transformers as Treatment Effect Estimators (TransTEE) demonstrate that these inductive biases are also effective on the sorts of estimation problems and datasets that arise in research aimed at estimating causal effects.
- Score: 86.32484218657166
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we develop a general framework based on the Transformer
architecture to address a variety of challenging treatment effect estimation
(TEE) problems. Our methods are applicable both when covariates are tabular and
when they consist of sequences (e.g., in text), and can handle discrete,
continuous, structured, or dosage-associated treatments. While Transformers
have already emerged as dominant methods for diverse domains, including natural
language and computer vision, our experiments with Transformers as Treatment
Effect Estimators (TransTEE) demonstrate that these inductive biases are also
effective on the sorts of estimation problems and datasets that arise in
research aimed at estimating causal effects. Moreover, we propose a propensity
score network that is trained with TransTEE in an adversarial manner to promote
independence between covariates and treatments to further address selection
bias. Through extensive experiments, we show that TransTEE significantly
outperforms competitive baselines with greater parameter efficiency over a wide
range of benchmarks and settings.
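The abstract's core idea, embedding treatments and covariates as tokens so that a treatment query can attend over covariate tokens, can be illustrated with a minimal cross-attention sketch. This is an illustrative NumPy toy, not the authors' TransTEE implementation; all dimensions, names, and the linear outcome head are assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, keys, values):
    """Scaled dot-product attention: the treatment query attends to covariate tokens."""
    d = query.shape[-1]
    scores = query @ keys.T / np.sqrt(d)   # (1, n_cov) similarity scores
    weights = softmax(scores, axis=-1)     # attention distribution over covariates
    return weights @ values                # (1, d) pooled covariate representation

rng = np.random.default_rng(0)
d = 8        # embedding dimension (assumed)
n_cov = 5    # number of covariate tokens (assumed)

cov_tokens = rng.normal(size=(n_cov, d))  # embedded covariates, one token each
treat_emb = rng.normal(size=(1, d))       # embedded treatment (discrete or continuous)

rep = cross_attention(treat_emb, cov_tokens, cov_tokens)
outcome = rep @ rng.normal(size=(d, 1))   # toy linear head -> estimated potential outcome
print(rep.shape, outcome.shape)           # (1, 8) (1, 1)
```

Because the treatment enters as a query rather than being concatenated to the covariates, the same mechanism accommodates tabular or sequential covariates and different treatment types, which is the flexibility the abstract claims; the adversarially trained propensity score network described above is omitted here for brevity.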
Related papers
- Higher-Order Causal Message Passing for Experimentation with Complex Interference [6.092214762701847]
We introduce a new class of estimators based on causal message-passing, specifically designed for settings with pervasive, unknown interference.
Our estimator draws on information from the sample mean and variance of unit outcomes and treatments over time, enabling efficient use of observed data.
arXiv Detail & Related papers (2024-11-01T18:00:51Z)
- Adaptive Instrument Design for Indirect Experiments [48.815194906471405]
Unlike RCTs, indirect experiments estimate treatment effects by leveraging conditional instrumental variables.
In this paper we take the initial steps towards enhancing sample efficiency for indirect experiments by adaptively designing a data collection policy.
Our main contribution is a practical computational procedure that utilizes influence functions to search for an optimal data collection policy.
arXiv Detail & Related papers (2023-12-05T02:38:04Z)
- How inter-rater variability relates to aleatoric and epistemic uncertainty: a case study with deep learning-based paraspinal muscle segmentation [1.9624082208594296]
We study how inter-rater variability affects the reliability of the resulting deep learning algorithms.
Our study reveals the interplay between inter-rater variability and uncertainties, affected by choices of label fusion strategies and DL models.
arXiv Detail & Related papers (2023-08-14T06:40:20Z)
- Optimizing Non-Autoregressive Transformers with Contrastive Learning [74.46714706658517]
Non-autoregressive Transformers (NATs) reduce the inference latency of Autoregressive Transformers (ATs) by predicting words all at once rather than in sequential order.
In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.
arXiv Detail & Related papers (2023-05-23T04:20:13Z)
- Transfer Learning on Heterogeneous Feature Spaces for Treatment Effects Estimation [103.55894890759376]
This paper introduces several building blocks that use representation learning to handle the heterogeneous feature spaces.
We show how these building blocks can be used to recover transfer learning equivalents of the standard CATE learners.
arXiv Detail & Related papers (2022-10-08T16:41:02Z)
- CETransformer: Casual Effect Estimation via Transformer Based Representation Learning [17.622007687796756]
Data-driven causal effect estimation faces two main challenges: selection bias and missing counterfactuals.
To address these two issues, most of the existing approaches tend to reduce the selection bias by learning a balanced representation.
We propose the CETransformer model for causal effect estimation via transformer-based representation learning.
arXiv Detail & Related papers (2021-07-19T09:39:57Z)
- On Inductive Biases for Heterogeneous Treatment Effect Estimation [91.3755431537592]
We investigate how to exploit structural similarities of an individual's potential outcomes (POs) under different treatments.
We compare three end-to-end learning strategies to overcome this problem.
arXiv Detail & Related papers (2021-06-07T16:30:46Z)
- Translational Equivariance in Kernelizable Attention [3.236198583140341]
We show how translational equivariance can be implemented in efficient Transformers based on kernelizable attention.
Our experiments highlight that the devised approach significantly improves robustness of Performers to shifts of input images.
arXiv Detail & Related papers (2021-02-15T17:14:15Z)
- Applying the Transformer to Character-level Transduction [68.91664610425114]
The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks.
We show that with a large enough batch size, the transformer does indeed outperform recurrent models for character-level tasks.
arXiv Detail & Related papers (2020-05-20T17:25:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.