Identifiable Energy-based Representations: An Application to Estimating
Heterogeneous Causal Effects
- URL: http://arxiv.org/abs/2108.03039v1
- Date: Fri, 6 Aug 2021 10:39:49 GMT
- Title: Identifiable Energy-based Representations: An Application to Estimating
Heterogeneous Causal Effects
- Authors: Yao Zhang and Jeroen Berrevoets and Mihaela van der Schaar
- Abstract summary: Conditional average treatment effects (CATEs) allow us to understand the effect heterogeneity across a large population of individuals.
Typical CATE learners assume all confounding variables are measured in order for the CATE to be identifiable.
We propose an energy-based model (EBM) that learns a low-dimensional representation of the variables by employing a noise contrastive loss function.
- Score: 83.66276516095665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conditional average treatment effects (CATEs) allow us to understand the
effect heterogeneity across a large population of individuals. However, typical
CATE learners assume all confounding variables are measured in order for the
CATE to be identifiable. Often, this requirement is satisfied by simply
collecting many variables, at the expense of increased sample complexity for
estimating CATEs. To combat this, we propose an energy-based model (EBM) that
learns a low-dimensional representation of the variables by employing a noise
contrastive loss function. With our EBM we introduce a preprocessing step that
alleviates the dimensionality curse for any existing model and learner
developed for estimating CATE. We prove that our EBM keeps the representations
partially identifiable up to some universal constant, as well as having
universal approximation capability to avoid excessive information loss from
model misspecification; these properties combined with our loss function,
enable the representations to converge and keep the CATE estimation consistent.
Experiments demonstrate the convergence of the representations, as well as show
that estimating CATEs on our representations performs better than on the
variables or the representations obtained via various benchmark dimensionality
reduction methods.
Related papers
- Estimating Causal Effects from Learned Causal Networks [56.14597641617531]
We propose an alternative paradigm for answering causal-effect queries over discrete observable variables.
We learn the causal Bayesian network and its confounding latent variables directly from the observational data.
We show that this emphmodel completion learning approach can be more effective than estimand approaches.
arXiv Detail & Related papers (2024-08-26T08:39:09Z) - Causal Effect Estimation using identifiable Variational AutoEncoder with Latent Confounders and Post-Treatment Variables [18.34462010115951]
Estimating causal effects from observational data is challenging, especially in the presence of latent confounders.
We propose a novel method of joint Variational AutoEncoder (VAE) and identifiable Variational AutoEncoder (iVAE) for learning the representations of latent confounders and latent post-treatment variables.
arXiv Detail & Related papers (2024-08-13T22:13:25Z) - Bounds on Representation-Induced Confounding Bias for Treatment Effect Estimation [27.385663284378854]
State-of-the-art methods for conditional average treatment effect (CATE) estimation make widespread use of representation learning.
Here, the idea is to reduce the variance of the low-sample CATE estimation by a (potentially constrained) low-dimensional representation.
Low-dimensional representations can lose information about the observed confounders and thus lead to bias, because of which the validity of representation learning for CATE estimation is typically violated.
arXiv Detail & Related papers (2023-11-19T13:31:30Z) - Spectral Representation Learning for Conditional Moment Models [33.34244475589745]
We propose a procedure that automatically learns representations with controlled measures of ill-posedness.
Our method approximates a linear representation defined by the spectral decomposition of a conditional expectation operator.
We show this representation can be efficiently estimated from data, and establish L2 consistency for the resulting estimator.
arXiv Detail & Related papers (2022-10-29T07:48:29Z) - Identifying Weight-Variant Latent Causal Models [82.14087963690561]
We find that transitivity acts as a key role in impeding the identifiability of latent causal representations.
Under some mild assumptions, we can show that the latent causal representations can be identified up to trivial permutation and scaling.
We propose a novel method, termed Structural caUsAl Variational autoEncoder, which directly learns latent causal representations and causal relationships among them.
arXiv Detail & Related papers (2022-08-30T11:12:59Z) - Semi-Supervised Quantile Estimation: Robust and Efficient Inference in High Dimensional Settings [0.5735035463793009]
We consider quantile estimation in a semi-supervised setting, characterized by two available data sets.
We propose a family of semi-supervised estimators for the response quantile(s) based on the two data sets.
arXiv Detail & Related papers (2022-01-25T10:02:23Z) - Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum learning likelihood of energy-based models.
PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z) - Loss function based second-order Jensen inequality and its application
to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
arXiv Detail & Related papers (2021-06-09T12:13:51Z) - Learning Disentangled Representations with Latent Variation
Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.