Multi-task Causal Learning with Gaussian Processes
- URL: http://arxiv.org/abs/2009.12821v1
- Date: Sun, 27 Sep 2020 11:33:40 GMT
- Title: Multi-task Causal Learning with Gaussian Processes
- Authors: Virginia Aglietti, Theodoros Damoulas, Mauricio Álvarez, Javier González
- Abstract summary: This paper studies the problem of learning the correlation structure of a set of intervention functions defined on the directed acyclic graph (DAG) of a causal model.
We propose the first multi-task causal Gaussian process (GP) model, which allows for information sharing across continuous interventions and experiments on different variables.
- Score: 17.205106391379026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the problem of learning the correlation structure of a set
of intervention functions defined on the directed acyclic graph (DAG) of a
causal model. This is useful when we are interested in jointly learning the
causal effects of interventions on different subsets of variables in a DAG,
which is common in fields such as healthcare or operations research. We propose
the first multi-task causal Gaussian process (GP) model, which we call DAG-GP,
that allows for information sharing across continuous interventions and across
experiments on different variables. DAG-GP accommodates different assumptions
in terms of data availability and captures the correlation between functions
lying in input spaces of different dimensionality via a well-defined integral
operator. We give theoretical results detailing when and how the DAG-GP model
can be formulated depending on the DAG. We test both the quality of its
predictions and its calibrated uncertainties. Compared to single-task models,
DAG-GP achieves the best fitting performance in a variety of real and synthetic
settings. In addition, it helps to select optimal interventions faster than
competing approaches when used within sequential decision making frameworks,
like active learning or Bayesian optimization.
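The abstract describes sharing information across experiments on different variables through a multi-task GP prior. As a rough illustration of the multi-task idea only, the sketch below builds an intrinsic coregionalization model (ICM) covariance in plain numpy; this is a standard multi-task construction and not the paper's actual DAG-GP kernel, whose integral-operator formulation over the causal graph is not reproduced here. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel on the shared input space."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def icm_covariance(X, tasks, B, lengthscale=1.0):
    """ICM covariance K[(x,t),(x',t')] = B[t,t'] * k(x,x').

    All tasks share one latent GP; the task covariance B controls how
    strongly information is shared between tasks (here, a stand-in for
    correlating intervention functions on different variables).
    """
    Kx = rbf_kernel(X, X, lengthscale)
    return B[np.ix_(tasks, tasks)] * Kx

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(8, 1))          # hypothetical intervention levels
tasks = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # which variable each point intervenes on
W = rng.normal(size=(2, 1))
B = W @ W.T + 0.1 * np.eye(2)                # positive-definite task covariance
K = icm_covariance(X, tasks, B)
# K is symmetric PSD (Schur product of PSD matrices), so it defines a
# valid joint GP prior over both intervention functions.
```

Samples from `N(0, K)` would then give correlated draws of the two intervention functions; conditioning on observations from one task updates the posterior for the other.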
Related papers
- Toward the Identifiability of Comparative Deep Generative Models [7.5479347719819865]
We propose a theory of identifiability for comparative Deep Generative Models (DGMs).
We show that, while these models lack identifiability across a general class of mixing functions, they surprisingly become identifiable when the mixing function is piece-wise affine.
We also investigate the impact of model misspecification, and empirically show that previously proposed regularization techniques for fitting comparative DGMs help with identifiability when the number of latent variables is not known in advance.
arXiv Detail & Related papers (2024-01-29T06:10:54Z) - Latent Variable Multi-output Gaussian Processes for Hierarchical Datasets [0.8057006406834466]
Multi-output Gaussian processes (MOGPs) have been introduced to deal with multiple tasks by exploiting the correlations between different outputs.
This paper proposes an extension of MOGPs for hierarchical datasets.
arXiv Detail & Related papers (2023-08-31T15:52:35Z) - Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z) - Discovering Dynamic Causal Space for DAG Structure Learning [64.763763417533]
We propose a dynamic causal space for DAG structure learning, coined CASPER.
It integrates the graph structure into the score function as a new measure in the causal space to faithfully reflect the causal distance between the estimated and ground-truth DAGs.
arXiv Detail & Related papers (2023-06-05T12:20:40Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-01-28T09:30:32Z) - BCDAG: An R package for Bayesian structure and Causal learning of Gaussian DAGs [77.34726150561087]
We introduce the R package for causal discovery and causal effect estimation from observational data.
Our implementation scales efficiently with the number of observations and, whenever the DAGs are sufficiently sparse, the number of variables in the dataset.
We then illustrate the main functions and algorithms on both real and simulated datasets.
arXiv Detail & Related papers (2022-01-28T09:30:32Z) - Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z) - Scalable Gaussian Processes for Data-Driven Design using Big Data with Categorical Factors [14.337297795182181]
Gaussian processes (GPs) have difficulties in accommodating big datasets, categorical inputs, and multiple responses.
We propose a GP model that utilizes latent variables and functions obtained through variational inference to address the aforementioned challenges simultaneously.
Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism.
arXiv Detail & Related papers (2021-06-26T02:17:23Z) - Understanding Overparameterization in Generative Adversarial Networks [56.57403335510056]
Training Generative Adversarial Networks (GANs) involves solving non-concave min-max optimization problems.
Theory has highlighted the importance of gradient-based dynamics for reaching globally optimal solutions.
We show that in an overparameterized GAN with a $1$-layer neural network generator and a linear discriminator, gradient descent-ascent (GDA) converges to a global saddle point of the underlying non-concave min-max problem.
arXiv Detail & Related papers (2021-04-12T16:23:37Z) - Modulating Scalable Gaussian Processes for Expressive Statistical Learning [25.356503463916816]
Gaussian process (GP) regression learns the statistical relationship between inputs and outputs, offering not only the prediction mean but also the associated variability.
This article studies new scalable GP paradigms including the non-stationary heteroscedastic GP, the mixture of GPs and the latent GP, which introduce additional latent variables to modulate the outputs or inputs in order to learn richer, non-Gaussian statistical representation.
arXiv Detail & Related papers (2020-08-29T06:41:45Z) - Semiparametric Inference For Causal Effects In Graphical Models With Hidden Variables [13.299431908881425]
Identification theory for causal effects in causal models associated with hidden variable directed acyclic graphs is well studied.
However, the corresponding algorithms are underused due to the complexity of estimating the identifying functionals they output.
We bridge the gap between identification and estimation of population-level causal effects involving a single treatment and a single outcome.
arXiv Detail & Related papers (2020-03-27T22:29:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.