Intervention Generalization: A View from Factor Graph Models
- URL: http://arxiv.org/abs/2306.04027v2
- Date: Thu, 9 Nov 2023 00:11:09 GMT
- Title: Intervention Generalization: A View from Factor Graph Models
- Authors: Gecia Bravo-Hermsdorff, David S. Watson, Jialin Yu, Jakob Zeitler, and
Ricardo Silva
- Abstract summary: We take a close look at how to warrant a leap from past experiments to novel conditions based on minimal assumptions about the factorization of the distribution of the manipulated system.
A postulated $\textit{interventional factor model}$ (IFM) may not always be informative, but it conveniently abstracts away a need for explicitly modeling unmeasured confounding and feedback mechanisms.
- Score: 7.117681268784223
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the goals of causal inference is to generalize from past experiments
and observational data to novel conditions. While it is in principle possible
to eventually learn a mapping from a novel experimental condition to an outcome
of interest, provided a sufficient variety of experiments is available in the
training data, coping with a large combinatorial space of possible
interventions is hard. Under a typical sparse experimental design, this mapping
is ill-posed without relying on heavy regularization or prior distributions.
Such assumptions may or may not be reliable, and can be hard to defend or test.
In this paper, we take a close look at how to warrant a leap from past
experiments to novel conditions based on minimal assumptions about the
factorization of the distribution of the manipulated system, communicated in
the well-understood language of factor graph models. A postulated
$\textit{interventional factor model}$ (IFM) may not always be informative, but
it conveniently abstracts away a need for explicitly modeling unmeasured
confounding and feedback mechanisms, leading to directly testable claims. Given
an IFM and datasets from a collection of experimental regimes, we derive
conditions for identifiability of the expected outcomes of new regimes never
observed in these training data. We implement our framework using several
efficient algorithms, and apply them on a range of semi-synthetic experiments.
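To make the abstract's central assumption concrete, the postulated factorization can be written schematically as $p_\sigma(x) \propto \prod_k f_k(x_{S_k}; \sigma_{A_k})$, where each factor $f_k$ is allowed to depend only on the intervention coordinates $\sigma_{A_k}$ that target it. Under that assumption, factors identified from observed regimes can be recombined to predict a regime that was never run. The sketch below is only an illustration of this composition step, not the paper's estimator or notation: the variables, intervention switches, and factor tables are hypothetical, standing in for factors that would be estimated from the training regimes.

```python
# Minimal sketch of the interventional-factor-model idea (illustrative only).
# Assumptions (not from the paper): two binary variables X1, X2 and a binary
# outcome Y; two intervention switches a, b; a factorization
#     p_{a,b}(x1, x2, y)  proportional to  f1(x1; a) * f2(x2; b) * g(x1, x2, y),
# where f1 depends only on a and f2 only on b. Under this assumption, factors
# "learned" in regimes (a=1, b=0) and (a=0, b=1) can be recombined to predict
# the never-observed regime (a=1, b=1).
import itertools
import numpy as np

# Hypothetical factor tables, standing in for factors estimated from data.
# f1[a][x1] and f2[b][x2] are unnormalized potentials; g[x1, x2, y] links the outcome.
f1 = {0: np.array([2.0, 1.0]), 1: np.array([0.5, 3.0])}   # intervention a shifts X1
f2 = {0: np.array([1.5, 1.0]), 1: np.array([1.0, 2.5])}   # intervention b shifts X2
g = np.array([[[0.9, 0.1], [0.4, 0.6]],
              [[0.3, 0.7], [0.2, 0.8]]])                   # indexed as g[x1, x2, y]

def expected_outcome(a, b):
    """E[Y] under regime (a, b), computed from the factorized joint."""
    weights = {}
    for x1, x2, y in itertools.product([0, 1], repeat=3):
        weights[(x1, x2, y)] = f1[a][x1] * f2[b][x2] * g[x1, x2, y]
    z = sum(weights.values())                 # normalizing constant
    return sum(w * y for (x1, x2, y), w in weights.items()) / z

# Factors indexed by a were identified from regimes where a varied, and factors
# indexed by b from regimes where b varied; the novel combination (a=1, b=1)
# is obtained purely by recombining them.
for regime in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(f"regime a={regime[0]}, b={regime[1]}: E[Y] = {expected_outcome(*regime):.3f}")
```

The point of the sketch is that the table indexed by $a$ is never re-fitted for the unseen combination $(a{=}1, b{=}1)$; whether such a factorization actually holds for the system at hand is precisely the kind of directly testable claim the abstract refers to.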
Related papers
- Prediction-powered Generalization of Causal Inferences [6.43357871718189]
We show how the limited size of trials makes generalization a statistically infeasible task.
We develop generalization algorithms that supplement the trial data with a prediction model learned from an additional observational study.
arXiv Detail & Related papers (2024-06-05T02:44:14Z)
- Demystifying amortized causal discovery with transformers [21.058343547918053]
Supervised learning approaches for causal discovery from observational data often achieve competitive performance.
In this work, we investigate CSIvA, a transformer-based model promising to train on synthetic data and transfer to real data.
We bridge the gap with existing identifiability theory and show that constraints on the training data distribution implicitly define a prior on the test observations.
arXiv Detail & Related papers (2024-05-27T08:17:49Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of the constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data.
This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis.
To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z)
- Testing for Overfitting [0.0]
We discuss the overfitting problem and explain why standard asymptotic and concentration results do not hold for evaluation with training data.
We introduce and argue for a hypothesis test by means of which model performance may be evaluated using training data.
arXiv Detail & Related papers (2023-05-09T22:49:55Z)
- Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data but disagreement on out-of-distribution data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
arXiv Detail & Related papers (2022-02-09T12:03:02Z)
- Dense Out-of-Distribution Detection by Robust Learning on Synthetic Negative Data [1.7474352892977458]
We show how to detect out-of-distribution anomalies in road-driving scenes and remote sensing imagery.
We leverage a jointly trained normalizing flow, owing to its coverage-oriented learning objective and its ability to generate samples at different resolutions.
The resulting models set the new state of the art on benchmarks for out-of-distribution detection in road-driving scenes and remote sensing imagery.
arXiv Detail & Related papers (2021-12-23T20:35:10Z)
- Predicting Unreliable Predictions by Shattering a Neural Network [145.3823991041987]
Piecewise linear neural networks can be split into subfunctions.
Subfunctions have their own activation pattern, domain, and empirical error.
Empirical error for the full network can be written as an expectation over subfunctions.
arXiv Detail & Related papers (2021-06-15T18:34:41Z)
- Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation [51.091890311312085]
We propose a new training scheme for auto-regressive sequence generative models that is effective and stable when operating on the large sample spaces encountered in text generation.
Our method stably outperforms Maximum Likelihood Estimation and other state-of-the-art sequence generative models in terms of both quality and diversity.
arXiv Detail & Related papers (2020-07-12T15:31:24Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.