Partial Identifiability in Discrete Data With Measurement Error
- URL: http://arxiv.org/abs/2012.12449v1
- Date: Wed, 23 Dec 2020 02:11:08 GMT
- Title: Partial Identifiability in Discrete Data With Measurement Error
- Authors: Noam Finkelstein, Roy Adams, Suchi Saria, Ilya Shpitser
- Abstract summary: We show that it is preferable to present bounds under justifiable assumptions than to pursue exact identification under dubious ones.
We use linear programming techniques to produce sharp bounds for factual and counterfactual distributions under measurement error.
- Score: 16.421318211327314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When data contains measurement errors, it is necessary to make assumptions
relating the observed, erroneous data to the unobserved true phenomena of
interest. These assumptions should be justifiable on substantive grounds, but
are often motivated by mathematical convenience, for the sake of exactly
identifying the target of inference. We adopt the view that it is preferable to
present bounds under justifiable assumptions than to pursue exact
identification under dubious ones. To that end, we demonstrate how a broad
class of modeling assumptions involving discrete variables, including common
measurement error and conditional independence assumptions, can be expressed as
linear constraints on the parameters of the model. We then use linear
programming techniques to produce sharp bounds for factual and counterfactual
distributions under measurement error in such models. We additionally propose a
procedure for obtaining outer bounds on non-linear models. Our method yields
sharp bounds in a number of important settings -- such as the instrumental
variable scenario with measurement error -- for which no bounds were previously
known.
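The core idea of the abstract can be sketched in a few lines: once measurement-error assumptions on discrete variables are written as linear constraints on the joint distribution, sharp bounds on the target of inference fall out of two linear programs. The sketch below is illustrative only (it is not the paper's implementation); the binary setup, the observed marginal P(X*=1) = 0.3, and the assumed misclassification bound P(X* ≠ X) ≤ 0.1 are all stand-in assumptions.

```python
# Minimal sketch: partial identification of P(X=1) for a binary true
# variable X observed through a noisy proxy X*, via linear programming.
# All numbers are hypothetical; they are not from the paper.
import numpy as np
from scipy.optimize import linprog

p_obs = 0.3   # observed P(X* = 1)
eps = 0.1     # assumed bound on misclassification, P(X* != X) <= eps

# Decision variables: joint probabilities q(x, x*) flattened as
# [q(0,0), q(0,1), q(1,0), q(1,1)].
A_eq = [
    [1, 1, 1, 1],   # probabilities sum to 1
    [0, 1, 0, 1],   # P(X* = 1) must match the observed marginal
]
b_eq = [1.0, p_obs]
A_ub = [[0, 1, 1, 0]]   # P(X* != X) = q(0,1) + q(1,0) <= eps
b_ub = [eps]

# Target of inference: P(X = 1) = q(1,0) + q(1,1).
c = np.array([0, 0, 1, 1], dtype=float)

lo = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
             bounds=[(0, 1)] * 4).fun        # minimize -> lower bound
hi = -linprog(-c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, 1)] * 4).fun       # maximize -> upper bound
print(f"P(X=1) in [{lo:.2f}, {hi:.2f}]")     # sharp bounds under these assumptions
```

Any additional assumption that is linear in the joint probabilities (conditional independence, known error rates, monotonicity of errors) can be appended as extra rows of `A_eq` or `A_ub` without changing the solver.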
Related papers
- Uncertainty Quantification of Surrogate Models using Conformal Prediction [7.445864392018774]
We formalise a conformal prediction framework that produces valid prediction intervals in a model-agnostic manner, requiring near-zero computational costs.
The paper looks at providing statistically valid error bars for deterministic models, as well as crafting guarantees to the error bars of probabilistic models.
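The general recipe behind model-agnostic conformal error bars can be illustrated with a standard split-conformal sketch. This is an assumption about the generic technique, not this paper's specific framework; the data, the linear "surrogate", and the miscoverage level alpha = 0.1 are all made up for illustration.

```python
# Split-conformal sketch: calibrate held-out residuals of any trained
# surrogate once, then attach distribution-free error bars to new
# predictions at near-zero extra cost.
import numpy as np

rng = np.random.default_rng(0)

# Toy data and a deliberately imperfect surrogate: a linear fit to a sine.
x = rng.uniform(0, 6, 400)
y = np.sin(x) + 0.1 * rng.normal(size=x.size)
x_fit, x_cal = x[:200], x[200:]
y_fit, y_cal = y[:200], y[200:]
coef = np.polyfit(x_fit, y_fit, deg=1)
predict = lambda t: np.polyval(coef, t)

# Calibration: the (1 - alpha) empirical quantile of held-out absolute
# residuals gives an interval half-width with finite-sample marginal coverage.
alpha = 0.1
scores = np.abs(y_cal - predict(x_cal))
k = int(np.ceil((1 - alpha) * (scores.size + 1)))
q = np.sort(scores)[k - 1]

x_new = np.array([1.0, 3.0, 5.0])
lower, upper = predict(x_new) - q, predict(x_new) + q
print(np.round(lower, 2), np.round(upper, 2))
```

The half-width `q` is one number computed once from the calibration set, which is why the per-prediction overhead is essentially zero regardless of the underlying model.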
arXiv Detail & Related papers (2024-08-19T10:46:19Z)
- Parameter uncertainties for imperfect surrogate models in the low-noise regime [0.3069335774032178]
We analyze the generalization error of misspecified, near-deterministic surrogate models.
We show posterior distributions must cover every training point to avoid a divergent generalization error.
This is demonstrated on model problems before application to thousand dimensional datasets in atomistic machine learning.
arXiv Detail & Related papers (2024-02-02T11:41:21Z)
- User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems [49.75149094527068]
We show that diffusion models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems.
We develop a probabilistic approximation scheme for the conditional score function which converges to the true distribution as the noise level decreases.
We are able to sample conditionally on nonlinear user-defined events at inference time, and the samples match data statistics even when drawn from the tails of the distribution.
arXiv Detail & Related papers (2023-06-13T03:42:03Z)
- Sufficient Identification Conditions and Semiparametric Estimation under Missing Not at Random Mechanisms [4.211128681972148]
Conducting valid statistical analyses is challenging in the presence of missing-not-at-random (MNAR) data.
We consider an MNAR model that generalizes several popular prior MNAR models in two ways.
We propose methods for testing the independence restrictions encoded in such models, using the odds ratio as our parameter of interest.
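Using the odds ratio to test an independence restriction has a simple textbook form, sketched below. This is a generic Wald test on a 2x2 table, not the paper's semiparametric estimator, and the counts are hypothetical.

```python
# Generic sketch: for two binary variables, the odds ratio equals 1 exactly
# under independence, so a Wald test on the log odds ratio tests the
# independence restriction. Counts below are made up for illustration.
import math

n11, n10, n01, n00 = 30, 20, 25, 45   # hypothetical 2x2 table for A, B

log_or = math.log((n11 * n00) / (n10 * n01))    # log odds ratio
se = math.sqrt(1/n11 + 1/n10 + 1/n01 + 1/n00)   # Woolf's standard error
z = log_or / se                                 # Wald statistic
# Two-sided p-value from the standard normal survival function.
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(round(math.exp(log_or), 2), round(p, 3))
```

A small p-value rejects the independence restriction encoded by the model; the odds ratio itself quantifies how far the table departs from it.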
arXiv Detail & Related papers (2023-06-10T13:46:16Z)
- Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
- Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes [52.92110730286403]
It is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input dimensions.
We prove that by tuning hyperparameters, the performance, as measured by the marginal likelihood, improves monotonically with the input dimension.
We also prove that cross-validation metrics exhibit qualitatively different behavior that is characteristic of double descent.
arXiv Detail & Related papers (2022-10-14T08:09:33Z)
- Evaluating Aleatoric Uncertainty via Conditional Generative Models [15.494774321257939]
We study conditional generative models for aleatoric uncertainty estimation.
We introduce two metrics to measure the discrepancy between two conditional distributions.
We demonstrate numerically how our metrics provide correct measurements of conditional distributional discrepancies.
arXiv Detail & Related papers (2022-06-09T05:39:04Z)
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Assumption-lean inference for generalised linear model parameters [0.0]
We propose nonparametric definitions of main effect estimands and effect modification estimands.
These reduce to standard main effect and effect modification parameters in generalised linear models when these models are correctly specified.
We achieve an assumption-lean inference for these estimands.
arXiv Detail & Related papers (2020-06-15T13:49:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.