General sample size analysis for probabilities of causation: a delta method approach
- URL: http://arxiv.org/abs/2602.17070v1
- Date: Thu, 19 Feb 2026 04:25:36 GMT
- Title: General sample size analysis for probabilities of causation: a delta method approach
- Authors: Tianyuan Cheng, Ruirui Mao, Judea Pearl, Ang Li
- Abstract summary: We propose a general sample size framework based on the delta method. Our approach applies to settings in which the target bounds of PoCs can be expressed as finite minima or maxima of linear combinations of experimental and observational probabilities.
- Score: 12.153332840370998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Probabilities of causation (PoCs), such as the probability of necessity and sufficiency (PNS), are important tools for decision making but are generally not point identifiable. Existing work has derived bounds for these quantities using combinations of experimental and observational data. However, there is very limited research on sample size analysis, namely, how many experimental and observational samples are required to achieve a desired margin of error. In this paper, we propose a general sample size framework based on the delta method. Our approach applies to settings in which the target bounds of PoCs can be expressed as finite minima or maxima of linear combinations of experimental and observational probabilities. Through simulation studies, we demonstrate that the proposed sample size calculations lead to stable estimation of these bounds.
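As a rough illustration of the kind of calculation the abstract describes, the sketch below computes a sample size for a single linear combination of probabilities using the normal approximation (for a linear function, the delta method reduces to this variance formula). It is not the paper's procedure: the function name `sample_size_linear`, the assumption that each probability is estimated independently from the same number of samples, and the worst-case inputs of 0.5 are all illustrative assumptions. The example term P(y_x) - P(y_x') is one of the terms that appears in the known lower bound on PNS.

```python
import math
from statistics import NormalDist


def sample_size_linear(coeffs, probs, margin, confidence=0.95):
    """Samples per estimated probability so that sum(a_i * p_i) has the given margin of error.

    Assumes (hypothetically) that each p_i is estimated independently from n
    samples, so Var(sum a_i * p_hat_i) = sum(a_i^2 * p_i * (1 - p_i)) / n.
    """
    # Two-sided normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Variance of the linear combination when n = 1.
    unit_var = sum(a * a * p * (1.0 - p) for a, p in zip(coeffs, probs))
    # Solve z * sqrt(unit_var / n) <= margin for n.
    return math.ceil(z * z * unit_var / (margin * margin))


# Worst-case planning values p = 0.5 for the term P(y_x) - P(y_x').
n = sample_size_linear([1.0, -1.0], [0.5, 0.5], margin=0.05)
print(n)
```

A bound defined as a minimum or maximum over several such linear terms would need more care than this single-term calculation; one conservative option, again an assumption rather than the paper's method, is to take the largest sample size required by any individual term.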
Related papers
- Flow-Based Density Ratio Estimation for Intractable Distributions with Applications in Genomics [80.05951561886123]
We leverage condition-aware flow matching to derive a single dynamical formulation for tracking density ratios along generative trajectories. We demonstrate competitive performance on simulated benchmarks for closed-form ratio estimation, and show that our method supports versatile tasks in single-cell genomics data analysis.
arXiv Detail & Related papers (2026-02-27T17:27:55Z) - Efficient Covariance Estimation for Sparsified Functional Data [51.69796254617083]
The proposed Random-knots (Random-knots-Spatial) and B-spline (Bspline-Spatial) estimators of the covariance function are computationally efficient. Asymptotic pointwise of the covariance are obtained for sparsified individual trajectories under some regularity conditions.
arXiv Detail & Related papers (2025-11-23T00:50:33Z) - Cross-Validated Causal Inference: a Modern Method to Combine Experimental and Observational Data [48.72384067821617]
We develop new methods to integrate experimental and observational data in causal inference. A full model containing the causal parameter is obtained by minimizing a weighted combination of experimental and observational losses. Experiments on real and synthetic data show the efficacy and reliability of our method.
arXiv Detail & Related papers (2025-11-01T22:24:16Z) - Assessing One-Dimensional Cluster Stability by Extreme-Point Trimming [0.0]
We develop a probabilistic method for assessing the tail behavior and geometric stability of one-dimensional i.i.d. samples. We derive analytical expressions, including finite-sample corrections, for the expected shrinkage under both the uniform and Gaussian hypotheses. We further integrate our criterion into a clustering pipeline (e.g. DBSCAN), demonstrating its ability to validate one-dimensional clusters without any density estimation or parameter tuning.
arXiv Detail & Related papers (2025-08-29T21:52:15Z) - Invariant Causal Prediction with Local Models [52.161513027831646]
We consider the task of identifying the causal parents of a target variable among a set of candidates from observational data.
We introduce a practical method called L-ICP (Localized Invariant Causal Prediction), which is based on a hypothesis test for parent identification using a ratio of minimum and maximum statistics.
arXiv Detail & Related papers (2024-01-10T15:34:42Z) - Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score [62.54911162109439]
Adversarial detection aims to determine whether a given sample is an adversarial one based on the discrepancy between natural and adversarial distributions.
We propose a new statistic called expected perturbation score (EPS), which is essentially the expected score of a sample after various perturbations.
We develop EPS-based maximum mean discrepancy (MMD) as a metric to measure the discrepancy between the test sample and natural samples.
arXiv Detail & Related papers (2023-05-25T13:14:58Z) - Probabilities of Causation: Adequate Size of Experimental and Observational Samples [17.565045120151865]
Tian and Pearl derived sharp bounds for the probability of necessity and sufficiency (PNS), the probability of sufficiency (PS), and the probability of necessity (PN) using experimental and observational data.
The assumption is that one is in possession of a large enough sample to permit an accurate estimation of the experimental and observational distributions.
We present a method for determining the sample size needed for such estimation, when a given confidence interval (CI) is specified.
arXiv Detail & Related papers (2022-10-10T21:59:49Z) - Bayesian nonparametric estimation of coverage probabilities and distinct counts from sketched data [6.510507449705344]
We propose a nonparametric methodology to estimate coverage probabilities from data sketched through random hashing.
The proposed Bayesian estimators are shown to be easily applicable to large-scale analyses in combination with a Dirichlet process prior.
The empirical effectiveness of our methodology is demonstrated through numerical experiments and applications to real data sets of Covid DNA sequences, classic English literature, and IP addresses.
arXiv Detail & Related papers (2022-09-05T20:48:04Z) - Variance Minimization in the Wasserstein Space for Invariant Causal Prediction [72.13445677280792]
In this work, we show that the approach taken in ICP may be reformulated as a series of nonparametric tests that scales linearly in the number of predictors.
Each of these tests relies on the minimization of a novel loss function that is derived from tools in optimal transport theory.
We prove under mild assumptions that our method is able to recover the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
arXiv Detail & Related papers (2021-10-13T22:30:47Z) - On a Variational Approximation based Empirical Likelihood ABC Method [1.5293427903448025]
We propose an easy-to-use empirical likelihood ABC method in this article.
We show that the target log-posterior can be approximated as a sum of an expected joint log-likelihood and the differential entropy of the data generating density.
arXiv Detail & Related papers (2020-11-12T21:24:26Z) - Tracking disease outbreaks from sparse data with Bayesian inference [55.82986443159948]
The COVID-19 pandemic provides new motivation for estimating the empirical rate of transmission during an outbreak.
Standard methods struggle to accommodate the partial observability and sparse data common at finer scales.
We propose a Bayesian framework which accommodates partial observability in a principled manner.
arXiv Detail & Related papers (2020-09-12T20:37:33Z) - Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling [9.66840768820136]
Inverse binomial sampling (IBS) can estimate the log-likelihood of an entire data set efficiently and without bias.
IBS produces lower error in the estimated parameters and maximum log-likelihood values than alternative sampling methods.
arXiv Detail & Related papers (2020-01-12T19:51:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.