Score-based Generative Modeling for Conditional Independence Testing
- URL: http://arxiv.org/abs/2505.23309v1
- Date: Thu, 29 May 2025 10:10:46 GMT
- Title: Score-based Generative Modeling for Conditional Independence Testing
- Authors: Yixin Ren, Chenghou Jin, Yewei Xia, Li Ke, Longtao Huang, Hui Xue, Hao Zhang, Jihong Guan, Shuigeng Zhou
- Abstract summary: We propose a novel CI testing method via score-based generative modeling, which achieves precise Type I error control and strong testing power. We theoretically establish the error bound of conditional distributions modeled by score-based generative models and prove the validity of our CI tests.
- Score: 35.0533359302886
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Determining conditional independence (CI) relationships between random variables is a fundamental yet challenging task in machine learning and statistics, especially in high-dimensional settings. Existing generative model-based CI testing methods, such as those utilizing generative adversarial networks (GANs), often struggle with undesirable modeling of conditional distributions and training instability, resulting in subpar performance. To address these issues, we propose a novel CI testing method via score-based generative modeling, which achieves precise Type I error control and strong testing power. Concretely, we first employ a sliced conditional score matching scheme to accurately estimate conditional score and use Langevin dynamics conditional sampling to generate null hypothesis samples, ensuring precise Type I error control. Then, we incorporate a goodness-of-fit stage into the method to verify generated samples and enhance interpretability in practice. We theoretically establish the error bound of conditional distributions modeled by score-based generative models and prove the validity of our CI tests. Extensive experiments on both synthetic and real-world datasets show that our method significantly outperforms existing state-of-the-art methods, providing a promising way to revitalize generative model-based CI testing.
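The abstract's pipeline (estimate the conditional score, draw null-hypothesis samples with Langevin dynamics, then compare a test statistic against those samples) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the conditional score is the closed-form score of a toy Gaussian model X | Z ~ N(a·Z, sigma²) rather than a network trained by sliced conditional score matching, the test statistic is a plain absolute correlation, the goodness-of-fit stage is omitted, and the function names (`conditional_score`, `langevin_conditional_sample`, `ci_test_pvalue`) are hypothetical.

```python
import numpy as np


def conditional_score(x, z, a=1.0, sigma=1.0):
    """Analytic score d/dx log p(x | z) for the toy model X | Z ~ N(a*z, sigma^2).

    Assumption: this stands in for a learned conditional score network.
    """
    return -(x - a * z) / sigma**2


def langevin_conditional_sample(z, rng, n_steps=200, step=0.05, a=1.0, sigma=1.0):
    """Draw one approximate sample of X | Z = z per entry of z (unadjusted Langevin)."""
    x = rng.standard_normal(z.shape)  # arbitrary initialisation
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        # x <- x + (step / 2) * score(x | z) + sqrt(step) * noise
        x = x + 0.5 * step * conditional_score(x, z, a, sigma) + np.sqrt(step) * noise
    return x


def abs_corr(x, y):
    """Illustrative test statistic: absolute Pearson correlation."""
    return abs(np.corrcoef(x, y)[0, 1])


def ci_test_pvalue(x, y, z, rng, n_null=99, a=1.0, sigma=1.0):
    """Monte Carlo p-value for H0: X independent of Y given Z.

    Null statistics are computed on copies of X regenerated from p(X | Z),
    which are conditionally independent of Y by construction.
    """
    t_obs = abs_corr(x, y)
    t_null = np.array([
        abs_corr(langevin_conditional_sample(z, rng, a=a, sigma=sigma), y)
        for _ in range(n_null)
    ])
    # The +1 terms count the observed statistic among the null draws.
    return (1 + np.sum(t_null >= t_obs)) / (1 + n_null)
```

With data simulated from the same toy model, calling `ci_test_pvalue(x, y, z, np.random.default_rng(0))` returns a value in (0, 1]; small values are evidence against conditional independence. The finite step size makes the Langevin samples slightly biased, which is one reason a learned-score pipeline benefits from checking the generated samples before testing.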
Related papers
- Consistency-based Abductive Reasoning over Perceptual Errors of Multiple Pre-trained Models in Novel Environments [5.5855749614100825]
This paper addresses the hypothesis that leveraging multiple pre-trained models can mitigate this recall reduction. We formulate the challenge of identifying and managing conflicting predictions from various models as a consistency-based abduction problem. Our results validate the use of consistency-based abduction as an effective mechanism to robustly integrate knowledge from multiple imperfect reasoners in challenging, novel scenarios.
arXiv Detail & Related papers (2025-05-25T23:17:47Z) - On conditional diffusion models for PDE simulations [53.01911265639582]
We study score-based diffusion models for forecasting and assimilation of sparse observations.
We propose an autoregressive sampling approach that significantly improves performance in forecasting.
We also propose a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths.
arXiv Detail & Related papers (2024-10-21T18:31:04Z) - Deep anytime-valid hypothesis testing [29.273915933729057]
We propose a general framework for constructing powerful, sequential hypothesis tests for nonparametric testing problems.
We develop a principled approach of leveraging the representation capability of machine learning models within the testing-by-betting framework.
Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines.
arXiv Detail & Related papers (2023-10-30T09:46:19Z) - Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution [67.9215891673174]
We propose score entropy as a novel loss that naturally extends score matching to discrete spaces.
We test our Score Entropy Discrete Diffusion models on standard language modeling tasks.
arXiv Detail & Related papers (2023-10-25T17:59:12Z) - Can You Rely on Your Model Evaluation? Improving Model Evaluation with Synthetic Test Data [75.20035991513564]
We introduce 3S Testing, a deep generative modeling framework to facilitate model evaluation.
Our experiments demonstrate that 3S Testing outperforms traditional baselines.
These results raise the question of whether we need a paradigm shift away from limited real test data towards synthetic test data.
arXiv Detail & Related papers (2023-10-25T10:18:44Z) - Learning to Increase the Power of Conditional Randomization Tests [8.883733362171032]
The model-X conditional randomization test is a generic framework for conditional independence testing.
We introduce novel model-fitting schemes that are designed to explicitly improve the power of model-X tests.
arXiv Detail & Related papers (2022-07-03T12:29:25Z) - A Simple Unified Approach to Testing High-Dimensional Conditional Independences for Categorical and Ordinal Data [0.26651200086513094]
Conditional independence (CI) tests underlie many approaches to model testing and structure learning in causal inference.
Most existing CI tests for categorical and ordinal data stratify the sample by the conditioning variables, perform simple independence tests in each stratum, and combine the results.
Here we propose a simple unified CI test for ordinal and categorical data that maintains reasonable calibration and power in high dimensions.
arXiv Detail & Related papers (2022-06-09T08:56:12Z) - MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test time robustification, i.e., using the test input to improve model robustness.
Prior works have proposed methods for test time adaptation; however, each introduces additional assumptions.
We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
arXiv Detail & Related papers (2021-05-27T00:59:29Z) - Calibrating Over-Parametrized Simulation Models: A Framework via Eligibility Set [3.862247454265944]
We develop a framework for constructing calibration schemes that satisfy rigorous frequentist statistical guarantees.
We demonstrate our methodology on several numerical examples, including an application to calibration of a limit order book market simulator.
arXiv Detail & Related papers (2020-10-05T20:03:13Z) - Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z) - Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.