Testing Goodness of Fit of Conditional Density Models with Kernels
- URL: http://arxiv.org/abs/2002.10271v2
- Date: Tue, 30 Jun 2020 15:27:09 GMT
- Title: Testing Goodness of Fit of Conditional Density Models with Kernels
- Authors: Wittawat Jitkrittum, Heishiro Kanagawa, Bernhard Schölkopf
- Abstract summary: We propose two nonparametric statistical tests of goodness of fit for conditional distributions.
We show that our tests are consistent against any fixed alternative conditional model.
We demonstrate the interpretability of our test on a task of modeling the distribution of New York City's taxi drop-off location.
- Score: 16.003516725803774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose two nonparametric statistical tests of goodness of fit for
conditional distributions: given a conditional probability density function
$p(y|x)$ and a joint sample, decide whether the sample is drawn from
$p(y|x)r_x(x)$ for some density $r_x$. Our tests, formulated with a Stein
operator, can be applied to any differentiable conditional density model, and
require no knowledge of the normalizing constant. We show that 1) our tests are
consistent against any fixed alternative conditional model; 2) the statistics
can be estimated easily, requiring no density estimation as an intermediate
step; and 3) our second test offers an interpretable test result providing
insight on where the conditional model does not fit well in the domain of the
covariate. We demonstrate the interpretability of our test on a task of
modeling the distribution of New York City's taxi drop-off location given a
pick-up point. To our knowledge, our work is the first to propose such
conditional goodness-of-fit tests that simultaneously have all these desirable
properties.
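To make the idea of a Stein-operator statistic that needs no normalizing constant concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes Gaussian kernels on both the covariate and the response, a plain U-statistic estimate, and a toy model $p(y|x) = N(y; x, 1)$; the function names `gauss_gram` and `conditional_stein_ustat`, the bandwidths, and the sample sizes are all hypothetical choices for illustration. The model enters only through its score $\nabla_y \log p(y|x)$, so an unnormalized density is sufficient. Computing a rejection threshold (for example by a bootstrap) is left out.

```python
import numpy as np

def gauss_gram(A, B, sigma):
    """Gaussian kernel Gram matrix and the squared distances it is built from."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    sq = np.maximum(sq, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq / (2.0 * sigma**2)), sq

def conditional_stein_ustat(X, Y, score, sigma_x=1.0, sigma_y=1.0):
    """U-statistic of a kernel Stein discrepancy weighted by a kernel on x.

    score(X, Y) must return grad_y log p(y|x) row by row, shape (n, d).
    Assumption: the x-kernel simply multiplies the usual Stein kernel on y.
    """
    n, d = Y.shape
    S = score(X, Y)
    K, _ = gauss_gram(X, X, sigma_x)
    L, sq = gauss_gram(Y, Y, sigma_y)
    diff = Y[:, None, :] - Y[None, :, :]            # diff[i, j] = y_i - y_j
    t1 = (S @ S.T) * L                              # s_i^T s_j l(y_i, y_j)
    t2 = np.einsum('id,ijd->ij', S, diff) / sigma_y**2 * L   # s_i^T grad_{y_j} l
    t3 = -np.einsum('jd,ijd->ij', S, diff) / sigma_y**2 * L  # s_j^T grad_{y_i} l
    t4 = (d / sigma_y**2 - sq / sigma_y**4) * L     # trace of the mixed second derivative
    H = K * (t1 + t2 + t3 + t4)
    np.fill_diagonal(H, 0.0)                        # U-statistic: drop i == j terms
    return H.sum() / (n * (n - 1))

# Toy model p(y|x) = N(y; x, 1); its score in y is -(y - x).
def toy_score(X, Y):
    return -(Y - X)

rng = np.random.default_rng(0)
n = 300
X = rng.normal(size=(n, 1))
Y_good = X + rng.normal(size=(n, 1))        # drawn from the model
Y_bad = X + 2.0 + rng.normal(size=(n, 1))   # conditional mean shifted by 2

stat_good = conditional_stein_ustat(X, Y_good, toy_score)
stat_bad = conditional_stein_ustat(X, Y_bad, toy_score)
print(stat_good, stat_bad)
```

Under these assumptions the statistic should stay close to zero when the sample follows the model and become clearly positive when the conditional mean is shifted, which is the behavior a goodness-of-fit test of this kind relies on.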
Related papers
- General Frameworks for Conditional Two-Sample Testing [3.3317825075368908]
We study the problem of conditional two-sample testing, which aims to determine whether two populations have the same distribution after accounting for confounding factors.
This problem commonly arises in various applications, such as domain adaptation and algorithmic fairness.
We introduce two general frameworks that implicitly or explicitly target specific classes of distributions for their validity and power.
arXiv Detail & Related papers (2024-10-22T02:27:32Z)
- Doubly Robust Conditional Independence Testing with Generative Neural Networks [8.323172773256449]
This article addresses the problem of testing the conditional independence of two generic random vectors $X$ and $Y$ given a third random vector $Z$.
We propose a new non-parametric testing procedure that avoids explicitly estimating any conditional distributions.
arXiv Detail & Related papers (2024-07-25T01:28:59Z)
- A Conditional Independence Test in the Presence of Discretization [14.917729593550199]
Existing test methods can't work when only discretized observations are available.
We propose a conditional independence test specifically designed to accommodate the presence of such discretization.
arXiv Detail & Related papers (2024-04-26T18:08:15Z)
- Collaborative non-parametric two-sample testing [55.98760097296213]
The goal is to identify nodes where the null hypothesis $p_v = q_v$ should be rejected.
We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure.
Our methodology integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning.
arXiv Detail & Related papers (2024-02-08T14:43:56Z)
- Sobolev Space Regularised Pre Density Models [51.558848491038916]
We propose a new approach to non-parametric density estimation that is based on regularizing a Sobolev norm of the density.
This method is statistically consistent, and makes the inductive validation model clear and consistent.
arXiv Detail & Related papers (2023-07-25T18:47:53Z)
- User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems [49.75149094527068]
We show that diffusion models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems.
We develop a probabilistic approximation scheme for the conditional score function which converges to the true distribution as the noise level decreases.
We are able to sample conditionally on nonlinear user-defined events at inference time, and match data statistics even when sampling from the tails of the distribution.
arXiv Detail & Related papers (2023-06-13T03:42:03Z)
- Nearest-Neighbor Sampling Based Conditional Independence Testing [15.478671471695794]
The conditional randomization test (CRT) was recently proposed to test whether two random variables X and Y are conditionally independent given random variables Z.
The aim of this paper is to develop a novel alternative to the CRT by using nearest-neighbor sampling without assuming the exact form of the distribution of X given Z.
arXiv Detail & Related papers (2023-04-09T07:54:36Z)
- Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z)
- On the Generative Utility of Cyclic Conditionals [103.1624347008042]
We study whether and how we can model a joint distribution $p(x,z)$ using two conditional models $p(x|z)$ and $q(z|x)$ that form a cycle.
We propose the CyGen framework for cyclic-conditional generative modeling, including methods to enforce compatibility and use the determined distribution to fit and generate data.
arXiv Detail & Related papers (2021-06-30T10:23:45Z)
- Hypothesis Testing for Equality of Latent Positions in Random Graphs [0.2741266294612775]
We consider the hypothesis testing problem that two vertices $i$ and $j$ have the same latent positions, possibly up to scaling.
We propose several test statistics based on the empirical Mahalanobis distances between the $i$th and $j$th rows of either the adjacency or the normalized Laplacian spectral embedding of the graph.
Using these test statistics, we address the model selection problem of choosing between the standard block model and its degree-corrected variant.
arXiv Detail & Related papers (2021-05-23T01:27:23Z)
- Density of States Estimation for Out-of-Distribution Detection [69.90130863160384]
DoSE is the density of states estimator.
We demonstrate DoSE's state-of-the-art performance against other unsupervised OOD detectors.
arXiv Detail & Related papers (2020-06-16T16:06:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.