On Different Notions of Redundancy in Conditional-Independence-Based Discovery of Graphical Models
- URL: http://arxiv.org/abs/2502.08531v1
- Date: Wed, 12 Feb 2025 16:08:48 GMT
- Title: On Different Notions of Redundancy in Conditional-Independence-Based Discovery of Graphical Models
- Authors: Philipp M. Faller, Dominik Janzing,
- Abstract summary: We show that due to the conciseness of the graphical representation, there are often many tests that are not used in the construction of the graph.<n>We show that not all tests contain this additional information and that such redundant tests have to be applied with care.
- Score: 11.384020227038961
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The goal of conditional-independence-based discovery of graphical models is to find a graph that represents the independence structure of variables in a given dataset. To learn such a representation, conditional-independence-based approaches conduct a set of statistical tests that suffices to identify the graphical representation under some assumptions on the underlying distribution of the data. In this work, we highlight that due to the conciseness of the graphical representation, there are often many tests that are not used in the construction of the graph. These redundant tests have the potential to detect or sometimes correct errors in the learned model. We show that not all tests contain this additional information and that such redundant tests have to be applied with care. Precisely, we argue that particularly those conditional (in)dependence statements are interesting that follow only from graphical assumptions but do not hold for every probability distribution.
Related papers
- Hypothesis Testing over Observable Regimes in Singular Models [0.12183405753834557]
We show that the fundamental obstruction to testing in singular statistical models is not singularity itself, but the formulation of hypotheses on non-identifiable parameter quantities.<n>We formalize this overlap obstruction and show that hypotheses depending on non-identifiable parameter functions necessarily fail in this sense.<n>In contrast, hypotheses formulated over identifiable observables-quantities that are determined by the induced distribution-reduce entirely to classical testing theory.
arXiv Detail & Related papers (2026-02-27T16:44:29Z) - A Sample Efficient Conditional Independence Test in the Presence of Discretization [54.047334792855345]
Conditional Independence (CI) tests directly to discretized data can lead to incorrect conclusions.<n>Recent advancements have sought to infer the correct CI relationship between the latent variables through binarizing observed data.<n>Motivated by this, this paper introduces a sample-efficient CI test that does not rely on the binarization process.
arXiv Detail & Related papers (2025-06-10T12:41:26Z) - Internal Incoherency Scores for Constraint-based Causal Discovery Algorithms [12.524536193679124]
We propose internal coherency scores that allow testing for assumption violations and finite sample errors.<n>We illustrate our coherency scores on the PC algorithm with simulated and real-world datasets.
arXiv Detail & Related papers (2025-02-20T16:44:54Z) - Out-of-Distribution Detection on Graphs: A Survey [58.47395497985277]
Graph out-of-distribution (GOOD) detection focuses on identifying graph data that deviates from the distribution seen during training.<n>We categorize existing methods into four types: enhancement-based, reconstruction-based, information propagation-based, and classification-based approaches.<n>We discuss practical applications and theoretical foundations, highlighting the unique challenges posed by graph data.
arXiv Detail & Related papers (2025-02-12T04:07:12Z) - Testing Dependency of Weighted Random Graphs [4.0554893636822]
We study the task of detecting the edge dependency between two random graphs.
For general edge-weight distributions, we establish thresholds at which optimal testing becomes information-theoretically possible or impossible.
arXiv Detail & Related papers (2024-09-23T10:07:41Z) - Learning Latent Graph Structures and their Uncertainty [63.95971478893842]
We show that minimizing point-prediction losses does not guarantee proper learning of latent relational information.<n>We propose a sampling-based method that solves this joint learning task.
arXiv Detail & Related papers (2024-05-30T10:49:22Z) - Towards Self-Interpretable Graph-Level Anomaly Detection [73.1152604947837]
Graph-level anomaly detection (GLAD) aims to identify graphs that exhibit notable dissimilarity compared to the majority in a collection.
We propose a Self-Interpretable Graph aNomaly dETection model ( SIGNET) that detects anomalous graphs as well as generates informative explanations simultaneously.
arXiv Detail & Related papers (2023-10-25T10:10:07Z) - Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing.
We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z) - GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection [67.90365841083951]
We develop a new graph contrastive learning framework GOOD-D for detecting OOD graphs without using any ground-truth labels.
GOOD-D is able to capture the latent ID patterns and accurately detect OOD graphs based on the semantic inconsistency in different granularities.
As a pioneering work in unsupervised graph-level OOD detection, we build a comprehensive benchmark to compare our proposed approach with different state-of-the-art methods.
arXiv Detail & Related papers (2022-11-08T12:41:58Z) - DAGAD: Data Augmentation for Graph Anomaly Detection [57.92471847260541]
This paper devises a novel Data Augmentation-based Graph Anomaly Detection (DAGAD) framework for attributed graphs.
A series of experiments on three datasets prove that DAGAD outperforms ten state-of-the-art baseline detectors concerning various mostly-used metrics.
arXiv Detail & Related papers (2022-10-18T11:28:21Z) - Null Hypothesis Test for Anomaly Detection [0.0]
We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis.
By testing for statistical independence of the two discriminating dataset regions, we are able exclude the background-only hypothesis without relying on fixed anomaly score cuts or extrapolations of background estimates between regions.
arXiv Detail & Related papers (2022-10-05T13:03:55Z) - AZ-whiteness test: a test for uncorrelated noise on spatio-temporal
graphs [19.407150082045636]
We present the first whiteness test for graphs, i.e., a serial whiteness test for a serial time series associated with the nodes of a graph.
We show how the test can be employed to assess the quality-temporal forecasting models by analyzing the prediction residuals to the graphs stream.
arXiv Detail & Related papers (2022-04-23T19:43:19Z) - Generalization bounds for learning under graph-dependence: A survey [4.220336689294245]
We explore learning scenarios where examples are dependent and their dependence relationship is described by a dependency graph.
We collect various graph-dependent concentration bounds, which are then used to derive Rademacher complexity and stability bounds generalization.
To our knowledge, this survey is the first of this kind on this subject.
arXiv Detail & Related papers (2022-03-25T09:33:49Z) - Model-agnostic out-of-distribution detection using combined statistical
tests [15.27980070479021]
We present simple methods for out-of-distribution detection using a trained generative model.
We combine a classical parametric test (Rao's score test) with the recently introduced typicality test.
Despite their simplicity and generality, these methods can be competitive with model-specific out-of-distribution detection algorithms.
arXiv Detail & Related papers (2022-03-02T13:32:09Z) - Learning Invariant Representations with Missing Data [18.307438471163774]
Models that satisfy particular independencies involving correlation-inducing textitnuisance variables have guarantees on their test performance.
We derive acrshortmmd estimators used for invariance objectives under missing nuisances.
On simulations and clinical data, optimizing through these estimates achieves test performance similar to using estimators that make use of the full data.
arXiv Detail & Related papers (2021-12-01T23:14:34Z) - Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z) - Typing assumptions improve identification in causal discovery [123.06886784834471]
Causal discovery from observational data is a challenging task to which an exact solution cannot always be identified.
We propose a new set of assumptions that constrain possible causal relationships based on the nature of the variables.
arXiv Detail & Related papers (2021-07-22T14:23:08Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $varepsilon*$, which deviates substantially from the test error of worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Density of States Estimation for Out-of-Distribution Detection [69.90130863160384]
DoSE is the density of states estimator.
We demonstrate DoSE's state-of-the-art performance against other unsupervised OOD detectors.
arXiv Detail & Related papers (2020-06-16T16:06:25Z) - Stable Prediction via Leveraging Seed Variable [73.9770220107874]
Previous machine learning methods might exploit subtly spurious correlations in training data induced by non-causal variables for prediction.
We propose a conditional independence test based algorithm to separate causal variables with a seed variable as priori, and adopt them for stable prediction.
Our algorithm outperforms state-of-the-art methods for stable prediction.
arXiv Detail & Related papers (2020-06-09T06:56:31Z) - Out-of-Sample Representation Learning for Multi-Relational Graphs [8.956321788625894]
We study the out-of-sample representation learning problem for non-attributed knowledge graphs.
We create benchmark datasets for this task, develop several models and baselines, and provide empirical analyses and comparisons of the proposed models and baselines.
arXiv Detail & Related papers (2020-04-28T00:53:01Z) - Full Law Identification In Graphical Models Of Missing Data:
Completeness Results [13.299431908881425]
We provide the first completeness result in this field of study.
We then address issues that may arise due to the presence of both missing data and unmeasured confounding.
arXiv Detail & Related papers (2020-04-10T01:31:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.