From Guess2Graph: When and How Can Unreliable Experts Safely Boost Causal Discovery in Finite Samples?
- URL: http://arxiv.org/abs/2510.14488v1
- Date: Thu, 16 Oct 2025 09:31:44 GMT
- Title: From Guess2Graph: When and How Can Unreliable Experts Safely Boost Causal Discovery in Finite Samples?
- Authors: Sujai Hiremath, Dominik Janzing, Philipp Faller, Patrick Blöbaum, Elke Kirschbaum, Shiva Prasad Kasiviswanathan, Kyra Gan
- Abstract summary: We propose the Guess2Graph framework, which uses expert guesses to guide the sequence of statistical tests rather than replacing them. We develop two instantiations of G2G: PC-Guess, which augments the PC algorithm, and gPC-Guess, a learning-augmented variant designed to better leverage high-quality expert input.
- Score: 20.68174733590345
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Causal discovery algorithms often perform poorly with limited samples. While integrating expert knowledge (including from LLMs) as constraints promises to improve performance, guarantees for existing methods require perfect predictions or uncertainty estimates, making them unreliable for practical use. We propose the Guess2Graph (G2G) framework, which uses expert guesses to guide the sequence of statistical tests rather than replacing them. This maintains statistical consistency while enabling performance improvements. We develop two instantiations of G2G: PC-Guess, which augments the PC algorithm, and gPC-Guess, a learning-augmented variant designed to better leverage high-quality expert input. Theoretically, both preserve correctness regardless of expert error, with gPC-Guess provably outperforming its non-augmented counterpart in finite samples when experts are "better than random." Empirically, both show monotonic improvement with expert accuracy, with gPC-Guess achieving significantly stronger gains.
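The abstract's core idea — expert guesses reorder which statistical tests run first, while every test is still performed — can be sketched with a toy PC-style skeleton search. The function names, the three-variable chain, and the independence oracle below are illustrative assumptions, not the paper's actual PC-Guess or gPC-Guess procedure; in particular, a real implementation would use finite-sample conditional-independence tests, where early correct removals shrink later conditioning sets and produce the finite-sample gains the paper analyzes.

```python
import itertools

# Hypothetical CI oracle for a chain X -> Y -> Z:
# X is independent of Z given Y; all other pairs are dependent.
def ci_oracle(x, y, cond):
    return {x, y} == {"X", "Z"} and "Y" in cond

def skeleton_with_guesses(nodes, ci, expert_edge_order):
    """Toy skeleton search: start fully connected, test edges in the
    expert-suggested order first, then all remaining edges. Every edge
    is still tested, so a wrong expert cannot remove a correct edge."""
    edges = {frozenset(p) for p in itertools.combinations(nodes, 2)}
    all_pairs = [frozenset(p) for p in itertools.combinations(nodes, 2)]
    order = list(expert_edge_order) + [e for e in all_pairs
                                       if e not in expert_edge_order]
    for edge in order:
        x, y = sorted(edge)
        others = [n for n in nodes if n not in edge]
        removed = False
        # Search conditioning sets of increasing size, as in PC.
        for k in range(len(others) + 1):
            for cond in itertools.combinations(others, k):
                if ci(x, y, cond):
                    edges.discard(edge)  # CI found: drop the edge
                    removed = True
                    break
            if removed:
                break
    return edges

# The expert (correctly) guesses that X-Z is the spurious edge.
result = skeleton_with_guesses(["X", "Y", "Z"], ci_oracle,
                               [frozenset({"X", "Z"})])
```

With the oracle above, the recovered skeleton keeps X-Y and Y-Z and drops X-Z regardless of the expert's ordering; the ordering only changes which tests run first, which is where finite-sample benefits arise.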
Related papers
- Extracting Uncertainty Estimates from Mixtures of Experts for Semantic Segmentation [9.817102014355617]
We show that well-calibrated predictive uncertainty estimates can be extracted from a mixture of experts (MoE) without architectural modifications. Our results show that MoEs yield more reliable uncertainty estimates than ensembles in terms of conditional correctness metrics. Our experiments on the Cityscapes dataset suggest that increasing the number of experts can further enhance uncertainty calibration.
arXiv Detail & Related papers (2025-09-05T05:30:53Z) - Less Greedy Equivalence Search [52.818153111470394]
Greedy Equivalence Search (GES) is a score-based algorithm for causal discovery from observational data. We develop Less Greedy Equivalence Search (LGES), a variant of GES that retains its theoretical guarantees while partially addressing these limitations.
arXiv Detail & Related papers (2025-06-27T15:39:48Z) - Robust and Computation-Aware Gaussian Processes [18.264598332579748]
We introduce Robust Computation-aware Gaussian Process (RCaGP), a novel GP model that combines a principled treatment of approximation-induced uncertainty with robust generalized Bayesian updating. Our model ensures more conservative and reliable uncertainty estimates, a property we rigorously demonstrate. Empirical results confirm that solving these challenges jointly leads to superior performance across both clean and outlier-contaminated settings.
arXiv Detail & Related papers (2025-05-27T12:49:14Z) - Learning to Defer for Causal Discovery with Imperfect Experts [59.071731337922664]
We propose L2D-CD, a method for gauging the correctness of expert recommendations and optimally combining them with data-driven causal discovery results. We evaluate L2D-CD on the canonical Tübingen pairs dataset and demonstrate its superior performance compared to both the causal discovery method and the expert used in isolation.
arXiv Detail & Related papers (2025-02-18T18:55:53Z) - Principled Bayesian Optimisation in Collaboration with Human Experts [23.988732776208053]
We consider a setup where experts provide advice through binary accept/reject recommendations (labels).
Experts' labels are often costly, requiring efficient use of their efforts, and can at the same time be unreliable.
We introduce the first principled approach that provides two key guarantees.
arXiv Detail & Related papers (2024-10-14T12:46:02Z) - Mixture of Weak & Strong Experts on Graphs [56.878757632521555]
We propose Mowst, a mixture of weak and strong experts on graphs. Mowst is easy to optimize and achieves strong expressive power.
On 4 backbone GNN architectures, Mowst shows significant accuracy improvement on 6 standard node classification benchmarks.
arXiv Detail & Related papers (2023-11-09T07:45:05Z) - On Preemption and Learning in Stochastic Scheduling [22.32180964593702]
We study single-machine scheduling of jobs, where each job belongs to a job type that determines its duration distribution.
We design algorithms that achieve sublinear excess cost, compared to the performance with known types, and prove lower bounds for the non-preemptive case.
arXiv Detail & Related papers (2022-05-31T11:19:32Z) - Sample-Efficient Optimisation with Probabilistic Transformer Surrogates [66.98962321504085]
This paper investigates the feasibility of employing state-of-the-art probabilistic transformers in Bayesian optimisation.
We observe two drawbacks stemming from their training procedure and loss definition, hindering their direct deployment as proxies in black-box optimisation.
We introduce two components: 1) a BO-tailored training prior supporting non-uniformly distributed points, and 2) a novel approximate posterior regulariser trading-off accuracy and input sensitivity to filter favourable stationary points for improved predictive performance.
arXiv Detail & Related papers (2022-05-27T11:13:17Z) - Trustworthy Long-Tailed Classification [41.45744960383575]
We propose a Trustworthy Long-tailed Classification (TLC) method to jointly conduct classification and uncertainty estimation.
Our TLC obtains the evidence-based uncertainty (EvU) and evidence for each expert, and then combines these uncertainties and evidences under the Dempster-Shafer Evidence Theory (DST).
The experimental results show that the proposed TLC outperforms the state-of-the-art methods and is trustworthy with reliable uncertainty.
arXiv Detail & Related papers (2021-11-17T10:52:36Z) - Towards More Fine-grained and Reliable NLP Performance Prediction [85.78131503006193]
We make two contributions to improving performance prediction for NLP tasks.
First, we examine performance predictors for holistic measures of accuracy like F1 or BLEU.
Second, we propose methods to understand the reliability of a performance prediction model from two angles: confidence intervals and calibration.
arXiv Detail & Related papers (2021-02-10T15:23:20Z) - Outlier-Robust Estimation: Hardness, Minimally Tuned Algorithms, and Applications [25.222024234900445]
This paper introduces two unifying formulations for outlier-robust estimation: Generalized Maximum Consensus (G-MC) and Generalized Truncated Least Squares (G-TLS).
Our first contribution is a proof that outlier-robust estimation is inapproximable: in the worst case, it is impossible to (even approximately) find the set of outliers.
We propose the first minimally tuned algorithms for outlier rejection, that dynamically decide how to separate inliers from outliers.
arXiv Detail & Related papers (2020-07-29T21:06:13Z) - Uncertainty quantification using martingales for misspecified Gaussian processes [52.22233158357913]
We address uncertainty quantification for Gaussian processes (GPs) under misspecified priors.
We construct a confidence sequence (CS) for the unknown function using martingale techniques.
Our CS is statistically valid and empirically outperforms standard GP methods.
arXiv Detail & Related papers (2020-06-12T17:58:59Z)