Competition is the key: A Game Theoretic Causal Discovery Approach
- URL: http://arxiv.org/abs/2510.20106v1
- Date: Thu, 23 Oct 2025 01:19:21 GMT
- Title: Competition is the key: A Game Theoretic Causal Discovery Approach
- Authors: Amartya Roy, Souvik Chakraborty,
- Abstract summary: We introduce a game-theoretic reinforcement learning framework for causal discovery.<n>A DDQN agent directly competes against a strong baseline (GES or GraN-DAG), always warm-starting from the opponent's solution.<n>This design yields three provable guarantees: the learned graph is never worse than the opponent, warm-starting strictly accelerates convergence, and most importantly, with high probability the algorithm selects the true best candidate graph.
- Score: 2.248800010440909
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Causal discovery remains a central challenge in machine learning, yet existing methods face a fundamental gap: algorithms like GES and GraN-DAG achieve strong empirical performance but lack finite-sample guarantees, while theoretically principled approaches fail to scale. We close this gap by introducing a game-theoretic reinforcement learning framework for causal discovery, where a DDQN agent directly competes against a strong baseline (GES or GraN-DAG), always warm-starting from the opponent's solution. This design yields three provable guarantees: the learned graph is never worse than the opponent, warm-starting strictly accelerates convergence, and most importantly, with high probability the algorithm selects the true best candidate graph. To the best of our knowledge, our result makes a first-of-its-kind progress in explaining such finite-sample guarantees in causal discovery: on synthetic SEMs (30 nodes), the observed error probability decays with n, tightly matching theory. On real-world benchmarks including Sachs, Asia, Alarm, Child, Hepar2, Dream, and Andes, our method consistently improves upon GES and GraN-DAG while remaining theoretically safe. Remarkably, it scales to large graphs such as Hepar2 (70 nodes), Dream (100 nodes), and Andes (220 nodes). Together, these results establish a new class of RL-based causal discovery algorithms that are simultaneously provably consistent, sample-efficient, and practically scalable, marking a decisive step toward unifying empirical performance with rigorous finite-sample theory.
Related papers
- HELP: HyperNode Expansion and Logical Path-Guided Evidence Localization for Accurate and Efficient GraphRAG [53.30561659838455]
Large Language Models (LLMs) often struggle with inherent knowledge boundaries and hallucinations.<n>Retrieval-Augmented Generation (RAG) frequently overlooks structural interdependencies essential for multi-hop reasoning.<n>Help achieves competitive performance across multiple simple and multi-hop QA benchmarks and up to a 28.8$times$ speedup over leading Graph-based RAG baselines.
arXiv Detail & Related papers (2026-02-24T14:05:29Z) - Retrieving Classes of Causal Orders with Inconsistent Knowledge Bases [0.8192907805418583]
Large Language Models (LLMs) have emerged as a promising alternative for extracting causal knowledge from text-based metadata.<n>LLMs tend to be unreliable and prone to hallucinations, necessitating strategies that account for their limitations.<n>We present a new method to derive a class of acyclic tournaments, which represent plausible causal orders.
arXiv Detail & Related papers (2024-12-18T16:37:51Z) - A Full DAG Score-Based Algorithm for Learning Causal Bayesian Networks with Latent Confounders [0.0]
Causal Bayesian networks (CBN) are popular graphical probabilistic models that encode causal relations among variables.
This paper introduces the first fully score-based structure learning algorithm searching the space of DAGs that is capable of identifying the presence of some latent confounders.
arXiv Detail & Related papers (2024-08-20T20:25:56Z) - Towards Better Out-of-Distribution Generalization of Neural Algorithmic
Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z) - Valid Inference After Causal Discovery [73.87055989355737]
We develop tools for valid post-causal-discovery inference.
We show that a naive combination of causal discovery and subsequent inference algorithms leads to highly inflated miscoverage rates.
arXiv Detail & Related papers (2022-08-11T17:40:45Z) - Robustification of Online Graph Exploration Methods [59.50307752165016]
We study a learning-augmented variant of the classical, notoriously hard online graph exploration problem.
We propose an algorithm that naturally integrates predictions into the well-known Nearest Neighbor (NN) algorithm.
arXiv Detail & Related papers (2021-12-10T10:02:31Z) - Uncoupled Bandit Learning towards Rationalizability: Benchmarks,
Barriers, and Algorithms [41.307340085194625]
We study the last-iterate convergence guarantee in general games toward rationalizability.
This learning task naturally generalizes best arm identification problems.
We develop a new algorithm that adjusts Exp3 with Diminishing Historical rewards.
arXiv Detail & Related papers (2021-11-10T02:10:07Z) - Intervention Efficient Algorithm for Two-Stage Causal MDPs [15.838256272508357]
We study Markov Decision Processes (MDP) wherein states correspond to causal graphs that generate rewards.
In this setup, the learner's goal is to identify atomic interventions that lead to high rewards by intervening on variables at each state.
Generalizing the recent causal-bandit framework, the current work develops (simple) regret minimization guarantees for two-stage causal MDPs.
arXiv Detail & Related papers (2021-11-01T12:22:37Z) - Efficient First-Order Contextual Bandits: Prediction, Allocation, and
Triangular Discrimination [82.52105963476703]
A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise.
First-order guarantees are relatively well understood in statistical and online learning.
We show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees.
arXiv Detail & Related papers (2021-07-05T19:20:34Z) - Efficient Robustness Certificates for Discrete Data: Sparsity-Aware
Randomized Smoothing for Graphs, Images and More [85.52940587312256]
We propose a model-agnostic certificate based on the randomized smoothing framework which subsumes earlier work and is tight, efficient, and sparsity-aware.
We show the effectiveness of our approach on a wide variety of models, datasets, and tasks -- specifically highlighting its use for Graph Neural Networks.
arXiv Detail & Related papers (2020-08-29T10:09:02Z) - Towards an Efficient and General Framework of Robust Training for Graph
Neural Networks [96.93500886136532]
Graph Neural Networks (GNNs) have made significant advances on several fundamental inference tasks.
Despite GNNs' impressive performance, it has been observed that carefully crafted perturbations on graph structures lead them to make wrong predictions.
We propose a general framework which leverages the greedy search algorithms and zeroth-order methods to obtain robust GNNs.
arXiv Detail & Related papers (2020-02-25T15:17:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.