Related papers: HOLOGRAPH: Active Causal Discovery via Sheaf-Theoretic Alignment of Large Language Model Priors

URL: http://arxiv.org/abs/2512.24478v2
Date: Fri, 02 Jan 2026 19:55:58 GMT
Title: HOLOGRAPH: Active Causal Discovery via Sheaf-Theoretic Alignment of Large Language Model Priors
Authors: Hyunjun Kim
Abstract summary: HOLOGRAPH is a framework that formalizes Large Language Model-guided causal discovery. Our key insight is that coherent global causal structure corresponds to the existence of a global section. Experiments on synthetic and real-world benchmarks demonstrate that HOLOGRAPH provides rigorous mathematical foundations.
Score: 12.969042037563971
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Causal discovery from observational data remains fundamentally limited by identifiability constraints. Recent work has explored leveraging Large Language Models (LLMs) as sources of prior causal knowledge, but existing approaches rely on heuristic integration that lacks theoretical grounding. We introduce HOLOGRAPH, a framework that formalizes LLM-guided causal discovery through sheaf theory--representing local causal beliefs as sections of a presheaf over variable subsets. Our key insight is that coherent global causal structure corresponds to the existence of a global section, while topological obstructions manifest as non-vanishing sheaf cohomology. We propose the Algebraic Latent Projection to handle hidden confounders and Natural Gradient Descent on the belief manifold for principled optimization. Experiments on synthetic and real-world benchmarks demonstrate that HOLOGRAPH provides rigorous mathematical foundations while achieving competitive performance on causal discovery tasks with 50-100 variables. Our sheaf-theoretic analysis reveals that while Identity, Transitivity, and Gluing axioms are satisfied to numerical precision (<10^{-6}), the Locality axiom fails for larger graphs, suggesting fundamental non-local coupling in latent variable projections. Code is available at [https://github.com/hyunjun1121/holograph](https://github.com/hyunjun1121/holograph).
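As a rough illustration of the gluing idea described in the abstract, the sketch below treats each local causal belief as a weighted edge set over a subset of variables and checks whether the beliefs agree on every overlap. This is a minimal sketch under strong assumptions, not the HOLOGRAPH implementation: the names `restrict`, `gluing_residual`, and `glue` are hypothetical, and the restriction map here simply drops edges that leave the subset, whereas the paper describes an Algebraic Latent Projection for that step.

```python
from itertools import combinations


def restrict(edges, subset):
    """Naive restriction (an assumption): keep edges with both endpoints in `subset`."""
    return {(u, v): w for (u, v), w in edges.items() if u in subset and v in subset}


def gluing_residual(sections):
    """Largest disagreement between any two local sections on their shared variables."""
    worst = 0.0
    for (vars_a, edges_a), (vars_b, edges_b) in combinations(sections, 2):
        shared = vars_a & vars_b
        res_a, res_b = restrict(edges_a, shared), restrict(edges_b, shared)
        # A missing edge is treated as weight 0.0.
        for key in res_a.keys() | res_b.keys():
            worst = max(worst, abs(res_a.get(key, 0.0) - res_b.get(key, 0.0)))
    return worst


def glue(sections, tol=1e-6):
    """Merge local sections into one edge set if they agree on overlaps, else None."""
    if gluing_residual(sections) > tol:
        return None
    glued = {}
    for _, edges in sections:
        glued.update(edges)
    return glued


# Two local beliefs that agree on the shared pair {B, C}.
local_abc = ({"A", "B", "C"}, {("A", "B"): 1.0, ("B", "C"): 1.0})
local_bcd = ({"B", "C", "D"}, {("B", "C"): 1.0, ("C", "D"): 1.0})
consistent = glue([local_abc, local_bcd])

# A local belief that reverses the B-C edge, so no global section exists here.
local_bcd_flipped = ({"B", "C", "D"}, {("C", "B"): 1.0, ("C", "D"): 1.0})
inconsistent = glue([local_abc, local_bcd_flipped])
```

In this toy setting, `consistent` is the union of the two edge sets and `inconsistent` is `None`. Pairs of variables that never appear together in any local section (such as A and D) are left unconstrained by the check.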

Related papers

On Multi-Step Theorem Prediction via Non-Parametric Structural Priors [50.16583672681106]
In this work, we explore training-free theorem prediction through the lens of in-context learning (ICL). We propose Theorem Precedence Graphs, which encode temporal dependencies from historical solution traces as directed graphs, and impose explicit topological constraints that effectively prune the search space during inference. Experiments on the FormalGeo7k benchmark show that our method achieves 89.29% accuracy, substantially outperforming ICL baselines and matching state-of-the-art supervised models.
arXiv Detail & Related papers (2026-03-05T06:08:50Z)
The Causal Round Trip: Generating Authentic Counterfactuals by Eliminating Information Loss [4.166536642958902]
We introduce BELM-MDCM, the first diffusion-based framework engineered to be causally sound by eliminating the Structural Reconstruction Error (SRE). Our work reconciles the power of modern generative models with the rigor of classical causal theory.
arXiv Detail & Related papers (2025-11-07T13:37:23Z)
TopInG: Topologically Interpretable Graph Learning via Persistent Rationale Filtration [10.830399323047265]
We propose TopInG: Topologically Interpretable Graph Learning, a novel framework to identify persistent rationale subgraphs. TopInG employs a rationale filtration learning approach to model an autoregressive generation process of rationale subgraphs. Our approach improves upon state-of-the-art methods on both predictive accuracy and interpretation quality.
arXiv Detail & Related papers (2025-10-06T17:59:44Z)
HypoChainer: A Collaborative System Combining LLMs and Knowledge Graphs for Hypothesis-Driven Scientific Discovery [4.020865072189471]
We propose HypoChainer, a visualization framework that integrates human expertise, knowledge graphs, and reasoning. HypoChainer operates in three stages: First, exploration and contextualization -- experts use retrieval-augmented LLMs (RAGs) and dimensionality reduction. Second, hypothesis chain formation -- experts iteratively examine KG relationships around predictions and semantically linked entities. Third, validation prioritization -- refined hypotheses are filtered based on KG-supported evidence to identify high-priority candidates for experimentation.
arXiv Detail & Related papers (2025-07-23T05:02:54Z)
Can Large Language Models Help Experimental Design for Causal Discovery? [94.66802142727883]
Large Language Model Guided Intervention Targeting (LeGIT) is a robust framework that effectively incorporates LLMs to augment existing numerical approaches for the intervention targeting in causal discovery. LeGIT demonstrates significant improvements and robustness over existing methods and even surpasses humans.
arXiv Detail & Related papers (2025-03-03T03:43:05Z)
Retrieving Classes of Causal Orders with Inconsistent Knowledge Bases [0.8192907805418583]
Large Language Models (LLMs) have emerged as a promising alternative for extracting causal knowledge from text-based metadata. LLMs tend to be unreliable and prone to hallucinations, necessitating strategies that account for their limitations. We present a new method to derive a class of acyclic tournaments, which represent plausible causal orders.
arXiv Detail & Related papers (2024-12-18T16:37:51Z)
Graph Stochastic Neural Process for Inductive Few-shot Knowledge Graph Completion [63.68647582680998]
We focus on a task called inductive few-shot knowledge graph completion (I-FKGC). Inspired by the idea of inductive reasoning, we cast I-FKGC as an inductive reasoning problem. We present a neural process-based hypothesis extractor that models the joint distribution of hypotheses, from which we can sample a hypothesis for predictions. In the second module, based on the hypothesis, we propose a graph attention-based predictor to test if the triple in the query set aligns with the extracted hypothesis.
arXiv Detail & Related papers (2024-08-03T13:37:40Z)
SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models [70.01883340129204]
Spatial reasoning is a crucial component of both biological and artificial intelligence. We present a comprehensive study of the capability of current state-of-the-art large language models (LLMs) on spatial reasoning.
arXiv Detail & Related papers (2024-06-07T01:06:34Z)
Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data. We determine the types of distribution shifts that do contribute to the identifiability of causal representations. We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z)
Discovering and Reasoning of Causality in the Hidden World with Large Language Models [109.62442253177376]
We develop a new framework termed Causal representatiOn AssistanT (COAT) to propose useful measured variables for causal discovery. Instead of directly inferring causality with Large Language Models (LLMs), COAT constructs feedback from intermediate causal discovery results to LLMs to refine the proposed variables.
arXiv Detail & Related papers (2024-02-06T12:18:54Z)
Identifiable Latent Polynomial Causal Models Through the Lens of Change [82.14087963690561]
Causal representation learning aims to unveil latent high-level causal representations from observed low-level data. One of its primary tasks is to provide reliable assurance of identifying these latent causal models, known as identifiability.
arXiv Detail & Related papers (2023-10-24T07:46:10Z)
Integrating Large Language Model for Improved Causal Discovery [25.50313039584238]
Large Language Models (LLMs) have been used for causal analysis across various domain-specific scenarios. We propose an error-tolerant LLM-driven causal discovery framework.
arXiv Detail & Related papers (2023-06-29T12:48:00Z)
Causal Discovery in Linear Structural Causal Models with Deterministic Relations [27.06618125828978]
We focus on the task of causal discovery from observational data. We derive a set of necessary and sufficient conditions for unique identifiability of the causal structure.
arXiv Detail & Related papers (2021-10-30T21:32:42Z)
