Operationalizing Longitudinal Causal Discovery Under Real-World Workflow Constraints
- URL: http://arxiv.org/abs/2602.23800v1
- Date: Fri, 27 Feb 2026 08:40:17 GMT
- Title: Operationalizing Longitudinal Causal Discovery Under Real-World Workflow Constraints
- Authors: Tadahisa Okuda, Shohei Shimizu, Thong Pham, Tatsuyoshi Ikenoue, Shingo Fukuma
- Abstract summary: Causal discovery has achieved substantial theoretical progress, yet its deployment in longitudinal systems remains limited. We describe a workflow-induced constraint class for longitudinal causal discovery that restricts the admissible directed acyclic graph space. We show that explicitly encoding workflow-consistent partial orders reduces structural ambiguity.
- Score: 2.593291716183273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Causal discovery has achieved substantial theoretical progress, yet its deployment in large-scale longitudinal systems remains limited. A key obstacle is that operational data are generated under institutional workflows whose induced partial orders are rarely formalized, enlarging the admissible graph space in ways inconsistent with the recording process. We characterize a workflow-induced constraint class for longitudinal causal discovery that restricts the admissible directed acyclic graph space through protocol-derived structural masks and timeline-aligned indexing. Rather than introducing a new optimization algorithm, we show that explicitly encoding workflow-consistent partial orders reduces structural ambiguity, especially in mixed discrete-continuous panels where within-time orientation is weakly identified. The framework combines workflow-derived admissible-edge constraints, measurement-aligned time indexing and block structure, bootstrap-based uncertainty quantification for lagged total effects, and a dynamic representation supporting intervention queries. In a nationwide annual health screening cohort in Japan with 107,261 individuals and 429,044 person-years, workflow-constrained longitudinal LiNGAM yields temporally consistent within-time substructures and interpretable lagged total effects with explicit uncertainty. Sensitivity analyses using alternative exposure and body-composition definitions preserve the main qualitative patterns. We argue that formalizing workflow-derived constraint classes improves structural interpretability without relying on domain-specific edge specification, providing a reproducible bridge between operational workflows and longitudinal causal discovery under standard identifiability assumptions.
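The abstract's idea of a protocol-derived structural mask can be illustrated with a small sketch. This is not the authors' implementation: the function name `admissible_edge_mask`, the variable names, and the two-block workflow (questionnaire before laboratory measurement) are hypothetical, and the rule used here (a source must not be recorded later than its target) is only one plausible reading of a workflow-consistent partial order.

```python
import numpy as np

def admissible_edge_mask(time_index, block_index):
    """Return a boolean matrix where mask[i, j] is True if edge j -> i is allowed.

    Assumed rule (illustrative only): a source j may point to a target i when
    j is recorded at an earlier wave, or at the same wave in the same or an
    earlier workflow block. Self-loops are always forbidden.
    """
    t = np.asarray(time_index)
    b = np.asarray(block_index)
    earlier_time = t[None, :] < t[:, None]        # t[j] < t[i]
    same_time = t[None, :] == t[:, None]          # t[j] == t[i]
    not_later_block = b[None, :] <= b[:, None]    # b[j] <= b[i]
    mask = earlier_time | (same_time & not_later_block)
    np.fill_diagonal(mask, False)
    return mask

# Hypothetical panel: two annual waves, two workflow blocks per wave
# (block 0 = questionnaire, block 1 = laboratory measurement).
names = ["smoking_t0", "bmi_t0", "smoking_t1", "bmi_t1"]
time_index = [0, 0, 1, 1]
block_index = [0, 1, 0, 1]

mask = admissible_edge_mask(time_index, block_index)
n_unconstrained = len(names) * (len(names) - 1)   # all ordered pairs
n_admissible = int(mask.sum())
print(f"{n_admissible} of {n_unconstrained} directed edges remain admissible")
```

In this toy panel the mask halves the candidate edge set before any estimation is run, which is the sense in which encoding the recording order shrinks the admissible graph space. Variables that share a wave and a block stay mutually admissible, so their orientation would still have to come from the data.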
Related papers
- On Multi-Step Theorem Prediction via Non-Parametric Structural Priors [50.16583672681106]
In this work, we explore training-free theorem prediction through the lens of in-context learning (ICL). We propose Theorem Precedence Graphs, which encode temporal dependencies from historical solution traces as directed graphs, and impose explicit topological constraints that effectively prune the search space during inference. Experiments on the FormalGeo7k benchmark show that our method achieves 89.29% accuracy, substantially outperforming ICL baselines and matching state-of-the-art supervised models.
arXiv Detail & Related papers (2026-03-05T06:08:50Z)
- LHAW: Controllable Underspecification for Long-Horizon Tasks [8.46227536869596]
We introduce LHAW (Long-Horizon Augmenteds), a modular, dataset-agnostic synthetic pipeline that transforms any well-specified task into controllable underspecified variants. Unlike approaches that rely on LLM predictions of ambiguity, LHAW validates variants through empirical agent trials, classifying them as outcome-critical, divergent, or benign based on observed terminal state divergence. We release 285 task variants from TheAgentCompany, SWE-Bench Pro and MCP-Atlas, measuring how current agents detect, reason about, and resolve underspecification across ambiguous settings.
arXiv Detail & Related papers (2026-02-11T04:49:50Z)
- Intrinsic Stability Limits of Autoregressive Reasoning: Structural Consequences for Long-Horizon Execution [0.0]
Large language models (LLMs) demonstrate remarkable reasoning capabilities, yet their performance often deteriorates sharply in long-horizon tasks. We propose that the fundamental constraint on long-horizon reasoning arises from process-level instability in autoregressive generation. Our findings suggest new limitations on maintaining long-term coherence under purely autoregressive architectures.
arXiv Detail & Related papers (2026-02-06T06:11:06Z)
- GLOW: Graph-Language Co-Reasoning for Agentic Workflow Performance Prediction [51.83437071408662]
We propose GLOW, a unified framework for agentic workflow (AW) performance prediction. GLOW combines the graph-structure modeling capabilities of GNNs with the reasoning power of LLMs. Experiments on FLORA-Bench show that GLOW outperforms state-of-the-art baselines in prediction accuracy and ranking utility.
arXiv Detail & Related papers (2025-12-11T13:30:46Z)
- Provable Benefit of Curriculum in Transformer Tree-Reasoning Post-Training [76.12556589212666]
We show that curriculum post-training avoids the exponential complexity bottleneck. Under outcome-only reward signals, reinforcement learning finetuning achieves high accuracy with sample complexity. We establish guarantees for test-time scaling, where curriculum-aware querying reduces both reward oracle calls and sampling cost from exponential to order.
arXiv Detail & Related papers (2025-11-10T18:29:54Z)
- R-ConstraintBench: Evaluating LLMs on NP-Complete Scheduling [0.0]
We present R-ConstraintBench, a framework that evaluates models on Resource-Constrained Project Scheduling Problems (RCPSP). We instantiate the benchmark in a data center migration setting and evaluate multiple LLMs using feasibility and error analysis. Empirically, strong models are near-ceiling on precedence-only DAGs, but feasibility performance collapses when downtime, temporal windows, and disjunctive constraints interact.
arXiv Detail & Related papers (2025-08-21T03:35:58Z)
- READER: Retrieval-Assisted Drafter for Efficient LLM Inference [0.0386965802948046]
Autoregressive Language Models instantiate a factorized likelihood over token sequences, yet their strictly sequential decoding process imposes an intrinsic lower bound on inference latency. This bottleneck has emerged as a central obstacle to the scalable deployment of large-scale generative models. We present READER, a speculative decoding framework that bypasses the training of the auxiliary draft model.
arXiv Detail & Related papers (2025-08-12T16:47:48Z)
- Enforcing Hard Linear Constraints in Deep Learning Models with Decision Rules [8.098452803458253]
This paper proposes a model-agnostic framework for enforcing input-dependent linear equality and inequality constraints on neural network outputs. The architecture combines a task network trained for prediction accuracy with a safe network trained using decision rules from the runtime and robust optimization to ensure feasibility across the entire input space.
arXiv Detail & Related papers (2025-05-20T03:09:44Z)
- Long Context In-Context Compression by Getting to the Gist of Gisting [50.24627831994713]
GistPool is an in-context compression method with no architectural modification to the decoder transformer. We demonstrate that gisting struggles with longer contexts, with significant performance drops even at minimal compression rates. GistPool preserves the simplicity of gisting, while significantly boosting its performance on long context compression tasks.
arXiv Detail & Related papers (2025-04-11T19:23:31Z)
- On the Identification of Temporally Causal Representation with Instantaneous Dependence [50.14432597910128]
Temporally causal representation learning aims to identify the latent causal process from time series observations.
Most methods require the assumption that the latent causal processes do not have instantaneous relations.
We propose an IDentification framework for instantaneOus Latent dynamics.
arXiv Detail & Related papers (2024-05-24T08:08:05Z)
- Validation Diagnostics for SBI algorithms based on Normalizing Flows [55.41644538483948]
This work proposes easy to interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on NF.
It also offers theoretical guarantees based on results of local consistency.
This work should help the design of better specified models or drive the development of novel SBI-algorithms.
arXiv Detail & Related papers (2022-11-17T15:48:06Z)