TRACE: Scalable Amortized Causal Discovery from Single Sequences via Autoregressive Density Estimation
- URL: http://arxiv.org/abs/2602.01135v1
- Date: Sun, 01 Feb 2026 10:18:27 GMT
- Title: TRACE: Scalable Amortized Causal Discovery from Single Sequences via Autoregressive Density Estimation
- Authors: Hugo Math, Rainer Lienhart,
- Abstract summary: We study causal discovery from a single observed sequence of discrete events generated by a process.<n>We introduce TRACE, a scalable framework that repurposes autoregressive models as pretrained density estimators for conditional mutual information estimation.
- Score: 14.409508347156397
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study causal discovery from a single observed sequence of discrete events generated by a stochastic process, as encountered in vehicle logs, manufacturing systems, or patient trajectories. This regime is particularly challenging due to the absence of repeated samples, high dimensionality, and long-range temporal dependencies of the single observation during inference. We introduce TRACE, a scalable framework that repurposes autoregressive models as pretrained density estimators for conditional mutual information estimation. TRACE infers the summary causal graph between event types in a sequence, scaling linearly with the event vocabulary and supporting delayed causal effects, while being fully parallel on GPUs. We establish its theoretical identifiability under imperfect autoregressive models. Experiments demonstrate robust performance across different baselines and varying vocabulary sizes including an application to root-cause analysis in vehicle diagnostics with over 29,100 event types.
Related papers
- How Focused Are LLMs? A Quantitative Study via Repetitive Deterministic Prediction Tasks [0.9338697277815541]
We investigate the performance of large language models on repetitive deterministic prediction tasks.<n>Our experiments reveal a sharp double exponential drop beyond a characteristic length scale.<n>This indicates that the models fail to execute each operation independently.
arXiv Detail & Related papers (2025-11-02T01:42:08Z) - Towards Practical Multi-label Causal Discovery in High-Dimensional Event Sequences via One-Shot Graph Aggregation [14.409508347156397]
CARGO is a scalable multi-label causal discovery method for sparse, high-dimensional event sequences.<n>It infers in parallel, per sequence one-shot causal graphs and aggregates them using an adaptive frequency fusion to reconstruct the global Markov boundaries of labels.<n>Our results on a challenging real-world automotive fault prediction dataset with over 29,100 unique event types and 474 imbalanced labels demonstrate CARGO's ability to perform structured reasoning.
arXiv Detail & Related papers (2025-09-23T14:58:50Z) - Spatial Reasoning with Denoising Models [49.83744014336816]
We introduce a framework to perform reasoning over sets of continuous variables via denoising generative models.<n>For the first time, that order of generation can successfully be predicted by the denoising network itself.<n>Using these findings, we can increase the accuracy of specific reasoning tasks from 1% to >50%.
arXiv Detail & Related papers (2025-02-28T14:08:30Z) - Large-Scale Targeted Cause Discovery via Learning from Simulated Data [66.51307552703685]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations.<n>We train a neural network using supervised learning on simulated data to infer causality.<n> Empirical results demonstrate superior performance in identifying causal relationships within large-scale gene regulatory networks.
arXiv Detail & Related papers (2024-08-29T02:21:11Z) - Sample, estimate, aggregate: A recipe for causal discovery foundation models [28.116832159265964]
Causal discovery has the potential to uncover mechanistic insights from biological experiments.<n>We propose a supervised model trained on large-scale, synthetic data to predict causal graphs.<n>Our approach is enabled by the observation that typical errors in the outputs of a discovery algorithm remain comparable across datasets.
arXiv Detail & Related papers (2024-02-02T21:57:58Z) - MLAD: A Unified Model for Multi-system Log Anomaly Detection [35.68387377240593]
We propose MLAD, a novel anomaly detection model that incorporates semantic relational reasoning across multiple systems.
Specifically, we employ Sentence-bert to capture the similarities between log sequences and convert them into highly-dimensional learnable semantic vectors.
We revamp the formulas of the Attention layer to discern the significance of each keyword in the sequence and model the overall distribution of the multi-system dataset.
arXiv Detail & Related papers (2024-01-15T12:51:13Z) - Graph Spatiotemporal Process for Multivariate Time Series Anomaly
Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graphtemporal process and anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z) - Causal Temporal Regime Structure Learning [49.77103348208835]
We present CASTOR, a novel method that concurrently learns the Directed Acyclic Graph (DAG) for each regime.<n>We establish the identifiability of the regimes and DAGs within our framework.<n>Experiments show that CASTOR consistently outperforms existing causal discovery models.
arXiv Detail & Related papers (2023-11-02T17:26:49Z) - Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data.
We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism.
We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
arXiv Detail & Related papers (2022-06-30T06:00:13Z) - Efficient Causal Inference from Combined Observational and
Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z) - A Multi-Channel Neural Graphical Event Model with Negative Evidence [76.51278722190607]
Event datasets are sequences of events of various types occurring irregularly over the time-line.
We propose a non-parametric deep neural network approach in order to estimate the underlying intensity functions.
arXiv Detail & Related papers (2020-02-21T23:10:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.