Towards Practical Multi-label Causal Discovery in High-Dimensional Event Sequences via One-Shot Graph Aggregation
- URL: http://arxiv.org/abs/2509.19112v1
- Date: Tue, 23 Sep 2025 14:58:50 GMT
- Title: Towards Practical Multi-label Causal Discovery in High-Dimensional Event Sequences via One-Shot Graph Aggregation
- Authors: Hugo Math, Rainer Lienhart
- Abstract summary: CARGO is a scalable multi-label causal discovery method for sparse, high-dimensional event sequences. It infers one-shot causal graphs per sequence in parallel and aggregates them using adaptive frequency fusion to reconstruct the global Markov boundaries of labels. Our results on a challenging real-world automotive fault prediction dataset with over 29,100 unique event types and 474 imbalanced labels demonstrate CARGO's ability to perform structured reasoning.
- Score: 14.409508347156397
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Understanding causality in event sequences, where outcome labels such as diseases or system failures arise from preceding events like symptoms or error codes, is critical, yet it remains an unsolved challenge across domains such as healthcare and vehicle diagnostics. We introduce CARGO, a scalable multi-label causal discovery method for sparse, high-dimensional event sequences comprising thousands of unique event types. Using two pretrained causal Transformers as domain-specific foundation models for event sequences, CARGO infers one-shot causal graphs per sequence in parallel and aggregates them using adaptive frequency fusion to reconstruct the global Markov boundaries of labels. This two-stage approach enables efficient probabilistic reasoning at scale while bypassing the intractable cost of full-dataset conditional independence testing. Our results on a challenging real-world automotive fault prediction dataset with over 29,100 unique event types and 474 imbalanced labels demonstrate CARGO's ability to perform structured reasoning.
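The aggregation stage described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the adaptive frequency fusion works, so the rule below is an assumption made purely for illustration: an event is kept in a label's global Markov boundary when the fraction of that label's sequences containing it exceeds a threshold that tightens for rarely observed labels. The function name `aggregate_markov_boundaries` and the parameters `base_threshold` and `min_support` are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def aggregate_markov_boundaries(per_sequence_graphs, base_threshold=0.5, min_support=2):
    """Fuse per-sequence label -> events graphs into one global graph.

    per_sequence_graphs: list of dicts, each mapping a label to the set of
    events found in its local (per-sequence) Markov boundary.
    NOTE: the thresholding rule is an illustrative assumption, not the
    fusion rule used by CARGO.
    """
    label_support = Counter()          # number of sequences in which each label occurs
    edge_counts = defaultdict(Counter) # per label: how often each event was selected

    for graph in per_sequence_graphs:
        for label, events in graph.items():
            label_support[label] += 1
            for event in events:
                edge_counts[label][event] += 1

    global_boundaries = {}
    for label, support in label_support.items():
        if support < min_support:
            # Too few sequences to trust any frequency estimate.
            global_boundaries[label] = set()
            continue
        # Assumed adaptive rule: rarer labels require a higher relative frequency.
        threshold = base_threshold + (1.0 - base_threshold) / support
        global_boundaries[label] = {
            event
            for event, count in edge_counts[label].items()
            if count / support >= threshold
        }
    return global_boundaries


if __name__ == "__main__":
    # Toy per-sequence graphs with made-up event and label names.
    graphs = [
        {"fault_A": {"e1", "e2"}},
        {"fault_A": {"e1"}},
        {"fault_A": {"e1", "e3"}},
        {"fault_A": {"e1", "e2"}, "fault_B": {"e9"}},
    ]
    print(aggregate_markov_boundaries(graphs))
```

In this toy input, `e1` appears in all four `fault_A` sequences and survives the threshold of 0.625, while `e2` (frequency 0.5) and `e3` (frequency 0.25) are dropped. `fault_B` occurs only once, which is below `min_support`, so its boundary is left empty. Because the per-sequence graphs are computed independently, the first stage can run in parallel and only these counts need to be combined.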
Related papers
- TRACE: Scalable Amortized Causal Discovery from Single Sequences via Autoregressive Density Estimation [14.409508347156397]
We study causal discovery from a single observed sequence of discrete events generated by a process. We introduce TRACE, a scalable framework that repurposes autoregressive models as pretrained density estimators for conditional mutual information estimation.
arXiv Detail & Related papers (2026-02-01T10:18:27Z) - One-Shot Multi-Label Causal Discovery in High-Dimensional Event Sequences [20.072624123275528]
We present OSCAR, a one-shot causal autoregressive method that infers per-sequence Markov Boundaries. On a real-world automotive dataset with 29,100 events and 474 labels, OSCAR recovers interpretable causal structures in minutes.
arXiv Detail & Related papers (2025-09-27T09:49:26Z) - DeCaFlow: A Deconfounding Causal Generative Model [58.411886466157185]
We introduce DeCaFlow, a deconfounding causal generative model. We extend previous results on causal estimation under hidden confounding to show that a single instance of DeCaFlow provides correct estimates for all causal queries identifiable with do-calculus. Our empirical results on diverse settings show that DeCaFlow outperforms existing approaches, while demonstrating its out-of-the-box applicability to any given causal graph.
arXiv Detail & Related papers (2025-03-19T11:14:16Z) - Detecting Anomalous Events in Object-centric Business Processes via
Graph Neural Networks [55.583478485027]
This study proposes a novel framework for anomaly detection in business processes.
We first reconstruct the process dependencies of the object-centric event logs as attributed graphs.
We then employ a graph convolutional autoencoder architecture to detect anomalous events.
arXiv Detail & Related papers (2024-02-14T14:17:56Z) - Abnormal Event Detection via Hypergraph Contrastive Learning [54.80429341415227]
Abnormal event detection plays an important role in many real applications.
In this paper, we study the unsupervised abnormal event detection problem in Attributed Heterogeneous Information Network.
A novel hypergraph contrastive learning method, named AEHCL, is proposed to fully capture abnormal event patterns.
arXiv Detail & Related papers (2023-04-02T08:23:20Z) - Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z) - Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of the systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z) - Granger Causal Chain Discovery for Sepsis-Associated Derangements via Continuous-Time Hawkes Processes [10.410454851418548]
We develop a scalable two-phase gradient-based method to obtain a maximum surrogate-likelihood estimator.
Our method is extended to a data set of patients admitted to Grady hospital system in Atlanta, GA, USA, where the estimated GC graph identifies several highly interpretable GC chains that precede sepsis.
arXiv Detail & Related papers (2022-09-09T18:21:30Z) - Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data.
We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism.
We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
arXiv Detail & Related papers (2022-06-30T06:00:13Z) - Anomaly Detection for Aggregated Data Using Multi-Graph Autoencoder [21.81622481466591]
We focus on creating anomaly detection models for system logs.
We present a thorough analysis of the aggregated data and the relationships between aggregated events.
We propose the multi-graph autoencoder (MGAE), a novel convolutional graph autoencoder model.
arXiv Detail & Related papers (2021-01-11T17:38:42Z) - Recomposition vs. Prediction: A Novel Anomaly Detection for Discrete Events Based On Autoencoder [5.781280693720236]
One of the most challenging problems in the field of intrusion detection is anomaly detection for discrete event logs.
We propose DabLog, a Deep Autoencoder-Based anomaly detection method for discrete event Logs.
Our approach determines whether a sequence is normal or abnormal by analyzing (encoding) and reconstructing (decoding) the given sequence.
arXiv Detail & Related papers (2020-12-27T16:31:05Z) - Multi-Scale One-Class Recurrent Neural Networks for Discrete Event Sequence Anomaly Detection [63.825781848587376]
We propose OC4Seq, a one-class recurrent neural network for detecting anomalies in discrete event sequences.
Specifically, OC4Seq embeds the discrete event sequences into latent spaces, where anomalies can be easily detected.
arXiv Detail & Related papers (2020-08-31T04:48:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.