TimeGraph: Synthetic Benchmark Datasets for Robust Time-Series Causal Discovery
- URL: http://arxiv.org/abs/2506.01361v1
- Date: Mon, 02 Jun 2025 06:34:11 GMT
- Title: TimeGraph: Synthetic Benchmark Datasets for Robust Time-Series Causal Discovery
- Authors: Muhammad Hasan Ferdous, Emam Hossain, Md Osman Gani
- Abstract summary: We introduce TimeGraph, a comprehensive suite of synthetic time-series benchmark datasets. Each dataset is accompanied by a fully specified causal graph featuring varying densities and diverse noise distributions. We demonstrate the utility of TimeGraph through systematic evaluations of state-of-the-art causal discovery algorithms.
- Score: 4.07304559469381
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust causal discovery in time series datasets depends on reliable benchmark datasets with known ground-truth causal relationships. However, such datasets remain scarce, and existing synthetic alternatives often overlook critical temporal properties inherent in real-world data, including nonstationarity driven by trends and seasonality, irregular sampling intervals, and the presence of unobserved confounders. To address these challenges, we introduce TimeGraph, a comprehensive suite of synthetic time-series benchmark datasets that systematically incorporates both linear and nonlinear dependencies while modeling key temporal characteristics such as trends, seasonal effects, and heterogeneous noise patterns. Each dataset is accompanied by a fully specified causal graph featuring varying densities and diverse noise distributions and is provided in two versions: one including unobserved confounders and one without, thereby offering extensive coverage of real-world complexity while preserving methodological neutrality. We further demonstrate the utility of TimeGraph through systematic evaluations of state-of-the-art causal discovery algorithms including PCMCI+, LPCMCI, and FGES across a diverse array of configurations and metrics. Our experiments reveal significant variations in algorithmic performance under realistic temporal conditions, underscoring the need for robust synthetic benchmarks in the fair and transparent assessment of causal discovery methods. The complete TimeGraph suite, including dataset generation scripts, evaluation metrics, and recommended experimental protocols, is freely available to facilitate reproducible research and foster community-driven advancements in time-series causal discovery.
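The ingredients named in the abstract (lagged linear and nonlinear dependencies, trend, seasonality, heterogeneous noise, and an optional unobserved confounder) can be illustrated with a small sketch. This is not the TimeGraph generator: the function name `generate_series`, the three-variable graph, the coefficients, and the seasonal period are all hypothetical choices made only for illustration.

```python
import numpy as np


def generate_series(n_steps=500, seed=0, hide_confounder=True):
    """Simulate a toy lag-1 structural model with a known causal graph.

    Hypothetical ground truth (all links at lag 1):
        X0 -> X1 (linear), X1 -> X2 (nonlinear, tanh),
        U -> X0 and U -> X2 (U is the confounder),
        plus autoregressive self-links on X0, X1, X2.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_steps)
    trend = 0.01 * t                        # linear trend (nonstationarity)
    season = np.sin(2 * np.pi * t / 50)     # seasonal component, period 50

    u = rng.normal(size=n_steps)            # confounder, white noise
    noise = np.column_stack([
        rng.normal(size=n_steps),           # Gaussian noise for X0
        rng.laplace(size=n_steps),          # heavy-tailed noise for X1
        rng.uniform(-1, 1, size=n_steps),   # bounded noise for X2
    ])

    x = np.zeros((n_steps, 3))
    for i in range(1, n_steps):
        x[i, 0] = 0.5 * x[i - 1, 0] + 0.8 * u[i - 1] + noise[i, 0]
        x[i, 1] = 0.4 * x[i - 1, 1] + 0.6 * x[i - 1, 0] + noise[i, 1]
        x[i, 2] = (0.3 * x[i - 1, 2] + 0.5 * np.tanh(x[i - 1, 1])
                   + 0.8 * u[i - 1] + noise[i, 2])

    # Trend and seasonality are added on top of the causal dynamics.
    observed = x + (trend + season)[:, None]

    # adj[i, j] = 1 means variable i causes variable j at lag 1.
    # Variable order: X0, X1, X2, U.
    adj = np.zeros((4, 4), dtype=int)
    adj[[0, 1, 2], [0, 1, 2]] = 1           # self-links
    adj[0, 1] = 1
    adj[1, 2] = 1
    adj[3, 0] = 1
    adj[3, 2] = 1

    # The "confounded" version simply withholds the U column from the data.
    data = observed if hide_confounder else np.column_stack([observed, u])
    return data, adj
```

Calling `generate_series(hide_confounder=True)` and `generate_series(hide_confounder=False)` with the same seed yields the same observed variables with and without the confounder column, which mirrors the two-version design described above. A discovery algorithm's output can then be compared against `adj`.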
Related papers
- ReTimeCausal: EM-Augmented Additive Noise Models for Interpretable Causal Discovery in Irregular Time Series [32.21736212737614]
This paper studies causal discovery in irregularly sampled time series in high-stakes domains like finance, healthcare, and climate science. We propose ReTimeCausal, a novel integration of Additive Noise Models (ANM) and Expectation-Maximization (EM) that unifies physics-guided data imputation with sparse causal inference.
arXiv Detail & Related papers (2025-07-04T05:39:50Z)
- Flow-Based Non-stationary Temporal Regime Causal Structure Learning [49.77103348208835]
We introduce FANTOM, a unified framework for causal discovery. It handles non-stationary processes along with non-Gaussian and heteroscedastic noise. It simultaneously infers the number of regimes and their corresponding indices and learns each regime's Directed Acyclic Graph.
arXiv Detail & Related papers (2025-06-20T15:12:43Z)
- Multivariate Long-term Time Series Forecasting with Fourier Neural Filter [55.09326865401653]
We introduce FNF as the backbone and DBD as architecture to provide excellent learning capabilities and optimal learning pathways for spatial-temporal modeling. We show that FNF unifies local time-domain and global frequency-domain information processing within a single backbone that extends naturally to spatial modeling.
arXiv Detail & Related papers (2025-06-10T18:40:20Z)
- Temporal Causal-based Simulation for Realistic Time-series Generation [1.49201581313345]
Causal Discovery plays a pivotal role in revealing relationships among observed variables, particularly in the temporal setup. Generation techniques that depend on simplified assumptions about causal structure, effects, and time limit the quality and diversity of the simulated data. We introduce Temporal Causal-based Simulation (TCS), a robust framework for generating realistic time-series data and their associated temporal causal graphs.
arXiv Detail & Related papers (2025-06-02T10:59:48Z)
- Causal Discovery from Time-Series Data with Short-Term Invariance-Based Convolutional Neural Networks [12.784885649573994]
Causal discovery from time-series data aims to capture both intra-slice (contemporaneous) and inter-slice (time-lagged) causality.
We propose a novel gradient-based causal discovery approach, STIC, which focuses on Short-Term Invariance using Convolutional neural networks.
arXiv Detail & Related papers (2024-08-15T08:43:28Z)
- Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z)
- TimeGraphs: Graph-based Temporal Reasoning [64.18083371645956]
TimeGraphs is a novel approach that characterizes dynamic interactions as a hierarchical temporal graph.
Our approach models the interactions using a compact graph-based representation, enabling adaptive reasoning across diverse time scales.
We evaluate TimeGraphs on multiple datasets with complex, dynamic agent interactions, including a football simulator, the Resistance game, and the MOMA human activity dataset.
arXiv Detail & Related papers (2024-01-06T06:26:49Z)
- Causal Temporal Regime Structure Learning [49.77103348208835]
We present CASTOR, a novel method that concurrently learns the Directed Acyclic Graph (DAG) for each regime. We establish the identifiability of the regimes and DAGs within our framework. Experiments show that CASTOR consistently outperforms existing causal discovery models.
arXiv Detail & Related papers (2023-11-02T17:26:49Z)
- CausalTime: Realistically Generated Time-series for Benchmarking of Causal Discovery [14.092834149864514]
This study introduces the CausalTime pipeline to generate time-series that highly resemble the real data.
The pipeline starts from real observations in a specific scenario and produces a matching benchmark dataset.
In the experiments, we validate the fidelity of the generated data through qualitative and quantitative experiments, followed by a benchmarking of existing TSCD algorithms.
arXiv Detail & Related papers (2023-10-03T02:29:19Z)
- Fully-Connected Spatial-Temporal Graph for Multivariate Time-Series Data [50.84488941336865]
We propose a novel method called Fully-Connected Spatial-Temporal Graph Neural Network (FC-STGNN).
For graph construction, we design a decay graph to connect sensors across all timestamps based on their temporal distances.
For graph convolution, we devise FC graph convolution with a moving-pooling GNN layer to effectively capture the ST dependencies for learning effective representations.
arXiv Detail & Related papers (2023-09-11T08:44:07Z)
- Continual Release of Differentially Private Synthetic Data from Longitudinal Data Collections [19.148874215745135]
We study the problem of continually releasing differentially private synthetic data from longitudinal data collections.
We introduce a model where, in every time step, each individual reports a new data element.
We give continual synthetic data generation algorithms that preserve two basic types of queries.
arXiv Detail & Related papers (2023-06-13T16:22:08Z)
- DynImp: Dynamic Imputation for Wearable Sensing Data Through Sensory and Temporal Relatedness [78.98998551326812]
We argue that traditional methods have rarely made use of both the time-series dynamics of the data and the relatedness of features from different sensors.
We propose a model, termed DynImp, to handle missingness at different time points with nearest neighbors along the feature axis.
We show that the method can exploit the multi-modality features from related sensors and also learn from historical time-series dynamics to reconstruct the data under extreme missingness.
arXiv Detail & Related papers (2022-09-26T21:59:14Z)
- PIETS: Parallelised Irregularity Encoders for Forecasting with Heterogeneous Time-Series [5.911865723926626]
Heterogeneity and irregularity of multi-source data sets present a significant challenge to time-series analysis.
In this work, we design a novel architecture, PIETS, to model heterogeneous time-series.
We show that PIETS is able to effectively model heterogeneous temporal data and outperforms other state-of-the-art approaches in the prediction task.
arXiv Detail & Related papers (2021-09-30T20:01:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.