Data Generating Process to Evaluate Causal Discovery Techniques for Time
Series Data
- URL: http://arxiv.org/abs/2104.08043v1
- Date: Fri, 16 Apr 2021 11:38:29 GMT
- Title: Data Generating Process to Evaluate Causal Discovery Techniques for Time
Series Data
- Authors: Andrew R. Lawrence, Marcus Kaiser, Rui Sampaio, Maksim Sipos
- Abstract summary: We propose a framework for developing, evaluating, and benchmarking time series causal discovery methods.
The framework can be used to fine-tune novel methods on vast amounts of data, without "overfitting" them to a benchmark.
Using our framework, we evaluate prominent time series causal discovery methods and demonstrate a notable degradation in performance when their assumptions are invalidated.
- Score: 1.5293427903448025
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Going beyond correlations, the understanding and identification of causal
relationships in observational time series, an important subfield of Causal
Discovery, poses a major challenge. The lack of access to a well-defined ground
truth for real-world data creates the need to rely on synthetic data for the
evaluation of these methods. Existing benchmarks are limited in their scope, as
they either are restricted to a "static" selection of data sets, or do not
allow for a granular assessment of the methods' performance when commonly made
assumptions are violated. We propose a flexible and simple-to-use framework for
generating time series data, which is aimed at developing, evaluating, and
benchmarking time series causal discovery methods. In particular, the framework
can be used to fine-tune novel methods on vast amounts of data, without
"overfitting" them to a benchmark, but rather so they perform well in
real-world use cases. Using our framework, we evaluate prominent time series
causal discovery methods and demonstrate a notable degradation in performance
when their assumptions are invalidated, as well as their sensitivity to the
choice of hyperparameters. Finally, we propose future research directions and
discuss how our framework can support both researchers and practitioners.
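The core idea in the abstract, generating time series from a known causal graph so that a discovery method can be scored against ground truth, can be illustrated with a minimal sketch. This is not the authors' framework: the lag-1 linear model, the function names (`generate_series`, `true_edges`, `precision_recall`), and the weight matrix `A` are all assumptions chosen for illustration.

```python
import random


def generate_series(adjacency, n_steps, noise_std=1.0, seed=0):
    """Simulate a lag-1 linear process with a known causal graph.

    Assumption for this sketch: adjacency[i][j] is the weight of the
    edge X_j(t-1) -> X_i(t); a zero entry means no causal link.
    """
    rng = random.Random(seed)
    n_vars = len(adjacency)
    # Start from pure noise, then propagate through the lagged links.
    series = [[rng.gauss(0.0, noise_std) for _ in range(n_vars)]]
    for _ in range(1, n_steps):
        prev = series[-1]
        row = [
            sum(adjacency[i][j] * prev[j] for j in range(n_vars))
            + rng.gauss(0.0, noise_std)
            for i in range(n_vars)
        ]
        series.append(row)
    return series


def true_edges(adjacency):
    """Return the ground-truth edge set {(cause, effect)} from the weights."""
    return {
        (j, i)
        for i, row in enumerate(adjacency)
        for j, weight in enumerate(row)
        if weight != 0.0
    }


def precision_recall(predicted, truth):
    """Score a predicted edge set against the ground-truth edge set."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical three-variable chain: X0 drives X1, X1 drives X2,
# and every variable depends on its own previous value.
A = [
    [0.5, 0.0, 0.0],
    [0.4, 0.5, 0.0],
    [0.0, 0.4, 0.5],
]
data = generate_series(A, n_steps=200)
truth = true_edges(A)
```

Any causal discovery method that returns a set of `(cause, effect)` pairs could then be scored with `precision_recall(predicted, truth)`. Violating an assumption, for example swapping the Gaussian noise for a heavy-tailed distribution, would only require changing the generator, which is the kind of controlled stress test the abstract describes.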
Related papers
- Beyond Data Scarcity: A Frequency-Driven Framework for Zero-Shot Forecasting [15.431513584239047]
Time series forecasting is critical in numerous real-world applications.
Traditional forecasting techniques struggle when data is scarce or not available at all.
Recent advancements often leverage large-scale foundation models for such tasks.
arXiv Detail & Related papers (2024-11-24T07:44:39Z)
- Recurrent Neural Goodness-of-Fit Test for Time Series [8.22915954499148]
Time series data are crucial across diverse domains such as finance and healthcare.
Traditional evaluation metrics fall short due to the temporal dependencies and potential high dimensionality of the features.
We propose the REcurrent NeurAL (RENAL) Goodness-of-Fit test, a novel and statistically rigorous framework for evaluating generative time series models.
arXiv Detail & Related papers (2024-10-17T19:32:25Z)
- PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
Motivated by increasing privacy concerns, we propose a Parameter-efficient Federated Anomaly Detection framework named PeFAD.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
- On the Identification of Temporally Causal Representation with Instantaneous Dependence [50.14432597910128]
Temporally causal representation learning aims to identify the latent causal process from time series observations.
Most methods require the assumption that the latent causal processes do not have instantaneous relations.
We propose an IDentification framework for instantaneOus Latent dynamics.
arXiv Detail & Related papers (2024-05-24T08:08:05Z)
- DAGnosis: Localized Identification of Data Inconsistencies using Structures [73.39285449012255]
Identification and appropriate handling of inconsistencies in data at deployment time is crucial to reliably use machine learning models.
We use directed acyclic graphs (DAGs) to encode the probability distribution and independencies of the training set's features as a structure.
Our method, called DAGnosis, leverages these structural interactions to bring valuable and insightful data-centric conclusions.
arXiv Detail & Related papers (2024-02-26T11:29:16Z)
- Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and an anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z)
- Assumption violations in causal discovery and the robustness of score matching [38.60630271550033]
This paper extensively benchmarks the empirical performance of recent causal discovery methods on observational i.i.d. data.
We show that score matching-based methods demonstrate surprising performance in the false positive and false negative rate of the inferred graph.
We hope this paper will set a new standard for the evaluation of causal discovery methods.
arXiv Detail & Related papers (2023-10-20T09:56:07Z)
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- CausalTime: Realistically Generated Time-series for Benchmarking of Causal Discovery [14.092834149864514]
This study introduces the CausalTime pipeline to generate time-series that highly resemble the real data.
The pipeline starts from real observations in a specific scenario and produces a matching benchmark dataset.
In the experiments, we validate the fidelity of the generated data through qualitative and quantitative experiments, followed by a benchmarking of existing TSCD algorithms.
arXiv Detail & Related papers (2023-10-03T02:29:19Z)
- Time Series Data Imputation: A Survey on Deep Learning Approaches [4.4458738910060775]
Time series data imputation is a well-studied problem with different categories of methods.
Time series methods based on deep learning have made progress with the usage of models like RNN.
We will review and discuss their model architectures, their pros and cons as well as their effects to show the development of the time series imputation methods.
arXiv Detail & Related papers (2020-11-23T11:57:27Z)
- TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)