TSGBench: Time Series Generation Benchmark
- URL: http://arxiv.org/abs/2309.03755v2
- Date: Thu, 7 Dec 2023 13:42:53 GMT
- Title: TSGBench: Time Series Generation Benchmark
- Authors: Yihao Ang, Qiang Huang, Yifan Bao, Anthony K. H. Tung, Zhiyong Huang
- Abstract summary: TSGBench provides a unified and comprehensive assessment of Synthetic Time Series Generation (TSG) methods.
It comprises three modules: (1) a curated collection of publicly available, real-world datasets tailored for TSG, together with a standardized preprocessing pipeline; (2) a comprehensive suite of evaluation measures, including vanilla measures, new distance-based assessments, and visualization tools; and (3) a pioneering generalization test rooted in Domain Adaptation (DA).
We have conducted experiments using TSGBench across ten real-world datasets from diverse domains, utilizing ten advanced TSG methods and twelve evaluation measures.
- Score: 11.199605025284185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Synthetic Time Series Generation (TSG) is crucial in a range of applications,
including data augmentation, anomaly detection, and privacy preservation.
Although significant strides have been made in this field, existing methods
exhibit three key limitations: (1) They often benchmark against similar model
types, constraining a holistic view of performance capabilities. (2) The use of
specialized synthetic and private datasets introduces biases and hampers
generalizability. (3) Ambiguous evaluation measures, often tied to custom
networks or downstream tasks, hinder consistent and fair comparison.
To overcome these limitations, we introduce \textsf{TSGBench}, the inaugural
Time Series Generation Benchmark, designed for a unified and comprehensive
assessment of TSG methods. It comprises three modules: (1) a curated collection
of publicly available, real-world datasets tailored for TSG, together with a
standardized preprocessing pipeline; (2) a comprehensive evaluation measures
suite including vanilla measures, new distance-based assessments, and
visualization tools; (3) a pioneering generalization test rooted in Domain
Adaptation (DA), compatible with all methods. We have conducted comprehensive
experiments using \textsf{TSGBench} across a spectrum of ten real-world
datasets from diverse domains, utilizing ten advanced TSG methods and twelve
evaluation measures. The results highlight the reliability and efficacy of
\textsf{TSGBench} in evaluating TSG methods. Crucially, \textsf{TSGBench}
delivers a statistical analysis of the performance rankings of these methods,
illuminating their varying performance across different datasets and measures
and offering nuanced insights into the effectiveness of each method.
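To make the idea of a distance-based evaluation measure concrete, the sketch below compares the marginal value distributions of a real and a synthetic collection of time series via an empirical 1-D Wasserstein distance. This is an illustrative sketch only: the function name and interface are hypothetical and do not represent TSGBench's actual API or its specific measures.

```python
# Hypothetical sketch of a distance-based evaluation measure in the spirit of
# TSGBench's measures suite: the empirical 1-D Wasserstein distance between
# the pooled (marginal) value distributions of real and synthetic series.
# Lower is better; 0 means the marginal distributions match exactly.

def marginal_wasserstein(real, synth):
    """Empirical 1-D Wasserstein distance between the pooled values of two
    collections of time series (assumes equal total sample counts)."""
    a = sorted(v for series in real for v in series)
    b = sorted(v for series in synth for v in series)
    assert len(a) == len(b), "resample to equal sizes first"
    # With equal counts, the distance is the mean absolute difference of
    # the sorted samples (matched empirical quantiles).
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

real = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0]]
print(marginal_wasserstein(real, real))  # identical data -> 0.0
shifted = [[v + 0.5 for v in s] for s in real]
print(marginal_wasserstein(real, shifted))  # constant shift of 0.5 -> 0.5
```

A measure like this is model-agnostic, which is what allows a benchmark to compare generators of very different architectures on equal footing; in practice, marginal distances are complemented by temporal measures, since two generators can match marginals while differing in autocorrelation structure.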
Related papers
- GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models [56.63218531256961]
We introduce GenBench, a benchmarking suite specifically tailored for evaluating the efficacy of Genomic Foundation Models.
GenBench offers a modular and expandable framework that encapsulates a variety of state-of-the-art methodologies.
We provide a nuanced analysis of the interplay between model architecture and dataset characteristics on task-specific performance.
arXiv: 2024-06-01
- TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods [27.473935782550388]
Time series are generated in diverse domains such as economics, traffic, health, and energy.
We propose TFB, an automated benchmark for Time Series Forecasting (TSF) methods.
arXiv: 2024-03-29
- Test-Time Domain Generalization for Face Anti-Spoofing [60.94384914275116]
Face Anti-Spoofing (FAS) is pivotal in safeguarding facial recognition systems against presentation attacks.
We introduce a novel Test-Time Domain Generalization framework for FAS, which leverages the testing data to boost the model's generalizability.
Our method, consisting of Test-Time Style Projection (TTSP) and Diverse Style Shifts Simulation (DSSS), effectively projects the unseen data to the seen domain space.
arXiv Detail & Related papers (2024-03-28T11:50:23Z) - Hyperspectral Benchmark: Bridging the Gap between HSI Applications
through Comprehensive Dataset and Pretraining [11.935879491267634]
Hyperspectral Imaging (HSI) serves as a non-destructive spatial spectroscopy technique with a multitude of potential applications.
A recurring challenge lies in the limited size of the target datasets, impeding exhaustive architecture search.
This study introduces an innovative benchmark dataset encompassing three markedly distinct HSI applications.
arXiv Detail & Related papers (2023-09-20T08:08:34Z) - Benchmarking Test-Time Adaptation against Distribution Shifts in Image
Classification [77.0114672086012]
Test-time adaptation (TTA) is a technique aimed at enhancing the generalization performance of models by leveraging unlabeled samples solely during prediction.
We present a benchmark that systematically evaluates 13 prominent TTA methods and their variants on five widely used image classification datasets.
arXiv Detail & Related papers (2023-07-06T16:59:53Z) - DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection [55.70982767084996]
A critical yet frequently overlooked challenge in the field of deepfake detection is the lack of a standardized, unified, comprehensive benchmark.
We present the first comprehensive benchmark for deepfake detection, called DeepfakeBench, which offers three key contributions.
DeepfakeBench contains 15 state-of-the-art detection methods, 9 deepfake datasets, a series of deepfake detection evaluation protocols and analysis tools, as well as comprehensive evaluations.
arXiv Detail & Related papers (2023-07-04T01:34:41Z) - TSI-GAN: Unsupervised Time Series Anomaly Detection using Convolutional
Cycle-Consistent Generative Adversarial Networks [2.4469484645516837]
Anomaly detection is widely used in network intrusion detection, autonomous driving, medical diagnosis, credit card fraud, etc.
This paper proposes TSI-GAN, an unsupervised anomaly detection model for time-series that can learn complex temporal patterns automatically.
We evaluate TSI-GAN using 250 well-curated and harder-than-usual datasets and compare with 8 state-of-the-art baseline methods.
arXiv Detail & Related papers (2023-03-22T23:24:47Z) - News Summarization and Evaluation in the Era of GPT-3 [73.48220043216087]
We study how GPT-3 compares against fine-tuned models trained on large summarization datasets.
We show that not only do humans overwhelmingly prefer GPT-3 summaries, prompted using only a task description, but these also do not suffer from common dataset-specific issues such as poor factuality.
arXiv Detail & Related papers (2022-09-26T01:04:52Z) - TRUE: Re-evaluating Factual Consistency Evaluation [29.888885917330327]
We introduce TRUE: a comprehensive study of factual consistency metrics on a standardized collection of existing texts from diverse tasks.
Our standardization enables an example-level meta-evaluation protocol that is more actionable and interpretable than previously reported correlations.
Across diverse state-of-the-art metrics and 11 datasets we find that large-scale NLI and question generation-and-answering-based approaches achieve strong and complementary results.
arXiv Detail & Related papers (2022-04-11T10:14:35Z) - WRENCH: A Comprehensive Benchmark for Weak Supervision [66.82046201714766]
The benchmark consists of 22 varied real-world datasets for classification and sequence tagging.
We use the benchmark to conduct extensive comparisons over more than 100 method variants to demonstrate its efficacy as a benchmark platform.
arXiv: 2021-09-23
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.