ConTSG-Bench: A Unified Benchmark for Conditional Time Series Generation
- URL: http://arxiv.org/abs/2603.04767v1
- Date: Thu, 05 Mar 2026 03:30:52 GMT
- Title: ConTSG-Bench: A Unified Benchmark for Conditional Time Series Generation
- Authors: Shaocheng Lan, Shuqi Gu, Zhangzhi Xiong, Kan Ren,
- Abstract summary: Conditional time series generation plays a critical role in addressing data scarcity and enabling causal analysis in real-world applications.<n>We introduce the Conditional Time Series Generation Benchmark (ConTSG-Bench)<n>ConTSG-Bench comprises a large-scale, well-aligned dataset spanning diverse conditioning modalities and levels of semantic abstraction.
- Score: 11.663484746644615
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conditional time series generation plays a critical role in addressing data scarcity and enabling causal analysis in real-world applications. Despite its increasing importance, the field lacks a standardized and systematic benchmarking framework for evaluating generative models across diverse conditions. To address this gap, we introduce the Conditional Time Series Generation Benchmark (ConTSG-Bench). ConTSG-Bench comprises a large-scale, well-aligned dataset spanning diverse conditioning modalities and levels of semantic abstraction, first enabling systematic evaluation of representative generation methods across these dimensions with a comprehensive suite of metrics for generation fidelity and condition adherence. Both the quantitative benchmarking and in-depth analyses of conditional generation behaviors have revealed the traits and limitations of the current approaches, highlighting critical challenges and promising research directions, particularly with respect to precise structural controllability and downstream task utility under complex conditions.
Related papers
- It's TIME: Towards the Next Generation of Time Series Forecasting Benchmarks [87.7937890373758]
Time series foundation models (TSFMs) are revolutionizing the forecasting landscape from specific dataset modeling to generalizable task evaluation.<n>We introduce TIME, a next-generation task-centric benchmark comprising 50 fresh datasets and 98 forecasting tasks.<n>We propose a novel pattern-level evaluation perspective that moves beyond traditional dataset-level evaluations based on static meta labels.
arXiv Detail & Related papers (2026-02-12T16:31:01Z) - Contextual and Seasonal LSTMs for Time Series Anomaly Detection [49.50689313712684]
We propose a novel prediction-based framework named Contextual and Seasonal LSTMs (CS-LSTMs)<n>CS-LSTMs are built upon a noise decomposition strategy and jointly leverage contextual dependencies and seasonal patterns.<n>They consistently outperform state-of-the-art methods, highlighting their effectiveness and practical value in robust time series anomaly detection.
arXiv Detail & Related papers (2026-02-10T11:46:15Z) - Procrustean Bed for AI-Driven Retrosynthesis: A Unified Framework for Reproducible Evaluation [0.0]
RetroCast is a unified evaluation suite that standardizes heterogeneous model outputs into a common schema.<n>We evaluate leading search-based and sequence-based algorithms on a new suite of standardized benchmarks.
arXiv Detail & Related papers (2025-12-08T01:26:39Z) - RAG-IGBench: Innovative Evaluation for RAG-based Interleaved Generation in Open-domain Question Answering [50.42577862494645]
We present RAG-IGBench, a benchmark designed to evaluate the task of Interleaved Generation based on Retrieval-Augmented Generation (RAG-IG) in open-domain question answering.<n>RAG-IG integrates multimodal large language models (MLLMs) with retrieval mechanisms, enabling the models to access external image-text information for generating coherent multimodal content.
arXiv Detail & Related papers (2025-10-11T03:06:39Z) - Instance-Aware Robust Consistency Regularization for Semi-Supervised Nuclei Instance Segmentation [53.94176748542936]
We propose an Instance-Aware Robust Consistency Regularization Network (IRCR-Net) for accurate instance-level nuclei segmentation.<n>We incorporate morphological prior knowledge of nuclei in pathological images and utilize these priors to assess the quality of pseudo-labels generated from unlabeled data.
arXiv Detail & Related papers (2025-10-10T12:32:32Z) - SVeritas: Benchmark for Robust Speaker Verification under Diverse Conditions [54.34001921326444]
Speaker verification (SV) models are increasingly integrated into security, personalization, and access control systems.<n>Existing benchmarks evaluate only subsets of these conditions, missing others entirely.<n>We introduce SVeritas, a comprehensive Speaker Verification tasks benchmark suite, assessing SV systems under stressors like recording duration, spontaneity, content, noise, microphone distance, reverberation, channel mismatches, audio bandwidth, codecs, speaker age, and susceptibility to spoofing and adversarial attacks.
arXiv Detail & Related papers (2025-09-21T14:11:16Z) - GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models [56.63218531256961]
We introduce GenBench, a benchmarking suite specifically tailored for evaluating the efficacy of Genomic Foundation Models.
GenBench offers a modular and expandable framework that encapsulates a variety of state-of-the-art methodologies.
We provide a nuanced analysis of the interplay between model architecture and dataset characteristics on task-specific performance.
arXiv Detail & Related papers (2024-06-01T08:01:05Z) - Evaluation of Retrieval-Augmented Generation: A Survey [13.633909177683462]
We provide a comprehensive overview of the evaluation and benchmarks of Retrieval-Augmented Generation (RAG) systems.
Specifically, we examine and compare several quantifiable metrics of the Retrieval and Generation components, such as relevance, accuracy, and faithfulness.
We then analyze the various datasets and metrics, discuss the limitations of current benchmarks, and suggest potential directions to advance the field of RAG benchmarks.
arXiv Detail & Related papers (2024-05-13T02:33:25Z) - Exploring Progress in Multivariate Time Series Forecasting: Comprehensive Benchmarking and Heterogeneity Analysis [70.78170766633039]
We address the need for means of assessing MTS forecasting proposals reliably and fairly.
BasicTS+ is a benchmark designed to enable fair, comprehensive, and reproducible comparison of MTS forecasting solutions.
We apply BasicTS+ along with rich datasets to assess the capabilities of more than 45 MTS forecasting solutions.
arXiv Detail & Related papers (2023-10-09T19:52:22Z) - Hyperspectral Benchmark: Bridging the Gap between HSI Applications
through Comprehensive Dataset and Pretraining [11.935879491267634]
Hyperspectral Imaging (HSI) serves as a non-destructive spatial spectroscopy technique with a multitude of potential applications.
A recurring challenge lies in the limited size of the target datasets, impeding exhaustive architecture search.
This study introduces an innovative benchmark dataset encompassing three markedly distinct HSI applications.
arXiv Detail & Related papers (2023-09-20T08:08:34Z) - Diffusion-based Conditional ECG Generation with Structured State Space
Models [2.299617836036273]
We propose SSSD-ECG for the generation of synthetic 12-lead electrocardiograms conditioned on more than 70 ECG statements.
Due to a lack of reliable baselines, we also propose conditional variants of two state-of-the-art unconditional generative models.
arXiv Detail & Related papers (2023-01-19T18:36:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.