Fugu-MT Paper Translation (Abstract): Does Synthetic Data Help? Empirical Evidence from Deep Learning Time Series Forecasters

Paper overview: Does Synthetic Data Help? Empirical Evidence from Deep Learning Time Series Forecasters

arxiv url: http://arxiv.org/abs/2605.06032v1
Date: Thu, 07 May 2026 11:22:45 GMT
Status: Translation complete
Last updated in system: 2026-05-08 22:27:11.719966
Title: Does Synthetic Data Help? Empirical Evidence from Deep Learning Time Series Forecasters
Authors: Hugo Cazaux, Eyjólfur Ingi Ásgeirsson, Hlynur Stefánsson
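Abstract summary: Synthetic data has transformed language model training, yet its role in time series forecasting remains poorly understood. This study presents a large-scale empirical evaluation of synthetic time series augmentation across five architectures, four synthetic signals, and seven datasets. Channel-mixing models (TimesNet, iTransformer) benefit in the majority of trials, while channel-independent models are consistently degraded.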
Reference score (independently computed attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Synthetic data has transformed language model training, yet its role in time series forecasting remains poorly understood. We present a large-scale empirical study: nine experiment groups, 4,218 runs systematically evaluating synthetic time series augmentation across five architectures, four synthetic signals and seven datasets. The effect is sharply architecture-conditional: channel-mixing models (TimesNet, iTransformer) benefit in the majority of trials, while channel-independent models (DLinear, PatchTST) are consistently degraded. In selected low-resource settings the gains are striking: TimesNet trained on only 10% of Weather data with synthetic augmentation surpasses the full-data baseline (4 of 16 sparsity-dataset combinations). Averaged across all architectures, augmentation hurts in 67% of trials. We further find that only the Seasonal-Trend generator reliably helps across the tested benchmarks, and that hard curriculum switching is actively harmful (+24% MSE degradation). These results provide concrete, actionable guidelines on how to use synthetic data: use synthetic augmentation with channel-mixing architectures, use gradual annealing schedules, and treat low-resource augmentation as architecture- and dataset-dependent. Code is available at https://github.com/hugoiscracked/synthetic-ts/tree/main
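The abstract names a Seasonal-Trend generator and contrasts gradual annealing with hard curriculum switching, but it does not specify either one. The following is a minimal sketch of what such components might look like, not the authors' implementation: the sinusoid-plus-linear-trend form, the linear decay of the synthetic share, and every function name and default value here are assumptions made for illustration.

```python
import math
import random


def seasonal_trend_series(length, period=24, slope=0.01, amplitude=1.0,
                          noise_std=0.1, seed=0):
    """Hypothetical generator: linear trend + sinusoidal season + Gaussian noise."""
    rng = random.Random(seed)
    return [
        slope * t
        + amplitude * math.sin(2 * math.pi * t / period)
        + rng.gauss(0.0, noise_std)
        for t in range(length)
    ]


def annealed_synthetic_fraction(epoch, total_epochs, start=0.5, end=0.0):
    """Gradual schedule (assumed linear): decay the synthetic share from start to end."""
    if total_epochs <= 1:
        return end
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + (end - start) * progress


def hard_switch_synthetic_fraction(epoch, switch_epoch, start=0.5, end=0.0):
    """Hard curriculum switch: jump from start to end at switch_epoch."""
    return start if epoch < switch_epoch else end


def mix_batch(real_windows, synthetic_windows, synthetic_fraction, batch_size, rng):
    """Sample a batch in which about synthetic_fraction of the windows are synthetic."""
    n_syn = round(batch_size * synthetic_fraction)
    batch = rng.choices(synthetic_windows, k=n_syn) if n_syn else []
    batch += rng.choices(real_windows, k=batch_size - n_syn)
    rng.shuffle(batch)
    return batch
```

In a training loop, one would call `annealed_synthetic_fraction(epoch, total_epochs)` once per epoch and pass the result to `mix_batch`. Swapping in `hard_switch_synthetic_fraction` gives the abrupt variant that the abstract reports as harmful.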
