Fugu-MT Paper Translation (Abstract): Overcoming the Modality Gap in Context-Aided Forecasting

Paper overview: Overcoming the Modality Gap in Context-Aided Forecasting

arxiv url: http://arxiv.org/abs/2603.12451v2
Date: Mon, 16 Mar 2026 19:46:16 GMT
Status: Translation complete
Last updated in system: 2026-03-18 13:19:43.843845
Title: Overcoming the Modality Gap in Context-Aided Forecasting
Title (reference translation): Overcoming the Modality Gap in Context-Aided Forecasting
Authors: Vincent Zhihao Zheng, Étienne Marcotte, Arjun Ashok, Andrew Robert Williams, Lijun Sun, Alexandre Drouin, Valentina Zantedeschi
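Abstract summary: This paper proposes a semi-synthetic data augmentation method that generates contexts that are both descriptive of temporal dynamics and complementary to numerical histories. The approach enables large-scale dataset creation, yielding CAF-7M, a corpus of 7 million context-augmented time series windows.
Reference score (site-computed attention score): 54.976964834365056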
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Context-aided forecasting (CAF) holds promise for integrating domain knowledge and forward-looking information, enabling AI systems to surpass traditional statistical methods. However, recent empirical studies reveal a puzzling gap: multimodal models often fail to outperform their unimodal counterparts. We hypothesize that this underperformance stems from poor context quality in existing datasets, as verification is challenging. To address these limitations, we introduce a semi-synthetic data augmentation method that generates contexts both descriptive of temporal dynamics and verifiably complementary to numerical histories. This approach enables massive-scale dataset creation, resulting in CAF-7M, a corpus of 7 million context-augmented time series windows, including a rigorously verified test set. We demonstrate that semi-synthetic pre-training transfers effectively to real-world evaluation, and show clear evidence of context utilization. Our results suggest that dataset quality, rather than architectural limitations, has been the primary bottleneck in context-aided forecasting.
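Abstract (reference translation): Context-aided forecasting (CAF) promises to integrate domain knowledge and forward-looking information, allowing AI systems to surpass traditional statistical methods. Recent empirical studies, however, reveal a puzzling gap: multimodal models often fail to outperform their unimodal counterparts. The authors hypothesize that this underperformance stems from the low quality of context in existing datasets, since such context is difficult to verify. To address these limitations, they propose a semi-synthetic data augmentation method that generates contexts that both describe temporal dynamics and are verifiably complementary to the numerical histories. The approach enables large-scale dataset creation, yielding CAF-7M, a corpus of 7 million context-augmented time series windows that includes a rigorously verified test set. The paper shows that semi-synthetic pre-training transfers effectively to real-world evaluation and presents clear evidence that the context is used. The results suggest that dataset quality, rather than architectural limitations, has been the primary bottleneck in context-aided forecasting.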
