Fugu-MT Paper Translation (Abstract): Why Do Transformers Fail to Forecast Time Series In-Context?

Paper overview: Why Do Transformers Fail to Forecast Time Series In-Context?

arxiv url: http://arxiv.org/abs/2510.09776v1
Date: Fri, 10 Oct 2025 18:34:19 GMT
Status: Translation complete
Last updated in system: 2025-10-14 18:06:29.620202
Title: Why Do Transformers Fail to Forecast Time Series In-Context?
Title (reference translation): Why Did Transformers Fail at In-Context Time Series Forecasting?
Authors: Yufa Zhou, Yixiao Wang, Surbhi Goel, Anru R. Zhang
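Abstract summary: Time series forecasting (TSF) is a challenging and largely unsolved problem in machine learning. Empirical evidence consistently shows that even powerful Transformers often fail to outperform simpler models. The paper provides a theoretical analysis of Transformers' limitations for TSF through the lens of In-Context Learning (ICL) theory.
Reference score (independently computed attention metric): 21.43699354236011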
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Time series forecasting (TSF) remains a challenging and largely unsolved problem in machine learning, despite significant recent efforts leveraging Large Language Models (LLMs), which predominantly rely on Transformer architectures. Empirical evidence consistently shows that even powerful Transformers often fail to outperform much simpler models, e.g., linear models, on TSF tasks; however, a rigorous theoretical understanding of this phenomenon remains limited. In this paper, we provide a theoretical analysis of Transformers' limitations for TSF through the lens of In-Context Learning (ICL) theory. Specifically, under AR($p$) data, we establish that: (1) Linear Self-Attention (LSA) models $\textit{cannot}$ achieve lower expected MSE than classical linear models for in-context forecasting; (2) as the context length approaches infinity, LSA asymptotically recovers the optimal linear predictor; and (3) under Chain-of-Thought (CoT) style inference, predictions collapse to the mean exponentially. We empirically validate these findings through carefully designed experiments. Our theory not only sheds light on several previously underexplored phenomena but also offers practical insights for designing more effective forecasting architectures. We hope our work encourages the broader research community to revisit the fundamental theoretical limitations of TSF and to critically evaluate the direct application of increasingly sophisticated architectures without deeper scrutiny.
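Abstract (reference translation): Time series forecasting (TSF) remains a difficult and largely unsolved problem in machine learning, despite major recent efforts built on Large Language Models (LLMs), which rely heavily on the Transformer architecture. Empirical evidence shows that even powerful Transformers often fail to outperform much simpler models, such as linear models, on TSF tasks, yet a rigorous theoretical understanding of this phenomenon remains limited. This paper presents a theoretical analysis of Transformers' limitations for TSF through the lens of In-Context Learning (ICL) theory. Specifically, under AR($p$) data: (1) Linear Self-Attention (LSA) models cannot achieve a lower expected MSE than classical linear models for in-context forecasting; (2) as the context length approaches infinity, LSA asymptotically recovers the optimal linear predictor; and (3) under Chain-of-Thought (CoT) style inference, predictions collapse to the mean exponentially. These findings are validated empirically through carefully designed experiments. The theory not only sheds light on several previously underexplored phenomena but also offers practical insights for designing more effective forecasting architectures. The authors hope the work encourages the broader research community to revisit the fundamental theoretical limitations of TSF and to critically evaluate the direct application of increasingly sophisticated architectures without deeper scrutiny.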
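As a rough illustration of the AR($p$) setting the abstract refers to, the sketch below simulates a zero-mean AR(2) series, fits a classical linear predictor by least squares, and then forecasts iteratively by feeding each prediction back in as input. It is not the paper's experiment and does not implement Linear Self-Attention; it only shows the well-known behavior of a stable linear AR predictor, whose noise-free iterated forecasts shrink geometrically toward the process mean (zero here). The function names (`simulate_ar`, `fit_ar2`, `rollout`), the coefficients 0.5 and 0.3, the series length, and the seed are all assumptions chosen for illustration.

```python
import random


def simulate_ar(coeffs, n, noise_std=1.0, seed=0):
    """Simulate a zero-mean AR(p) series: x_t = sum_i coeffs[i] * x_{t-1-i} + noise."""
    rng = random.Random(seed)
    p = len(coeffs)
    x = [0.0] * p  # zero initial conditions (an illustrative choice)
    for _ in range(n):
        nxt = sum(c * x[-1 - i] for i, c in enumerate(coeffs))
        x.append(nxt + rng.gauss(0.0, noise_std))
    return x[p:]


def fit_ar2(x):
    """Least-squares fit of AR(2) coefficients via the 2x2 normal equations."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(2, len(x)):
        u, v, y = x[t - 1], x[t - 2], x[t]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        b1 += u * y
        b2 += v * y
    det = s11 * s22 - s12 * s12
    return [(b1 * s22 - b2 * s12) / det, (b2 * s11 - b1 * s12) / det]


def rollout(coeffs, history, steps):
    """Iterated forecasting: each prediction is appended and reused as input."""
    buf = list(history[-len(coeffs):])
    preds = []
    for _ in range(steps):
        nxt = sum(c * buf[-1 - i] for i, c in enumerate(coeffs))
        preds.append(nxt)
        buf.append(nxt)
    return preds


true_coeffs = [0.5, 0.3]  # assumed stable AR(2) coefficients
series = simulate_ar(true_coeffs, 5000)
est = fit_ar2(series)
preds = rollout(est, series, 50)

print("estimated coefficients:", est)
print("first forecasts:", preds[:3])
print("last forecast:", preds[-1])
```

With stable coefficients, the forecasts in `preds` move toward zero as the horizon grows, because every step multiplies the remaining signal by factors smaller than one in magnitude and no new information enters the rollout.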

Related papers