Fugu-MT Paper Translation (Abstract): There Will Be a Scientific Theory of Deep Learning

Paper summary: There Will Be a Scientific Theory of Deep Learning

arxiv url: http://arxiv.org/abs/2604.21691v1
Date: Thu, 23 Apr 2026 13:58:12 GMT
Status: Translation complete
Last updated in system: 2026-04-24 14:40:06.573129
Title: There Will Be a Scientific Theory of Deep Learning
Title (reference translation): A Scientific Theory of Deep Learning
Authors: Jamie Simon, Daniel Kunin, Alexander Atanasov, Enric Boix-Adserà, Blake Bordelon, Jeremy Cohen, Nikhil Ghosh, Florentin Guth, Arthur Jacot, Mason Kamb, Dhruva Karkada, Eric J. Michaud, Berkan Ottlik, Joseph Turnbull,
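Abstract summary: We argue that a scientific theory of deep learning is emerging. We identify five growing bodies of work that point toward such a theory. We argue that the emerging theory is best thought of as a mechanics of the learning process.
Reference score (independently computed attention metric): 62.690977616250954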
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we make the case that a scientific theory of deep learning is emerging. By this we mean a theory which characterizes important properties and statistics of the training process, hidden representations, final weights, and performance of neural networks. We pull together major strands of ongoing research in deep learning theory and identify five growing bodies of work that point toward such a theory: (a) solvable idealized settings that provide intuition for learning dynamics in realistic systems; (b) tractable limits that reveal insights into fundamental learning phenomena; (c) simple mathematical laws that capture important macroscopic observables; (d) theories of hyperparameters that disentangle them from the rest of the training process, leaving simpler systems behind; and (e) universal behaviors shared across systems and settings which clarify which phenomena call for explanation. Taken together, these bodies of work share certain broad traits: they are concerned with the dynamics of the training process; they primarily seek to describe coarse aggregate statistics; and they emphasize falsifiable quantitative predictions. We argue that the emerging theory is best thought of as a mechanics of the learning process, and suggest the name learning mechanics. We discuss the relationship between this mechanics perspective and other approaches for building a theory of deep learning, including the statistical and information-theoretic perspectives. In particular, we anticipate a symbiotic relationship between learning mechanics and mechanistic interpretability. We also review and address common arguments that fundamental theory will not be possible or is not important. We conclude with a portrait of important open directions in learning mechanics and advice for beginners. We host further introductory materials, perspectives, and open questions at learningmechanics.pub.
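Abstract (reference translation): In this paper, we argue that a scientific theory of deep learning is emerging. By this we mean a theory that characterizes important properties and statistics of the training process, hidden representations, final weights, and performance of neural networks. We bring together the major strands of ongoing research in deep learning theory and identify five growing bodies of work that point toward such a theory: (a) solvable idealized settings that provide intuition for learning dynamics in realistic systems; (b) tractable limits that reveal insights into fundamental learning phenomena; (c) simple mathematical laws that capture important macroscopic observables; (d) theories of hyperparameters that disentangle them from the rest of the training process, leaving simpler systems behind; and (e) universal behaviors shared across systems and settings, which clarify which phenomena call for explanation. These bodies of work are concerned with the dynamics of the training process, primarily aim to describe coarse aggregate statistics, and emphasize falsifiable quantitative predictions. We argue that the emerging theory is best thought of as a mechanics of the learning process, and we suggest the name learning mechanics. We discuss the relationship between this mechanics perspective and other approaches to building a theory of deep learning, including the statistical and information-theoretic perspectives. In particular, we anticipate a symbiotic relationship between learning mechanics and mechanistic interpretability. We also review and address common arguments that fundamental theory is impossible or unimportant. We conclude with a portrait of important open directions in learning mechanics and advice for beginners. We host further introductory materials, perspectives, and open questions at learningmechanics.pub.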

Related papers