Fugu-MT Paper Translation (Abstract): Position: Zeroth-Order Optimization in Deep Learning Is Underexplored, Not Underpowered

Paper abstract: Position: Zeroth-Order Optimization in Deep Learning Is Underexplored, Not Underpowered

arxiv url: http://arxiv.org/abs/2605.15622v2
Date: Mon, 18 May 2026 07:21:10 GMT
Status: Translation complete
Last updated in system: 2026-05-19 17:57:46.106864
Title: Position: Zeroth-Order Optimization in Deep Learning Is Underexplored, Not Underpowered
Title (reference translation): Position: Zeroth-Order Optimization in Deep Learning Is Underexplored, Not Underpowered
Authors: Sijia Liu, Yicheng Lang, Soumyadeep Pal, Changsheng Wang, Yancheng Huang, Chongyu Fan, James Diffenderfer, Bhavya Kailkhura, Yihua Zhang
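Abstract summary: Zeroth-order (ZO) optimization, which learns from finite differences of function evaluations without backpropagation, has recently attracted attention in deep learning. However, ZO methods are often dismissed as fundamentally unscalable because of estimator variance and unfavorable query complexity. Many perceived limitations are shown to stem from myopic development practices, most notably full-space, element-wise, estimator-centric designs.
Reference score (independently computed attention measure): 33.86334668191583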
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Zeroth-order (ZO) optimization, learning from finite differences of function evaluations without backpropagation, has recently regained attention in deep learning due to its memory efficiency and applicability to gray- or black-box pipelines. Yet, ZO methods are often dismissed as fundamentally unscalable because of estimator variance and unfavorable query complexity. We argue that this conclusion might be misguided: ZO optimization is underexplored, not underpowered. We show that many perceived limitations stem from myopic development practices, most notably full-space, element-wise, estimator-centric designs. We articulate six positions spanning the algorithmic, systems, and evaluation stack. First, we revisit the feasibility boundaries of estimator-centric ZO methods through variance control, variance-query tradeoffs, and directional-derivative lenses. Then, we identify three underexplored opportunities: (i) subspace and spectral views of ZO that enable interpretable variance reduction with graceful query scaling, (ii) the forward-only nature of ZO as a systems advantage for communication-efficient, pipeline-friendly, and resource-constrained training, and (iii) the need to de-obfuscate ZO evaluations from task complexity. We strongly advocate rethinking ZO optimization around its unique strengths and acting accordingly, opening a viable path toward large-scale, system-aware, and resource-efficient learning with ZO optimization.
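Abstract (reference translation): Zeroth-order (ZO) optimization, which learns from finite differences of function evaluations without backpropagation, has recently drawn renewed attention in deep learning because of its memory efficiency and its applicability to gray-box and black-box pipelines. However, ZO methods are often dismissed as fundamentally unscalable because of estimator variance and unfavorable query complexity. The authors argue that this conclusion might be misguided: ZO optimization is underexplored, not underpowered. Many perceived limitations are shown to stem from myopic development practices, most notably full-space, element-wise, estimator-centric designs. Six positions are articulated, spanning the algorithmic, systems, and evaluation stack. First, the feasibility boundaries of estimator-centric ZO methods are revisited through the lenses of variance control, variance-query tradeoffs, and directional derivatives. Then, three underexplored opportunities are identified: (i) subspace and spectral views of ZO that enable interpretable variance reduction with graceful query scaling, (ii) the forward-only nature of ZO as a systems advantage for communication-efficient, pipeline-friendly, and resource-constrained training, and (iii) the need to disentangle ZO evaluations from task complexity. The authors strongly advocate rethinking ZO optimization around its unique strengths and acting accordingly, opening a viable path toward large-scale, system-aware, and resource-efficient learning with ZO optimization.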
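
Illustrative note: the abstract describes ZO optimization as learning from finite differences of function evaluations without backpropagation. The sketch below shows one common way this idea is realized, a randomized two-point (central-difference) gradient estimate plugged into plain gradient descent. It is a generic textbook-style estimator that perturbs all coordinates at once, not a method taken from the paper. The names `zo_gradient`, `zo_sgd`, `quadratic`, `mu`, and `num_queries`, along with all hyperparameter values, are assumptions chosen for illustration.

```python
import random


def zo_gradient(f, x, mu=1e-4, num_queries=20, rng=None):
    """Randomized two-point finite-difference estimate of the gradient of f at x.

    Only function evaluations of f are used; no backpropagation is required.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    if rng is None:
        rng = random.Random(0)
    d = len(x)
    grad = [0.0] * d
    for _ in range(num_queries):
        # Draw a random Gaussian direction covering every coordinate.
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        # Central difference approximates the directional derivative along u.
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
        # Average slope * u over the queries to form the gradient estimate.
        for i in range(d):
            grad[i] += slope * u[i] / num_queries
    return grad


def zo_sgd(f, x0, lr=0.05, steps=300, seed=0, **estimator_kwargs):
    """Gradient descent driven by the ZO estimate instead of a true gradient."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g = zo_gradient(f, x, rng=rng, **estimator_kwargs)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def quadratic(x):
    """Toy objective with its minimum at the all-ones vector."""
    return sum((xi - 1.0) ** 2 for xi in x)


x_final = zo_sgd(quadratic, [0.0] * 5)
print("final loss:", quadratic(x_final))
```

Each estimate costs `2 * num_queries` function evaluations, and for this kind of estimator the variance grows with the problem dimension. This is the estimator-variance and query-complexity tension the abstract refers to.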

Related papers