Fugu-MT paper translation (abstract): On Adaptivity in Zeroth-Order Optimization

Paper summary: On Adaptivity in Zeroth-Order Optimization

arxiv url: http://arxiv.org/abs/2605.03869v1
Date: Tue, 05 May 2026 15:29:11 GMT
Status: Translation complete
System update date: 2026-05-06 19:35:44.00661
Title: On Adaptivity in Zeroth-Order Optimization
Title (reference translation): On Adaptivity in Zeroth-Order Optimization
Authors: Hassan Dbouk, Nidham Gazagnadou, Matthias Reisser, Christos Louizos
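Abstract summary: The paper shows that adaptive ZO methods such as ZO-Adam offer no convergence advantage over well-tuned ZO-SGD. It proposes MEAZO, a memory-efficient adaptive ZO optimizer that tracks only a single scalar for global step size adaptation.
Reference score (independently computed attention score): 16.620217856482377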
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We investigate the effectiveness of adaptive zeroth-order (ZO) optimization for memory-constrained fine-tuning of large language models (LLMs). Contrary to prior claims, we show that adaptive ZO methods such as ZO-Adam offer no convergence advantage over well-tuned ZO-SGD, while incurring significant memory overhead. Our analysis reveals that in high dimensions, ZO gradients lack coordinate-wise heterogeneity, rendering adaptive mechanisms memory inefficient. Leveraging this insight, we propose MEAZO, a memory-efficient adaptive ZO optimizer that tracks only a single scalar for global step size adaptation. We support our method with theoretical convergence guarantees under standard assumptions. Experiments across multiple LLM families and tasks demonstrate that MEAZO matches ZO-Adam's performance with the memory footprint of ZO-SGD. Additional experiments on synthetic quadratic problems and LLM fine-tuning further demonstrate MEAZO's enhanced robustness to step size choices, particularly in grouped or block-structured optimization settings.
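Abstract (reference translation): This work examines how effective adaptive zeroth-order (ZO) optimization is for memory-constrained fine-tuning of large language models (LLMs). Contrary to prior claims, adaptive ZO methods such as ZO-Adam provide no convergence advantage over well-tuned ZO-SGD, yet they incur significant memory overhead. The analysis shows that in high dimensions ZO gradients lack coordinate-wise heterogeneity, which makes adaptive mechanisms memory inefficient. Building on this insight, the authors propose MEAZO, a memory-efficient adaptive ZO optimizer that tracks only a single scalar for global step size adaptation. The method is supported by theoretical convergence guarantees under standard assumptions. Experiments across multiple LLM families and tasks show that MEAZO matches the performance of ZO-Adam with the memory footprint of ZO-SGD. Further experiments on synthetic quadratic problems and LLM fine-tuning show that MEAZO is more robust to step size choices, particularly in grouped or block-structured optimization settings.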
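
The idea of tracking a single scalar for global step size adaptation can be illustrated with a small sketch. The abstract does not give MEAZO's update rule, so the code below is not the authors' algorithm: it is a generic two-point ZO-SGD step whose step size is normalized by a running average of the squared directional-derivative estimate. The function names, the hyperparameters (`mu`, `lr`, `beta`, `eps`), and the Adam-style bias correction are all assumptions made for illustration.

```python
import numpy as np


def zo_two_point_estimate(f, x, rng, mu=1e-3):
    """Estimate the directional derivative of f at x along a random Gaussian direction.

    Returns the scalar estimate and the direction z. The implied gradient
    estimate is g_scalar * z. (Hypothetical helper, not from the paper.)
    """
    z = rng.standard_normal(x.shape)
    g_scalar = (f(x + mu * z) - f(x - mu * z)) / (2.0 * mu)
    return g_scalar, z


def zo_sgd_scalar_adaptive(f, x0, steps=500, lr=0.01, beta=0.99, eps=1e-8, seed=0):
    """ZO-SGD with one scalar of optimizer state (an assumed scheme, not MEAZO itself)."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)  # copy so the caller's array is not modified
    v = 0.0  # the only optimizer state: a single scalar, independent of dimension
    for t in range(1, steps + 1):
        g_scalar, z = zo_two_point_estimate(f, x, rng)
        # Running average of the squared scalar estimate, with bias correction.
        v = beta * v + (1.0 - beta) * g_scalar ** 2
        v_hat = v / (1.0 - beta ** t)
        # One global step size rescaling shared by every coordinate.
        x -= lr * g_scalar / (np.sqrt(v_hat) + eps) * z
    return x


if __name__ == "__main__":
    # Toy quadratic objective, chosen only for demonstration.
    quadratic = lambda x: 0.5 * float(np.sum(x ** 2))
    start = np.ones(10)
    end = zo_sgd_scalar_adaptive(quadratic, start, steps=2000, lr=0.01)
    print("initial loss:", quadratic(start), "final loss:", quadratic(end))
```

For clarity the sketch materializes the direction `z` as a full array. Per-coordinate adaptive methods such as ZO-Adam would additionally need moment buffers the size of the parameter vector, whereas the state here is the single float `v`.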
