Fugu-MT Paper Translation (Summary): Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient

Paper summary: Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient

arxiv url: http://arxiv.org/abs/2405.17761v1
Date: Tue, 28 May 2024 02:27:53 GMT
Status: Translation complete
Last updated in system: 2024-05-29 22:41:57.558908
Title: Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
Authors: Hao Di, Haishan Ye, Yueling Zhang, Xiangyu Chang, Guang Dai, Ivor W. Tsang
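Abstract summary: Variance reduction techniques are designed to reduce sampling variance and to improve the convergence rates of first-order (FO) and zeroth-order (ZO) methods. In composite optimization problems, ZO methods encounter an additional variance, called the coordinate-wise variance, which stems from random gradient estimation. This paper proposes the Zeroth-order Proximal Double Variance Reduction (ZPDVR) method.
Reference score (attention score computed independently by this site): 40.22217106270146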
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Variance reduction techniques are designed to decrease the sampling variance, thereby accelerating convergence rates of first-order (FO) and zeroth-order (ZO) optimization methods. However, in composite optimization problems, ZO methods encounter an additional variance called the coordinate-wise variance, which stems from the random gradient estimation. To reduce this variance, prior works require estimating all partial derivatives, essentially approximating FO information. This approach demands O(d) function evaluations (d is the dimension size), which incurs substantial computational costs and is prohibitive in high-dimensional scenarios. This paper proposes the Zeroth-order Proximal Double Variance Reduction (ZPDVR) method, which utilizes the averaging trick to reduce both sampling and coordinate-wise variances. Compared to prior methods, ZPDVR relies solely on random gradient estimates, calls the stochastic zeroth-order oracle (SZO) in expectation $\mathcal{O}(1)$ times per iteration, and achieves the optimal $\mathcal{O}(d(n + \kappa)\log (\frac{1}{\epsilon}))$ SZO query complexity in the strongly convex and smooth setting, where $\kappa$ represents the condition number and $\epsilon$ is the desired accuracy. Empirical results validate ZPDVR's linear convergence and demonstrate its superior performance over other related methods.
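To make the cost contrast in the abstract concrete, the following is a minimal sketch of the two kinds of zeroth-order gradient estimate it mentions: one that estimates every partial derivative (on the order of d function evaluations) and one that probes a single random direction (a constant number of evaluations). This is not the ZPDVR algorithm. The function names, the Gaussian direction, the smoothing radius `mu`, and the toy quadratic objective are all assumptions chosen for illustration.

```python
import random


def coordinate_estimate(f, x, mu=1e-6):
    """Central-difference estimate of every partial derivative (2*d calls to f)."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += mu
        x_minus[i] -= mu
        grad.append((f(x_plus) - f(x_minus)) / (2 * mu))
    return grad


def random_direction_estimate(f, x, rng, mu=1e-6):
    """Two-point estimate along one Gaussian direction (2 calls to f for any d)."""
    u = [rng.gauss(0.0, 1.0) for _ in x]
    x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
    x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
    slope = (f(x_plus) - f(x_minus)) / (2 * mu)  # directional derivative along u
    return [slope * ui for ui in u]


class CountingOracle:
    """Wraps a function and counts how many times it is evaluated."""

    def __init__(self, f):
        self.f = f
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return self.f(x)


def quadratic(x):
    """Toy objective 0.5 * ||x||^2, whose true gradient is x itself."""
    return 0.5 * sum(xi * xi for xi in x)


if __name__ == "__main__":
    x = [1.0, -2.0, 3.0]
    oracle = CountingOracle(quadratic)
    print(coordinate_estimate(oracle, x), "calls:", oracle.calls)
    oracle = CountingOracle(quadratic)
    print(random_direction_estimate(oracle, x, random.Random(0)), "calls:", oracle.calls)
```

A single random-direction estimate is noisy: it only matches the true gradient on average over many draws. That extra noise is one way to picture the coordinate-wise variance the abstract refers to, which the paper aims to reduce without falling back to the coordinate-by-coordinate estimate.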

Related papers

VAMO: Efficient Large-Scale Nonconvex Optimization via Adaptive Zeroth Order Variance Reduction [3.130722489512822]
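VAMO combines FO mini-batch gradients with ZO finite-difference probes under a ZOG-style framework. VAMO outperforms both FO and ZO methods, offering a faster and more flexible option for improving efficiency.
Paper reference translation (metadata) (2025-05-20T05:31:15Z)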
Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
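We propose two variance-reduced ZO estimators for complex gradient problems. We improve the state-of-the-art function query complexity from $\mathcal{O}\left(\min\left\{\frac{dn^{1/2}}{\epsilon^2}, \frac{d}{\epsilon^3}\right\}\right)$ to $\tilde{\mathcal{O}}\left(\frac{d}{\epsilon^2}\right)$.
Paper reference translation (metadata) (2024-10-03T15:04:01Z)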
Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions [26.543628010637036]
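We introduce a new adaptive variance reduction method that achieves an optimal convergence rate of $\mathcal{O}(\log T)$ for non-convex functions. We also extend the proposed method to obtain the same optimal rate of $\mathcal{O}(\log T)$ for compositional optimization.
Paper reference translation (metadata) (2024-06-04T04:39:51Z)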
Efficiently Escaping Saddle Points for Non-Convex Policy Optimization [40.0986936439803]
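Policy gradient (PG) is widely used in reinforcement learning because of its scalability and good performance. This paper proposes a variance-reduced second-order method that uses second-order information in the form of Hessian-vector products (HVP) and converges to an approximate second-order stationary point (SOSP) with sample complexity $\tilde{O}(\epsilon^{-3})$.
Paper reference translation (metadata) (2023-11-15T12:36:45Z)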
Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
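Trust-region (TR) methods and adaptive regularization using cubics (ARC) have been shown to have very appealing theoretical properties. We show that TR and ARC methods can simultaneously allow inexact computations of the Hessian, the gradient, and the function values.
Paper reference translation (metadata) (2023-10-18T10:29:58Z)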
Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction [26.9632099249085]
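We propose two new variants of SPS and SLS, called AdaSPS and AdaSLS, which guarantee convergence in non-interpolation settings. We equip AdaSPS and AdaSLS with a new variance reduction technique and obtain algorithms that require $\widetilde{\mathcal{O}}(n+1/\epsilon)$ gradient evaluations.
Paper reference translation (metadata) (2023-08-11T10:17:29Z)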
Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization [49.58290066287418]
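We propose a new method, the Multi-block-Single-probe Variance Reduced (MSVR) estimator, to reduce the complexity of compositional problems. Our results improve on prior work in several respects, including the order of the sample complexity and the dependence on strong convexity.
Paper reference translation (metadata) (2022-07-18T12:03:26Z)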
A Variance Controlled Stochastic Method with Biased Estimation for Faster Non-convex Optimization [0.0]
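We propose a new method, variance controlled stochastic gradient (VCSG), to improve the performance of stochastic variance reduced gradient (SVRG). A parameter $\lambda$ is introduced in VCSG to avoid over-reducing the variance as SVRG does. The number of gradient evaluations is $\mathcal{O}(\min\{1/\epsilon^{3/2}, n^{1/4}/\epsilon\})$.
Paper reference translation (metadata) (2021-02-19T12:22:56Z)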
Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds [72.63031036770425]
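We propose differentially private (DP) algorithms for bounded non-convex optimization. We demonstrate their empirical advantages over standard gradient methods on two popular deep learning tasks.
Paper reference translation (metadata) (2020-06-24T06:01:24Z)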
Gradient Free Minimax Optimization: Variance Reduction and Faster Convergence [120.9336529957224]
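This paper addresses gradient-free minimax optimization problems in the nonconvex-strongly-concave setting. We show that a new zeroth-order variance-reduced gradient descent ascent algorithm achieves the best known query complexity.
Paper reference translation (metadata) (2020-06-16T17:55:46Z)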

The related-papers list is generated automatically from the titles and abstracts of papers on this site.

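This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).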