Fugu-MT paper translation (abstract): Anderson acceleration of coordinate descent

Paper abstract: Anderson acceleration of coordinate descent

arxiv url: http://arxiv.org/abs/2011.10065v3
Date: Thu, 28 Oct 2021 16:17:24 GMT
Status: translation complete
Last updated in system: 2022-09-23 20:33:55.486706
Title: Anderson acceleration of coordinate descent
Authors: Quentin Bertrand and Mathurin Massias
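Abstract summary: On multiple machine learning problems, coordinate descent performs significantly better than full-gradient methods. This paper proposes an accelerated version of coordinate descent that uses extrapolation.
Reference score (independently computed attention score): 5.794599007795348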
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Acceleration of first order methods is mainly obtained via inertial techniques à la Nesterov, or via nonlinear extrapolation. The latter has known a recent surge of interest, with successful applications to gradient and proximal gradient techniques. On multiple Machine Learning problems, coordinate descent achieves performance significantly superior to full-gradient methods. Speeding up coordinate descent in practice is not easy: inertially accelerated versions of coordinate descent are theoretically accelerated, but might not always lead to practical speed-ups. We propose an accelerated version of coordinate descent using extrapolation, showing considerable speed up in practice, compared to inertial accelerated coordinate descent and extrapolated (proximal) gradient descent. Experiments on least squares, Lasso, elastic net and logistic regression validate the approach.
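As a rough illustration of the idea in the abstract, the sketch below applies Anderson-style extrapolation to cyclic coordinate descent on a least squares problem, one of the problem classes the abstract lists. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`cd_epoch`, `anderson_extrapolate`, `anderson_cd`), the window size `K`, the regularization constant `reg`, and the rule of keeping the extrapolated point only when it lowers the objective are all choices made here for illustration.

```python
import numpy as np


def cd_epoch(A, b, x, col_sq_norms):
    """One cyclic pass of exact coordinate minimization of 0.5 * ||A x - b||^2.

    Assumes every column of A is nonzero.
    """
    x = x.copy()
    r = b - A @ x  # residual, kept in sync with x
    for j in range(A.shape[1]):
        old = x[j]
        x[j] = old + A[:, j] @ r / col_sq_norms[j]
        r -= (x[j] - old) * A[:, j]
    return x


def anderson_extrapolate(iterates, reg=1e-10):
    """Combine K + 1 iterates into one extrapolated point.

    The weights c sum to one and minimize the norm of the weighted
    sum of successive differences (with a small regularization, an
    assumption added here for numerical stability).
    """
    X = np.array(iterates)      # shape (K + 1, n)
    U = np.diff(X, axis=0)      # successive differences, shape (K, n)
    G = U @ U.T
    scale = np.trace(G)
    if scale == 0.0:            # iterates stopped moving: nothing to extrapolate
        return X[-1]
    G = G + reg * scale / len(G) * np.eye(len(G))
    z = np.linalg.solve(G, np.ones(len(G)))
    c = z / z.sum()
    return c @ X[1:]


def anderson_cd(A, b, n_epochs=100, K=5):
    """Cyclic coordinate descent with an extrapolation step every K epochs."""
    col_sq_norms = (A ** 2).sum(axis=0)

    def objective(x):
        return 0.5 * np.sum((A @ x - b) ** 2)

    x = np.zeros(A.shape[1])
    iterates = [x]
    for _ in range(n_epochs):
        x = cd_epoch(A, b, x, col_sq_norms)
        iterates.append(x)
        if len(iterates) == K + 1:
            x_extr = anderson_extrapolate(iterates)
            # Safeguard: keep the extrapolated point only if it is better.
            if objective(x_extr) < objective(x):
                x = x_extr
            iterates = [x]      # restart the window from the current point
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((100, 10))
    b = rng.standard_normal(100)
    x_hat = anderson_cd(A, b)
    print("objective:", 0.5 * np.sum((A @ x_hat - b) ** 2))
```

The safeguard means the extrapolation step can never increase the objective, so in this sketch the method falls back to plain coordinate descent whenever extrapolation does not help. Extending it to Lasso, elastic net, or logistic regression would require replacing `cd_epoch` with the corresponding coordinate update.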

Related papers

Acceleration and Implicit Regularization in Gaussian Phase Retrieval [5.484345596034159]
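In this setting, we prove that methods with Polyak or Nesterov momentum have an implicit regularization that ensures well-behaved convex descent. Experimental evidence shows that these methods converge faster than gradient descent in practice.
Paper reference translation (metadata) (2023-11-21T04:10:03Z)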
ELRA: Exponential learning rate adaption gradient descent optimization method [83.88591755871734]
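We propose a fast (exponential rate), ab initio (hyperparameter-free), gradient-based adaptation method. The main idea of the method is to adapt $\alpha$ through situational awareness. It applies to problems of any dimension n and scales only linearly.
Paper reference translation (metadata) (2023-09-12T14:36:13Z)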
Accelerated gradient methods for nonconvex optimization: Escape trajectories from strict saddle points and convergence to local minima [9.66353475649122]
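This paper considers the problem of understanding the behavior of a general class of accelerated gradient methods on nonconvex functions. It shows that Nesterov's accelerated gradient method (NAG) with variable momentum avoids strict saddle points almost surely.
Paper reference translation (metadata) (2023-07-13T19:11:07Z)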
Proximal Subgradient Norm Minimization of ISTA and FISTA [8.261388753972234]
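We show that the squared proximal subgradient norm for the class of iterative shrinkage-thresholding algorithms (ISTA) converges at an inverse square rate. We also show that the squared proximal subgradient norm for the class of fast iterative shrinkage-thresholding algorithms (FISTA) is accelerated to converge at an inverse cubic rate.
Paper reference translation (metadata) (2022-11-03T06:50:19Z)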
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models [158.19276683455254]
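Adaptive gradient algorithms borrow the moving-average idea of heavy-ball acceleration to estimate the first-order moment of the gradient accurately and accelerate convergence. Nesterov acceleration converges faster than heavy-ball acceleration in theory, and also in many empirical cases. This paper proposes a Nesterov momentum estimation (NME) method that avoids the extra computation and memory overhead of additional gradient evaluations. Adan surpasses the corresponding SoTA optimizers on vision models (ViTs and CNNs) and sets new SoTA results for many popular networks.
Paper reference translation (metadata) (2022-08-13T16:04:39Z)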
On Training Implicit Models [75.20173180996501]
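We propose a new gradient estimate for implicit models, called the phantom gradient, which reduces the computational cost of the exact gradient. Experiments on large-scale tasks show that these lightweight phantom gradients accelerate the backward pass in training implicit models by about 1.7 times.
Paper reference translation (metadata) (2021-11-09T14:40:24Z)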
Adapting Stepsizes by Momentumized Gradients Improves Optimization and Generalization [89.66571637204012]
AdaMomentum on vision, and achieves state-of-the-art results on other tasks including language processing.
Paper reference translation (metadata) (2021-06-22T03:13:23Z)
Scaling transition from momentum stochastic gradient descent to plain stochastic gradient descent [1.7874193862154875]
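Momentum stochastic gradient descent uses the accumulated gradient as the update direction for the current parameters. Plain stochastic gradient descent is not corrected by the accumulated gradient. The TSGD algorithm trains faster, reaches higher accuracy, and is more stable.
Paper reference translation (metadata) (2021-06-12T11:42:04Z)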
Decreasing scaling transition from adaptive gradient descent to stochastic gradient descent [1.7874193862154875]
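This paper proposes DSTAda, a decreasing scaling transition method from adaptive gradient descent to stochastic gradient descent. Experimental results show that DSTAda is faster and more accurate, with better stability and robustness.
Paper reference translation (metadata) (2021-06-12T11:28:58Z)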
Leveraging Non-uniformity in First-order Non-convex Optimization [93.6817946818977]
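Non-uniform refinements of the objective function lead to new definitions, Non-uniform Smoothness (NS) and the Non-uniform Łojasiewicz inequality (NŁ). These inspire new geometry-aware first-order methods that converge to global optimality faster than the classical $\Omega(1/t^2)$ lower bound.
Paper reference translation (metadata) (2021-05-13T04:23:07Z)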
Cogradient Descent for Bilinear Optimization [124.45816011848096]
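We introduce the Cogradient Descent algorithm (CoGD) to address bilinear problems. One variable is solved by taking into account its coupling relationship with the other, which leads to a synchronous gradient descent. The algorithm is applied to problems in which one variable is under a sparsity constraint.
Paper reference translation (metadata) (2020-06-16T13:41:54Z)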
Adaptive Gradient Methods Can Be Provably Faster than SGD after Finite Epochs [25.158203665218164]
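We show that adaptive gradient methods are faster than random-shuffling SGD after finite epochs. To the best of our knowledge, this is the first demonstration that adaptive gradient methods are faster than SGD after finite epochs.
Paper reference translation (metadata) (2020-06-12T09:39:47Z)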
Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
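Adaptive algorithms use the history of gradients in their updates and are ubiquitous in training deep neural networks. This paper analyzes a variant of the OptimisticOA algorithm for nonconcave min-max problems. Experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
Paper reference translation (metadata) (2019-12-26T22:10:10Z)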

The related papers list is generated automatically from the titles and abstracts of papers on this site.