Fugu-MT Paper Translation (Abstract): Decentralized Riemannian Algorithm for Nonconvex Minimax Problems

Paper abstract: Decentralized Riemannian Algorithm for Nonconvex Minimax Problems

arxiv url: http://arxiv.org/abs/2302.03825v1
Date: Wed, 8 Feb 2023 01:42:45 GMT
Status: Translation complete
Last updated in system: 2023-02-09 17:44:01.637169
Title: Decentralized Riemannian Algorithm for Nonconvex Minimax Problems
Title (reference translation): Decentralized Riemannian Algorithm for Nonconvex Minimax Problems
Authors: Xidong Wu, Zhengmian Hu and Heng Huang
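Abstract summary: Minimax algorithms for neural networks have been developed to solve many problems. This paper proposes two types of minimax algorithms. For the stochastic setting, it proposes DRSGDA and proves that the method achieves a gradient complexity of $O(\epsilon^{-4})$.
Reference score (independently computed attention score): 82.50374560598493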
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Minimax optimization over Riemannian manifolds (possibly nonconvex constraints) has been actively applied to many problems, such as robust dimensionality reduction and deep neural networks with orthogonal weights (Stiefel manifold). Although many optimization algorithms for minimax problems have been developed in the Euclidean setting, it is difficult to convert them to the Riemannian case, and algorithms for nonconvex minimax problems with nonconvex constraints are even rarer. On the other hand, to address big-data challenges, decentralized (serverless) training techniques have recently emerged, since they can reduce communication overhead and avoid the bottleneck at the server node. Nonetheless, algorithms for decentralized Riemannian minimax problems have not been studied. In this paper, we study the distributed nonconvex-strongly-concave minimax optimization problem over the Stiefel manifold and propose both deterministic and stochastic minimax methods. The local model is nonconvex-strongly-concave and the Stiefel manifold is a nonconvex set. The global function is represented as the finite sum of local functions. For the deterministic setting, we propose DRGDA and prove that it achieves a gradient complexity of $O(\epsilon^{-2})$ under mild conditions. For the stochastic setting, we propose DRSGDA and prove that it achieves a gradient complexity of $O(\epsilon^{-4})$. DRGDA and DRSGDA are the first algorithms for distributed minimax optimization with nonconvex constraints with exact convergence. Extensive experimental results on training deep neural networks (DNNs) over the Stiefel manifold demonstrate the efficiency of our algorithms.
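Abstract (reference translation): Minimax optimization over Riemannian manifolds (possibly with nonconvex constraints) has been actively applied to solve many problems, such as robust dimensionality reduction and deep neural networks with orthogonal weights (Stiefel manifold). Many optimization algorithms for minimax problems have been developed in the Euclidean setting, but converting them to the Riemannian case is difficult, and algorithms for minimax problems with nonconvex constraints are rarer still. Meanwhile, to address big-data challenges, decentralized (serverless) training techniques have recently emerged to reduce communication overhead and avoid the bottleneck at the server node. Even so, algorithms for decentralized Riemannian minimax problems have not been studied. This paper studies the distributed nonconvex-strongly-concave minimax optimization problem over the Stiefel manifold and proposes deterministic and stochastic minimax methods. The local model is nonconvex-strongly-concave, and the Stiefel manifold is a nonconvex set. The global function is expressed as a finite sum of local functions. For the deterministic setting, we propose DRGDA and prove that the deterministic method achieves a gradient complexity of $O(\epsilon^{-2})$ under mild conditions. For the stochastic setting, we propose DRSGDA and prove that our stochastic method achieves a gradient complexity of $O(\epsilon^{-4})$. DRGDA and DRSGDA are the first algorithms for distributed minimax optimization with nonconvex constraints that have exact convergence. Extensive experimental results on training deep neural networks (DNNs) over the Stiefel manifold demonstrate the efficiency of the algorithms.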
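The abstract describes a problem that can plausibly be written as $\min_{x \in \mathrm{St}(d,r)} \max_{y} \frac{1}{n}\sum_{i=1}^{n} f_i(x,y)$, where each node $i$ holds a local function $f_i$ and $\mathrm{St}(d,r)$ is the Stiefel manifold of $d \times r$ matrices with orthonormal columns. The sketch below illustrates the basic ingredients such a method would need: projection onto the tangent space, a retraction back onto the manifold, and a gossip-style descent-ascent step. It is not the paper's DRGDA or DRSGDA, whose exact update rules are not given in the abstract. The toy objective, the ring mixing matrix `w`, the step sizes, and all function names are assumptions made for illustration only.

```python
import numpy as np


def sym(a):
    """Symmetric part of a square matrix."""
    return 0.5 * (a + a.T)


def project_tangent(x, g):
    """Orthogonal projection of g onto the tangent space of St(d, r) at x."""
    return g - x @ sym(x.T @ g)


def retract_qr(x, v):
    """QR-based retraction: map x + v back onto the Stiefel manifold."""
    q, r = np.linalg.qr(x + v)
    # Flip column signs so that diag(R) is positive, which makes Q unique.
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs


def local_grads(x, y, a, b, mu):
    """Euclidean gradients of the (assumed) toy local objective
    f_i(x, y) = -0.5 * tr(x^T a x) + y^T x^T b - 0.5 * mu * ||y||^2,
    which is strongly concave in y with modulus mu."""
    grad_x = -a @ x + np.outer(b, y)
    grad_y = x.T @ b - mu * y
    return grad_x, grad_y


def decentralized_step(xs, ys, data, w, mu, lr_x, lr_y):
    """One illustrative gossip + descent-ascent step on every node."""
    n = len(xs)
    new_xs, new_ys = [], []
    for i in range(n):
        a, b = data[i]
        gx, gy = local_grads(xs[i], ys[i], a, b, mu)
        # Consensus direction for x: neighbor average projected onto the
        # tangent space (it vanishes when all neighbors agree with node i).
        mix_x = sum(w[i, j] * xs[j] for j in range(n))
        direction = project_tangent(xs[i], mix_x) - lr_x * project_tangent(xs[i], gx)
        new_xs.append(retract_qr(xs[i], direction))
        # y lives in Euclidean space: plain averaging plus a gradient ascent step.
        mix_y = sum(w[i, j] * ys[j] for j in range(n))
        new_ys.append(mix_y + lr_y * gy)
    return new_xs, new_ys


rng = np.random.default_rng(0)
n, d, r, mu = 4, 6, 2, 1.0

# Doubly stochastic mixing matrix for a ring of n nodes (assumed topology).
w = np.zeros((n, n))
for i in range(n):
    w[i, i] = 0.5
    w[i, (i - 1) % n] = 0.25
    w[i, (i + 1) % n] = 0.25

# Random local data (a_i symmetric, b_i a vector); purely synthetic.
data = [(sym(rng.standard_normal((d, d))), rng.standard_normal(d)) for _ in range(n)]

# Every node starts from the same random point on the manifold.
x0 = retract_qr(rng.standard_normal((d, r)), np.zeros((d, r)))
xs = [x0.copy() for _ in range(n)]
ys = [np.zeros(r) for _ in range(n)]

for _ in range(50):
    xs, ys = decentralized_step(xs, ys, data, w, mu, lr_x=0.05, lr_y=0.1)

# Report how far the iterates are from satisfying x^T x = I.
err = max(np.linalg.norm(x.T @ x - np.eye(r)) for x in xs)
print("max orthogonality error:", err)
```

The retraction keeps every local iterate feasible at each step, which is the main difference from a Euclidean gossip update: simply averaging the `xs` would generally leave the manifold, because the Stiefel manifold is a nonconvex set.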

List of related papers