Fugu-MT 論文翻訳(概要): Improved Global Guarantees for the Nonconvex Burer--Monteiro Factorization via Rank Overparameterization

論文の概要: Improved Global Guarantees for the Nonconvex Burer--Monteiro Factorization via Rank Overparameterization

arxiv url: http://arxiv.org/abs/2207.01789v2
Date: Mon, 8 Jul 2024 10:58:33 GMT
ステータス: 翻訳完了
システム内更新日: 2024-07-10 06:02:09.132642
Title: Improved Global Guarantees for the Nonconvex Burer--Monteiro Factorization via Rank Overparameterization
Title（参考訳）: 非凸バーラのグローバル保証の改善--ランクオーバーパラメトリゼーションによるモンテイロ因子化
Authors: Richard Y. Zhang,
Abstract要約: 二つの微分可能な$L$-smooth, $mu$-strongly convex objective $phimph over a $ntimes n$frac14(L/mu-1)2rstar$を考える。非局所性にもかかわらず、局所最適化は、任意の初期点から大域的最適点へグローバルに収束することが保証される。
参考スコア（独自算出の注目度）: 10.787390511207683
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider minimizing a twice-differentiable, $L$-smooth, and $\mu$-strongly convex objective $\phi$ over an $n\times n$ positive semidefinite matrix $M\succeq0$, under the assumption that the minimizer $M^{\star}$ has low rank $r^{\star}\ll n$. Following the Burer--Monteiro approach, we instead minimize the nonconvex objective $f(X)=\phi(XX^{T})$ over a factor matrix $X$ of size $n\times r$. This substantially reduces the number of variables from $O(n^{2})$ to as few as $O(n)$ and also enforces positive semidefiniteness for free, but at the cost of giving up the convexity of the original problem. In this paper, we prove that if the search rank $r\ge r^{\star}$ is overparameterized by a \emph{constant factor} with respect to the true rank $r^{\star}$, namely as in $r>\frac{1}{4}(L/\mu-1)^{2}r^{\star}$, then despite nonconvexity, local optimization is guaranteed to globally converge from any initial point to the global optimum. This significantly improves upon a previous rank overparameterization threshold of $r\ge n$, which we show is sharp in the absence of smoothness and strong convexity, but would increase the number of variables back up to $O(n^{2})$. Conversely, without rank overparameterization, we prove that such a global guarantee is possible if and only if $\phi$ is almost perfectly conditioned, with a condition number of $L/\mu<3$. Therefore, we conclude that a small amount of overparameterization can lead to large improvements in theoretical guarantees for the nonconvex Burer--Monteiro factorization.
Abstract（参考訳）: 二つの微分可能な$L$-smooth と $\mu$-strongly convex objective $\phi$ over a $n\times n$ positive semidefinite matrix $M\succeq0$ を最小化することを考える。 Burer--Monteiro のアプローチに従い、非凸対象 $f(X)=\phi(XX^{T})$ を因子行列 $X$ of size $n\times r$ で最小化する。これにより、変数の数は$O(n^{2})$から$O(n)$まで大幅に減少し、また、元の問題の凸性を放棄するコストで、自由な半有限性も強制する。本稿では、サーチランク$r\ge r^{\star}$が真のランク$r^{\star}$、すなわち$r>\frac{1}{4}(L/\mu-1)^{2}r^{\star}$に対して超パラメータ化されている場合、非凸性にもかかわらず、局所最適化は任意の初期点から大域的最適点へ大域的に収束することを保証している。これは以前の階数オーバーパラメトリゼーションしきい値である$r\ge n$に対して著しく改善され、滑らかさと強い凸性が欠如していることが示されるが、変数の数が$O(n^{2})$まで増加する。逆に、階数オーバーパラメトリゼーションがなければ、そのような大域的な保証が可能であることは、$\phi$ がほぼ完全に条件付きであることと、条件番号 $L/\mu<3$ が成立することを証明している。したがって、小量の過パラメータ化は、非凸バーラ-モンテイロ分解の理論的保証に大きな改善をもたらす可能性があると結論付けている。

関連論文リスト

Efficient Continual Finite-Sum Minimization [52.5238287567572]
連続有限サム最小化(continuous finite-sum minimization)と呼ばれる有限サム最小化の鍵となるツイストを提案する。我々のアプローチは$mathcalO(n/epsilon)$ FOs that $mathrmStochasticGradientDescent$で大幅に改善されます。また、$mathcalOleft(n/epsilonalpharight)$ complexity gradient for $alpha 1/4$という自然な一階法は存在しないことを証明し、この方法の第一階法がほぼ密であることを示す。
論文参考訳（メタデータ） (2024-06-07T08:26:31Z)
Convergence of Gradient Descent with Small Initialization for Unregularized Matrix Completion [21.846732043706318]
バニラ勾配降下は、明示的な正則化を必要とせず、必ず基底真理$rmXstar$に収束することを示す。驚くべきことに、収束率も最終的な精度もオーバーパラメータ化された検索ランク$r'$に依存しておらず、それらは真のランク$r$によってのみ支配される。
論文参考訳（メタデータ） (2024-02-09T19:39:23Z)
On the $O(\frac{\sqrt{d}}{T^{1/4}})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm [59.65871549878937]
本稿では、RMSPropとその運動量拡張を考察し、$frac1Tsum_k=1Tの収束速度を確立する。我々の収束率は、次元$d$を除くすべての係数に関して下界と一致する。収束率は$frac1Tsum_k=1Tと類似していると考えられる。
論文参考訳（メタデータ） (2024-02-01T07:21:32Z)
A note on $L^1$-Convergence of the Empiric Minimizer for unbounded functions with fast growth [0.0]
V : mathbbRd to mathbbR$ coercive に対して、経験的最小値の$L1$-distance の収束率について検討する。一般に、高速な成長を持つ非有界函数に対しては、収束率は上述の$a_n n-1/q$で制限され、$q$は潜在確率変数の次元である。
論文参考訳（メタデータ） (2023-03-08T08:46:13Z)
On the Complexity of Decentralized Smooth Nonconvex Finite-Sum Optimization [21.334985032433778]
分散最適化問題 $min_bf xinmathbb Rd f(bf x)triq frac1msum_i=1m f_i(bf x)triq frac1nsum_j=1n。
論文参考訳（メタデータ） (2022-10-25T11:37:11Z)
A spectral least-squares-type method for heavy-tailed corrupted regression with unknown covariance \& heterogeneous noise [2.019622939313173]
重み付き最小二乗線形回帰は、少なくとも$epsilon n$ arbitrary outliersの$n$のラベル特徴サンプルを破損させたと仮定して再検討する。本稿では,$(Sigma,Xi) や $Xi$ の演算ノルムに関する知識を前提に,電力法に基づくほぼ最適に計算可能な推定器を提案する。
論文参考訳（メタデータ） (2022-09-06T23:37:31Z)
Preconditioned Gradient Descent for Overparameterized Nonconvex Burer--Monteiro Factorization with Global Optimality Certification [14.674494335647841]
非函数 $f(X)=phi(XXT)$ を$ntimes r$ factor matrix $X$ で最小化するために勾配降下を考える。本稿では,勾配降下の収束率を線形に戻すための安価なプレコンディショナーを提案する。
論文参考訳（メタデータ） (2022-06-07T14:26:49Z)
Low-Rank Approximation with $1/\epsilon^{1/3}$ Matrix-Vector Products [58.05771390012827]
我々は、任意のSchatten-$p$ノルムの下で、低ランク近似のためのクリロフ部分空間に基づく反復法について研究する。我々の主な成果は、$tildeO(k/sqrtepsilon)$ matrix-vector productのみを使用するアルゴリズムである。
論文参考訳（メタデータ） (2022-02-10T16:10:41Z)
Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
対象ベクトルの少数のエントリのみを問合せすることを目的とした線形回帰のためのアクティブサンプリングアルゴリズムについて検討する。我々はこの$d$への依存が対数的要因まで最適であることを示す。また、損失関数に対して最初の全感度上界$O(dmax1,p/2log2 n)$を提供し、最大で$p$成長する。
論文参考訳（メタデータ） (2021-11-09T00:20:01Z)
Convergence Rate of the (1+1)-Evolution Strategy with Success-Based Step-Size Adaptation on Convex Quadratic Functions [20.666734673282498]
1+1)-進化戦略(ES)と成功に基づくステップサイズ適応を一般凸二次関数で解析する。 1+1)-ES の収束速度は、一般凸二次函数上で明示的に厳密に導かれる。
論文参考訳（メタデータ） (2021-03-02T09:03:44Z)
Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
対数要因まで $(varepsilon,delta)$-differently private の最適過剰人口損失は $sqrtlog(d)/n + sqrtd/varepsilon n.$ です。損失関数がさらなる滑らかさの仮定を満たすとき、余剰損失は$sqrtlog(d)/n + (log(d)/varepsilon n)2/3で上界(対数因子まで)であることが示される。
論文参考訳（メタデータ） (2021-03-02T06:53:44Z)
Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness [151.67113334248464]
我々は、他の攻撃モデルに対してスムースな手法を拡張することは困難であることを示す。我々はCIFARに関する実験結果を示し,その理論を検証した。
論文参考訳（メタデータ） (2020-02-08T22:02:14Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。