Fugu-MT 論文翻訳(概要): Efficient Over-parameterized Matrix Sensing from Noisy Measurements via Alternating Preconditioned Gradient Descent

論文の概要: Efficient Over-parameterized Matrix Sensing from Noisy Measurements via Alternating Preconditioned Gradient Descent

arxiv url: http://arxiv.org/abs/2502.00463v2
Date: Thu, 06 Feb 2025 14:32:10 GMT
ステータス: 翻訳完了
システム内更新日: 2025-02-07 13:23:05.911695
Title: Efficient Over-parameterized Matrix Sensing from Noisy Measurements via Alternating Preconditioned Gradient Descent
Title（参考訳）: 交互プレコンディショニングによる雑音測定による過パラメータ化マトリックスの高効率検出
Authors: Zhiyu Liu, Zhi Han, Yandong Tang, Hai Zhang, Shaojie Tang, Yao Wang,
Abstract要約: 行列センシング問題の収束を早めるためのプレコンディショニング手法が提案されている。本稿では,2因子パラメータを交互に更新するAPGDアルゴリズムを提案する。理論的には、任意の乱数から始まる線形速度で APGD が準最適収束を達成することを証明している。
参考スコア（独自算出の注目度）: 17.73720530889677
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider the noisy matrix sensing problem in the over-parameterization setting, where the estimated rank $r$ is larger than the true rank $r_\star$. Specifically, our main objective is to recover a matrix $ X_\star \in \mathbb{R}^{n_1 \times n_2} $ with rank $ r_\star $ from noisy measurements using an over-parameterized factorized form $ LR^\top $, where $ L \in \mathbb{R}^{n_1 \times r}, \, R \in \mathbb{R}^{n_2 \times r} $ and $ \min\{n_1, n_2\} \ge r > r_\star $, with the true rank $ r_\star $ being unknown. Recently, preconditioning methods have been proposed to accelerate the convergence of matrix sensing problem compared to vanilla gradient descent, incorporating preconditioning terms $ (L^\top L + \lambda I)^{-1} $ and $ (R^\top R + \lambda I)^{-1} $ into the original gradient. However, these methods require careful tuning of the damping parameter $\lambda$ and are sensitive to initial points and step size. To address these limitations, we propose the alternating preconditioned gradient descent (APGD) algorithm, which alternately updates the two factor matrices, eliminating the need for the damping parameter and enabling faster convergence with larger step sizes. We theoretically prove that APGD achieves near-optimal error convergence at a linear rate, starting from arbitrary random initializations. Through extensive experiments, we validate our theoretical results and demonstrate that APGD outperforms other methods, achieving the fastest convergence rate. Notably, both our theoretical analysis and experimental results illustrate that APGD does not rely on the initialization procedure, making it more practical and versatile.
Abstract（参考訳）: 過パラメータ化設定では、推定階数 $r$ が真の階数 $r_\star$ よりも大きいノイズ行列センシング問題を考える。具体的には、行列 $ X_\star \in \mathbb{R}^{n_1 \times n_2} $ 階数 $ r_\star $ を、過度にパラメータ化された分解形式 $ LR^\top $, $ L \in \mathbb{R}^{n_1 \times r}, \, R \in \mathbb{R}^{n_2 \times r} $ と $ \min\{n_1, n_2\} \ge r > r_\star $ で、真の階数 $ r_\star $ は未知である。近年、行列知覚問題の収束をバニラ勾配よりも促進するために、事前条件項 $ (L^\top L + \lambda I)^{-1} $ と $ (R^\top R + \lambda I)^{-1} $ を元の勾配に組み込んだプレコンディショニング法が提案されている。しかし、これらのメソッドはダンピングパラメータ$\lambda$の注意深いチューニングを必要とし、初期点とステップサイズに敏感である。これらの制約に対処するために,2つの係数行列を交互に更新し,減衰パラメータの必要をなくし,より大きなステップサイズでより高速な収束を可能にする,交互事前条件勾配降下(APGD)アルゴリズムを提案する。 APGDが任意のランダム初期化から始まる線形速度でほぼ最適誤差収束を実現することを理論的に証明する。実験により, APGDが他の手法よりも優れ, 収束速度が最速であることが確認された。特に, 理論解析と実験結果から, APGDは初期化手順に頼らず, より実用的で汎用性が高いことが示唆された。

関連論文リスト

Preconditioned Gradient Descent for Over-Parameterized Nonconvex Matrix Factorization [19.32160757444549]
非特定行列分解の実例では、真の解体$のランクはしばしば未知であるため、モデルのランク子$は$r>rstar$として特異である。本稿では,行列センサの非行列因数分解のための安価なサブプロセッサを提案し,非依存の場合においても収束係数を線形に復元する。我々の数値実験により、PrecGD は他の変種非行列分解の収束を回復する上でも同様にうまく機能することがわかった。
論文参考訳（メタデータ） (2025-04-13T20:06:49Z)
Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
線形混合遷移カーネルを用いた最短経路(SSP)問題について検討する。エージェントは繰り返し環境と対話し、累積コストを最小化しながら特定の目標状態に到達する。既存の作業は、イテレーションコスト関数の厳密な下限や、最適ポリシーに対する期待長の上限を仮定することが多い。
論文参考訳（メタデータ） (2024-02-14T07:52:00Z)
Convergence of Gradient Descent with Small Initialization for Unregularized Matrix Completion [21.846732043706318]
バニラ勾配降下は、明示的な正則化を必要とせず、必ず基底真理$rmXstar$に収束することを示す。驚くべきことに、収束率も最終的な精度もオーバーパラメータ化された検索ランク$r'$に依存しておらず、それらは真のランク$r$によってのみ支配される。
論文参考訳（メタデータ） (2024-02-09T19:39:23Z)
Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems [61.85150061213987]
一般化線形モデル (GLM) フレームワークを用いて, citelu2021low で提案した一般化低ランク行列帯域問題について検討する。既存のアルゴリズムの計算不可能性と理論的制約を克服するため,まずG-ESTTフレームワークを提案する。 G-ESTT は $tildeO(sqrt(d_1+d_2)3/2Mr3/2T)$ bound of regret を達成でき、G-ESTS は $tildeO を達成できることを示す。
論文参考訳（メタデータ） (2024-01-14T14:14:19Z)
How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization [46.55524654398093]
過パラメータ化が降下の収束挙動をどのように変化させるかを示す。目的は、ほぼ等方的線形測定から未知の低ランクの地上構造行列を復元することである。本稿では,GDの一段階だけを修飾し,$alpha$に依存しない収束率を求める手法を提案する。
論文参考訳（メタデータ） (2023-10-03T03:34:22Z)
The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing [42.905196856926615]
$textsfScaledGD($lambda$)$は、低ランク行列センシング問題に取り組むための事前条件付き勾配降下法である。我々は、$textsfScaledGD($lambda$)$が、少数の反復の後、一定の線形速度で真の低ランク行列に収束することを示す。
論文参考訳（メタデータ） (2023-02-02T16:13:27Z)
Accelerated Single-Call Methods for Constrained Min-Max Optimization [5.266784779001398]
既存の方法は、各イテレーションで2つのグラデーションコールか2つのプロジェクションを必要とする。本稿では,RGOG(Optimistic Gradient)の変種が,非可換な min-max 収束率問題に富むことを示した。私たちの収束率は、自然や自然のような標準の尺度に当てはまる。
論文参考訳（メタデータ） (2022-10-06T17:50:42Z)
Best Policy Identification in Linear MDPs [70.57916977441262]
縮退した線形マルコフ+デルタ決定における最適同定問題について, 生成モデルに基づく固定信頼度設定における検討を行った。複雑な非最適化プログラムの解としての下位境界は、そのようなアルゴリズムを考案する出発点として用いられる。
論文参考訳（メタデータ） (2022-08-11T04:12:50Z)
Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
プレコンディショニングは、行列ベクトル乗算を含む反復的な方法にとって非常に効果的なステップである。プレコンディショニングには、これまで検討されていなかった付加的なメリットがあることを実証する。基本的に無視可能なコストで、同時に分散を低減することができる。
論文参考訳（メタデータ） (2021-07-01T06:43:11Z)
A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization [112.59170319105971]
問題に対処するための新しいアルゴリズム - Momentum- Single-timescale Approximation (MSTSA) を提案する。 MSTSAでは、低いレベルのサブプロブレムに対する不正確な解決策のため、反復でエラーを制御することができます。
論文参考訳（メタデータ） (2021-02-15T07:10:33Z)
A Simple Convergence Proof of Adam and Adagrad [74.24716715922759]
我々はAdam Adagradと$O(d(N)/st)$アルゴリズムの収束の証明を示す。 Adamはデフォルトパラメータで使用する場合と同じ収束$O(d(N)/st)$で収束する。
論文参考訳（メタデータ） (2020-03-05T01:56:17Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。