Fugu-MT 論文翻訳(概要): Improving Computational Complexity in Statistical Models with Second-Order Information

論文の概要: Improving Computational Complexity in Statistical Models with Second-Order Information

arxiv url: http://arxiv.org/abs/2202.04219v2
Date: Thu, 10 Feb 2022 03:09:31 GMT
ステータス: 翻訳完了
システム内更新日: 2022-02-11 13:02:38.352454
Title: Improving Computational Complexity in Statistical Models with Second-Order Information
Title（参考訳）: 2次情報を用いた統計モデルの計算複雑性の向上
Authors: Tongzheng Ren and Jiacheng Zhuo and Sujay Sanghavi and Nhat Ho
Abstract要約: パラメトリック統計モデルにおけるパラメータ推定のための正規化勾配降下法(NormGD)について検討する。我々は、NormGDアルゴリズムが最終的な統計的半径に達するために最適な計算量$mathcalO(n)$を達成することを示した。この計算複雑性は、固定ステップサイズ勾配勾配アルゴリズムよりも安価である。
参考スコア（独自算出の注目度）: 32.64382185990981
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: It is known that when the statistical models are singular, i.e., the Fisher information matrix at the true parameter is degenerate, the fixed step-size gradient descent algorithm takes polynomial number of steps in terms of the sample size $n$ to converge to a final statistical radius around the true parameter, which can be unsatisfactory for the application. To further improve that computational complexity, we consider the utilization of the second-order information in the design of optimization algorithms. Specifically, we study the normalized gradient descent (NormGD) algorithm for solving parameter estimation in parametric statistical models, which is a variant of gradient descent algorithm whose step size is scaled by the maximum eigenvalue of the Hessian matrix of the empirical loss function of statistical models. When the population loss function, i.e., the limit of the empirical loss function when $n$ goes to infinity, is homogeneous in all directions, we demonstrate that the NormGD iterates reach a final statistical radius around the true parameter after a logarithmic number of iterations in terms of $n$. Therefore, for fixed dimension $d$, the NormGD algorithm achieves the optimal overall computational complexity $\mathcal{O}(n)$ to reach the final statistical radius. This computational complexity is cheaper than that of the fixed step-size gradient descent algorithm, which is of the order $\mathcal{O}(n^{\tau})$ for some $\tau > 1$, to reach the same statistical radius. We illustrate our general theory under two statistical models: generalized linear models and mixture models, and experimental results support our prediction with general theory.
Abstract（参考訳）: 統計モデルが特異である場合、すなわち、真のパラメータのフィッシャー情報行列が縮退すると、固定されたステップサイズ勾配降下アルゴリズムは、実パラメータの周りの最終的な統計半径に収束するために、サンプルサイズ$n$の項で多項式数のステップを取る。計算複雑性をさらに改善するため,最適化アルゴリズムの設計における2次情報の利用を検討する。具体的には,統計モデルの経験的損失関数のヘッセン行列の最大固有値を用いて,ステップサイズをスケールした勾配降下アルゴリズムの変種であるパラメトリック統計モデルのパラメータ推定のための正規化勾配降下(NormGD)アルゴリズムについて検討する。集団損失関数、すなわち$n$が無限大になるときの経験的損失関数の極限がすべての方向に均質であるとき、NormGD の反復は$n$の対数的な反復数の後、真のパラメータの周りの最終的な統計的半径に達することを示した。したがって、固定次元$d$の場合、ノルムGDアルゴリズムは最終的な統計半径に達するために最適な計算複雑性$\mathcal{O}(n)$を達成する。この計算複雑性は、幾らかの$\tau > 1$に対して$\mathcal{O}(n^{\tau})$の次数である固定ステップサイズ勾配勾配アルゴリズムよりも低く、同じ統計半径に達する。一般化線形モデルと混合モデルという2つの統計モデルの下での一般理論を示し, 一般理論による予測を実験的に支持する。

関連論文リスト

Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
単次元モデル(Single-Index Models)は、植木構造における高次元回帰問題である。我々は,統計的クエリ (SQ) と低遅延多項式 (LDP) フレームワークの両方において,計算効率のよいアルゴリズムが必ずしも$Omega(dkstar/2)$サンプルを必要とすることを示した。
論文参考訳（メタデータ） (2024-03-08T18:50:19Z)
Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
我々は,モンテカルロサンプリングと反復線形解法を組み合わせた確率的アンローリングを導入し,行列逆転を回避した。理論的解析により,解法の繰り返しによる解法の解法と逆転が最大値推定の勾配推定を高速化することを示した。シミュレーションおよび実データ実験において、確率的アンロールは、モデル性能の損失を最小限に抑えながら、勾配EMよりも桁違いに高速な潜在ガウスモデルを学習することを示した。
論文参考訳（メタデータ） (2023-06-05T21:08:34Z)
An Exponentially Increasing Step-size for Parameter Estimation in Statistical Models [37.63410634069547]
本稿では,ガウス降下(GD)アルゴリズムのステップサイズを指数関数的に増加させることを提案する。次に、非正規統計モデルの下でパラメータ推定を解くためのEGDアルゴリズムについて検討する。 EGDアルゴリズムの総計算複雑性は、非正則統計モデルにおけるパラメータ推定の解法として、GDよりも最適で指数関数的に安価である。
論文参考訳（メタデータ） (2022-05-16T21:36:22Z)
Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems [120.21685755278509]
本研究は,ステップサイズの減衰が正確な収束に必要であるという事実と,一定のステップサイズがエラーまでの時間でより速く学習するという事実のバランスをとることを目的とする。ステップサイズのミニバッチを最初から修正するのではなく,パラメータを適応的に進化させることを提案する。
論文参考訳（メタデータ） (2020-07-02T16:02:02Z)
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
本稿では、生成モデル(シミュレータ)へのアクセスを想定して、強化学習のサンプル効率について検討する。最初に$gamma$-discounted infinite-horizon Markov decision process (MDPs) with state space $mathcalS$ and action space $mathcalA$を考える。対象の精度を考慮すれば,モデルに基づく計画アルゴリズムが最小限のサンプルの複雑さを実現するのに十分であることを示す。
論文参考訳（メタデータ） (2020-05-26T17:53:18Z)
Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
我々は,人口レベルでのアルゴリズムの決定論的収束率と,$n$サンプルに基づく経験的対象に適用した場合の(不安定性)の間の相互作用に基づいて,統計的精度を得るフレームワークを開発する。本稿では,ガウス混合推定,非線形回帰モデル,情報的非応答モデルなど,いくつかの具体的なモデルに対する一般結果の応用について述べる。
論文参考訳（メタデータ） (2020-05-22T22:30:52Z)
Dual Stochastic Natural Gradient Descent and convergence of interior half-space gradient approximations [0.0]
多項ロジスティック回帰(MLR)は統計学や機械学習で広く使われている。勾配降下(SGD)は、ビッグデータシナリオにおけるMLRモデルのパラメータを決定する最も一般的な手法である。
論文参考訳（メタデータ） (2020-01-19T00:53:49Z)
Statistical Inference for Model Parameters in Stochastic Gradient Descent [45.29532403359099]
勾配降下係数(SGD)は,その計算効率とメモリ効率から,大規模データの統計的推定に広く用いられている。人口減少関数が強い凸であり,一定の条件を満たす場合,SGDに基づく真のモデルパラメータの統計的推測の問題について検討する。
論文参考訳（メタデータ） (2016-10-27T07:04:21Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。