Fugu-MT Paper Translation (Abstract): Differentiable Annealed Importance Sampling and the Perils of Gradient Noise

Paper overview: Differentiable Annealed Importance Sampling and the Perils of Gradient Noise

arxiv url: http://arxiv.org/abs/2107.10211v1
Date: Wed, 21 Jul 2021 17:10:14 GMT
Status: translation complete
Last updated in system: 2021-07-22 14:22:16.856328
Title: Differentiable Annealed Importance Sampling and the Perils of Gradient Noise
Title (reference translation): Differentiable Annealed Importance Sampling and the Perils of Gradient Noise
Authors: Guodong Zhang, Kyle Hsu, Jianing Li, Chelsea Finn, Roger Grosse
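Abstract summary: Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation. Differentiability is a desirable property because it admits the possibility of optimizing the marginal likelihood as an objective. We propose a differentiable AIS algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
Reference score (attention score computed independently by this site): 68.44523807580438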
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation, but are not fully differentiable due to the use of Metropolis-Hastings (MH) correction steps. Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective using gradient-based methods. To this end, we propose a differentiable AIS algorithm by abandoning MH steps, which further unlocks mini-batch computation. We provide a detailed convergence analysis for Bayesian linear regression which goes beyond previous analyses by explicitly accounting for non-perfect transitions. Using this analysis, we prove that our algorithm is consistent in the full-batch setting and provide a sublinear convergence rate. However, we show that the algorithm is inconsistent when mini-batch gradients are used due to a fundamental incompatibility between the goals of last-iterate convergence to the posterior and elimination of the pathwise stochastic error. This result is in stark contrast to our experience with stochastic optimization and stochastic gradient Langevin dynamics, where the effects of gradient noise can be washed out by taking more steps of a smaller size. Our negative result relies crucially on our explicit consideration of convergence to the stationary distribution, and it helps explain the difficulty of developing practically effective AIS-like algorithms that exploit mini-batch gradients.
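Abstract (reference translation): Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation, but they are not fully differentiable because of their Metropolis-Hastings (MH) correction steps. Differentiability is a desirable property because it admits the possibility of optimizing the marginal likelihood as an objective using gradient-based methods. To this end, we propose a differentiable AIS algorithm that abandons MH steps, which further unlocks mini-batch computation. We provide a detailed convergence analysis for Bayesian linear regression that goes beyond previous analyses by explicitly accounting for non-perfect transitions. Using this analysis, we prove that the algorithm is consistent in the full-batch setting and obtain a sublinear convergence rate. However, we show that the algorithm is inconsistent when mini-batch gradients are used, because of a fundamental incompatibility between the goal of last-iterate convergence to the posterior and the goal of eliminating the pathwise stochastic error. This result is in stark contrast to our experience with stochastic optimization and stochastic gradient Langevin dynamics, where the effects of gradient noise can be washed out by taking more steps of a smaller size. Our negative result relies crucially on our explicit consideration of convergence to the stationary distribution, and it helps explain the difficulty of developing practically effective AIS-like algorithms that exploit mini-batch gradients.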
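
To make the idea of an AIS estimator without MH correction steps more concrete, the following is a minimal sketch and not the authors' algorithm. The abstract does not state which transition kernel the paper uses; this sketch substitutes unadjusted Langevin steps and weights each particle by the ratio of backward to forward transition densities, which is one standard way to keep the estimate of the normalizer unbiased without accept/reject steps. The 1-D Gaussian target, the linear annealing schedule, and every name and parameter value (`estimate_log_z`, `MU`, `SIGMA`, `step`, and so on) are assumptions made for illustration.

```python
import numpy as np

# Illustrative 1-D problem; all values here are assumptions for this sketch.
MU, SIGMA = 1.0, 0.5  # unnormalized target: exp(-(x - MU)^2 / (2 * SIGMA^2))
TRUE_LOG_Z = np.log(SIGMA * np.sqrt(2.0 * np.pi))  # known normalizer, used only as a check


def log_p0(x):
    """Normalized log-density of the initial distribution N(0, 1)."""
    return -0.5 * x**2 - 0.5 * np.log(2.0 * np.pi)


def log_f_target(x):
    """Unnormalized log-density of the target."""
    return -0.5 * ((x - MU) / SIGMA) ** 2


def grad_log_annealed(x, beta):
    """Gradient of (1 - beta) * log_p0(x) + beta * log_f_target(x)."""
    return (1.0 - beta) * (-x) + beta * (-(x - MU) / SIGMA**2)


def log_normal(x, mean, var):
    """Log-density of N(mean, var) evaluated at x."""
    return -0.5 * (x - mean) ** 2 / var - 0.5 * np.log(2.0 * np.pi * var)


def estimate_log_z(n_particles=2000, n_steps=200, step=0.05, seed=0):
    """Estimate log Z of the target with annealed, MH-free Langevin transitions."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)  # exact samples from p0
    log_w = -log_p0(x)                    # weight starts as 1 / p0(x_0)
    for k in range(1, n_steps + 1):
        beta = k / n_steps                # linear annealing schedule (assumption)
        # Forward unadjusted Langevin move; there is no accept/reject step.
        fwd_mean = x + step * grad_log_annealed(x, beta)
        x_new = fwd_mean + np.sqrt(2.0 * step) * rng.standard_normal(n_particles)
        # Backward kernel: the same Langevin proposal, started from x_new.
        bwd_mean = x_new + step * grad_log_annealed(x_new, beta)
        log_w += log_normal(x, bwd_mean, 2.0 * step)
        log_w -= log_normal(x_new, fwd_mean, 2.0 * step)
        x = x_new
    log_w += log_f_target(x)              # numerator ends with f_T(x_K)
    m = log_w.max()                       # log-mean-exp, computed stably
    return m + np.log(np.mean(np.exp(log_w - m)))


if __name__ == "__main__":
    print("estimated log Z:", estimate_log_z())
    print("true log Z:     ", TRUE_LOG_Z)
```

Because no step involves a discrete accept/reject decision, every operation is a smooth function of the model parameters once the noise draws are fixed, so the same computation written in an automatic differentiation framework could be differentiated end to end. The sketch uses exact full gradients throughout; replacing them with mini-batch gradient estimates is the setting in which the abstract reports inconsistency.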

Related papers

Quantitative Error Bounds for Scaling Limits of Stochastic Iterative Algorithms [10.022615790746466]
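We derive non-asymptotic functional approximation error bounds between the algorithm's sample paths and its Ornstein-Uhlenbeck approximation. We use our main result to construct error bounds in two common metrics: the Lévy-Prokhorov and bounded Wasserstein distances.
Paper reference translation (metadata) (2025-01-21T15:29:11Z)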
Robust Stochastic Optimization via Gradient Quantile Clipping [6.2844649973308835]
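We introduce a quantile clipping strategy for stochastic gradient descent (SGD) that uses quantiles of the gradient norm as clipping thresholds and tolerates outliers. We propose an implementation of the algorithm using rolling quantiles.
Paper reference translation (metadata) (2023-09-29T15:24:48Z)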
Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent [43.097493761380186]
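Stochastic gradient algorithms are an effective method for solving linear systems. We show that stochastic gradient descent produces accurate predictions even when it does not converge to the optimum. Empirically, stochastic gradient descent achieves state-of-the-art performance on sufficiently large-scale or ill-conditioned regression tasks.
Paper reference translation (metadata) (2023-06-20T15:07:37Z)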
High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance [59.211456992422136]
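We propose algorithms with high-probability convergence results under less restrictive assumptions. These results justify using the considered methods on problems that do not fit standard function classes.
Paper reference translation (metadata) (2023-02-02T10:37:23Z)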
Convergence of the mini-batch SIHT algorithm [0.0]
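The Iterative Hard Thresholding (IHT) algorithm has been widely studied as an effective deterministic algorithm for sparse optimization. We show that the sequence generated by the sparse mini-batch SIHT is a supermartingale sequence and converges with probability one.
Paper reference translation (metadata) (2022-09-29T03:47:46Z)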
Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond [63.59034509960994]
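We study shuffling-based variants: minibatch and local Random Reshuffling. For smooth functions satisfying the Polyak-Łojasiewicz condition, we obtain convergence bounds showing that these shuffling-based variants converge faster than their with-replacement counterparts. We propose an algorithmic modification called synchronized shuffling, which achieves convergence rates faster than the lower bounds in near-homogeneous settings.
Paper reference translation (metadata) (2021-10-20T02:25:25Z)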
On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
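We analyze the stochastic extragradient (SEG) method with a constant step size and present variations of the method that yield favorable convergence. We prove that, when augmented with iteration averaging, SEG provably converges to the Nash equilibrium, and that incorporating a scheduled restarting procedure provably accelerates this rate.
Paper reference translation (metadata) (2021-06-30T17:51:36Z)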
High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
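It is essential to have theoretical guarantees that an algorithm yields a small objective residual with high probability. Existing methods for non-smooth convex optimization have complexity bounds that depend on the confidence level. We therefore propose novel step-size rules for two methods with gradient clipping.
Paper reference translation (metadata) (2021-06-10T17:54:21Z)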
Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
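This work concerns zeroth-order (ZO) optimization, which requires no first-order information. We show that, with a graceful design of coordinate importance sampling, the ZO optimization method is efficient in terms of both complexity and function query cost.
Paper reference translation (metadata) (2020-12-21T17:29:58Z)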
Amortized variance reduction for doubly stochastic objectives [17.064916635597417]
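Approximate inference in complex probabilistic models requires optimizing doubly stochastic objective functions. Current approaches do not account for how mini-batching affects sampling stochasticity, which results in sub-optimal variance reduction. We propose a method that uses a recognition network to cheaply approximate the optimal control variate for each mini-batch.
Paper reference translation (metadata) (2020-03-09T13:23:14Z)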
Non-asymptotic bounds for stochastic optimization with biased noisy gradient oracles [8.655294504286635]
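We introduce biased gradient oracles to capture settings where function measurements carry an estimation error. The proposed oracles arise in practical situations such as risk-measure estimation from a batch of independent and identically distributed simulations.
Paper reference translation (metadata) (2020-02-26T12:53:04Z)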

The related papers list is generated automatically from the titles and abstracts of papers on this site.
