Fugu-MT Paper Translation (Abstract): Markov Chain Variance Estimation: A Stochastic Approximation Approach

Paper overview: Markov Chain Variance Estimation: A Stochastic Approximation Approach

arxiv url: http://arxiv.org/abs/2409.05733v1
Date: Mon, 9 Sep 2024 15:42:28 GMT
Status: Translation complete
Last updated in system: 2024-09-10 14:06:46.361019
Title: Markov Chain Variance Estimation: A Stochastic Approximation Approach
Title (reference translation): Markov Chain Variance Estimation: A Stochastic Approximation Approach
Authors: Shubhada Agrawal, Prashanth L. A., Siva Theja Maguluri
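Abstract summary: We consider the problem of estimating the asymptotic variance of a function defined on a Markov chain. We design the first recursive estimator that requires $O(1)$ computation at each step. We also show applications of the estimator in average reward reinforcement learning.
Reference score (attention score computed independently by this site): 14.883782513177094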
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We consider the problem of estimating the asymptotic variance of a function defined on a Markov chain, an important step for statistical inference of the stationary mean. We design the first recursive estimator that requires $O(1)$ computation at each step, does not require storing any historical samples or any prior knowledge of run-length, and has optimal $O(\frac{1}{n})$ rate of convergence for the mean-squared error (MSE) with provable finite sample guarantees. Here, $n$ refers to the total number of samples generated. The previously best-known rate of convergence in MSE was $O(\frac{\log n}{n})$, achieved by jackknifed estimators, which also do not enjoy these other desirable properties. Our estimator is based on linear stochastic approximation of an equivalent formulation of the asymptotic variance in terms of the solution of the Poisson equation. We generalize our estimator in several directions, including estimating the covariance matrix for vector-valued functions, estimating the stationary variance of a Markov chain, and approximately estimating the asymptotic variance in settings where the state space of the underlying Markov chain is large. We also show applications of our estimator in average reward reinforcement learning (RL), where we work with asymptotic variance as a risk measure to model safety-critical applications. We design a temporal-difference type algorithm tailored for policy evaluation in this context. We consider both the tabular and linear function approximation settings. Our work paves the way for developing actor-critic style algorithms for variance-constrained RL.
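Abstract (reference translation): We consider the problem of estimating the asymptotic variance of a function defined on a Markov chain, which is an important step in statistical inference for the stationary mean. We design the first recursive estimator that requires $O(1)$ computation at each step, needs neither stored historical samples nor prior knowledge of the run length, and attains the optimal $O(\frac{1}{n})$ convergence rate for the mean-squared error (MSE) with provable finite-sample guarantees. Here, $n$ is the total number of samples generated. The previously best-known MSE convergence rate was $O(\frac{\log n}{n})$, achieved by jackknifed estimators, which also lack these other desirable properties. Our estimator is based on linear stochastic approximation of an equivalent formulation of the asymptotic variance in terms of the solution of the Poisson equation. We generalize the estimator in several directions, including estimating the covariance matrix for vector-valued functions, estimating the stationary variance of a Markov chain, and approximately estimating the asymptotic variance when the state space of the underlying Markov chain is large. We also show applications of the estimator in average reward reinforcement learning (RL), where the asymptotic variance serves as a risk measure for modeling safety-critical applications. We design a temporal-difference type algorithm tailored for policy evaluation in this context, and consider both the tabular and linear function approximation settings. Our work paves the way for developing actor-critic style algorithms for variance-constrained RL.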
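
The recursive idea in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm; it only assumes the standard identity $\sigma^2 = 2\,\mathbb{E}_\pi[(f-\bar f)V] - \mathbb{E}_\pi[(f-\bar f)^2]$, where $\bar f$ is the stationary mean of $f$ and $V$ solves the Poisson equation $V - PV = f - \bar f$. The tabular TD(0)-style update, the step-size schedule, and the names `make_two_state_chain` and `estimate_asymptotic_variance` are all assumptions made for this illustration.

```python
import random


def make_two_state_chain(p01, p10, rng):
    """Return a sampler for a two-state chain (hypothetical helper).

    p01 is the probability of moving 0 -> 1, p10 of moving 1 -> 0.
    """
    def step(x):
        u = rng.random()
        if x == 0:
            return 1 if u < p01 else 0
        return 0 if u < p10 else 1
    return step


def estimate_asymptotic_variance(step, f, x0, n_samples, n_states):
    """Recursive estimate of the asymptotic variance of f along one trajectory.

    Each step does O(1) work and stores no past samples. The step sizes
    below are illustrative choices, not the ones analysed in the paper.
    """
    f_bar = 0.0            # running estimate of the stationary mean of f
    v = [0.0] * n_states   # tabular estimate of a Poisson-equation solution
    kappa = 0.0            # running estimate of the asymptotic variance
    x = x0
    for n in range(1, n_samples + 1):
        x_next = step(x)
        # Running average of f(X_n).
        f_bar += (f[x] - f_bar) / n
        centered = f[x] - f_bar
        # TD(0)-style update toward V - PV = f - f_bar at the visited state.
        beta = 1.0 / (n + 1) ** 0.6
        v[x] += beta * (centered + v[x_next] - v[x])
        # Running average of 2 (f - f_bar) V - (f - f_bar)^2.
        kappa += (2.0 * centered * v[x] - centered ** 2 - kappa) / n
        x = x_next
    return kappa


if __name__ == "__main__":
    rng = random.Random(0)
    step = make_two_state_chain(0.1, 0.2, rng)
    print(estimate_asymptotic_variance(step, [0.0, 1.0], 0, 400_000, 2))
```

For this two-state example the asymptotic variance has the closed form $\mathrm{Var}_\pi(f)\,\frac{1+\lambda}{1-\lambda}$ with $\lambda = 1 - p_{01} - p_{10} = 0.7$ and $\mathrm{Var}_\pi(f) = 2/9$, which is about $1.26$, so the printed estimate can be compared against that value.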

Related papers

Stabilizing Fixed-Point Iteration for Markov Chain Poisson Equations [49.702772230127465]
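We study finite-state Markov chains with $n$ states and transition matrix $P$. We show that all non-decaying modes are captured by a real peripheral invariant subspace $\mathcal{K}(P)$, and that the induced operator on the quotient space $\mathbb{R}^n/\mathcal{K}(P)$ is strictly contractive, yielding a unique quotient solution.
Paper reference translation (metadata) (2026-01-31T02:57:01Z)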
Uncertainty quantification for Markov chain induced martingales with application to temporal difference learning [55.197497603087065]
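We analyze the performance of temporal difference learning algorithms with linear function approximation. We establish novel and general high-dimensional concentration inequalities and Berry-Esseen bounds for vector-valued martingales induced by Markov chains.
Paper reference translation (metadata) (2025-02-19T15:33:55Z)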
Online Covariance Estimation in Nonsmooth Stochastic Approximation [14.818683408659764]
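We consider applying stochastic approximation (SA) methods to solve nonsmooth variational inclusion problems. Our convergence rate matches the best known rate for statistical estimation methods.
Paper reference translation (metadata) (2025-02-07T20:16:51Z)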
Semiparametric conformal prediction [79.6147286161434]
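We construct conformal prediction sets that account for the joint correlation structure of vector-valued non-conformity scores. We flexibly estimate the cumulative distribution function (CDF) of the scores. The proposed method yields the desired coverage and competitive efficiency on real-world regression problems.
Paper reference translation (metadata) (2024-11-04T14:29:02Z)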
Statistical Inference in Classification of High-dimensional Gaussian Mixture [1.2354076490479515]
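We study the behavior of a general class of regularized convex classifiers in the high-dimensional limit. Our focus is on the generalization error and variable selection properties of the estimators.
Paper reference translation (metadata) (2024-10-25T19:58:36Z)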
Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry [0.0]
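We introduce a multi-fidelity estimator of covariance matrices that exploits the log-Euclidean geometry of the symmetric positive-definite manifold. We develop an optimal sample allocation scheme that minimizes the mean-squared error of the estimator given a fixed budget. Evaluations of our approach on data from physical applications show more accurate metric learning and speedups of more than one order of magnitude compared with benchmarks.
Paper reference translation (metadata) (2023-01-31T16:33:46Z)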
Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming [53.63469275932989]
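We consider online statistical inference for constrained nonlinear optimization problems. We apply the stochastic sequential quadratic programming (StoSQP) method to solve these problems.
Paper reference translation (metadata) (2022-05-27T00:34:03Z)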
Optimal variance-reduced stochastic approximation in Banach spaces [114.8734960258221]
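We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space. We establish non-asymptotic bounds for both the operator defect and the estimation error.
Paper reference translation (metadata) (2022-01-21T02:46:57Z)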
Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
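We estimate the value function of an infinite-horizon discounted Markov reward process (MRP) using reproducing kernel Hilbert spaces. We derive a non-asymptotic upper bound on the error that depends explicitly on the eigenvalues of the associated kernel operator. We prove minimax lower bounds over sub-classes of MRPs.
Paper reference translation (metadata) (2021-09-24T14:48:20Z)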
Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
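We consider the task of heavy-tailed statistical estimation from streaming $p$-dimensional samples. We design a gradient descent method under a more nuanced condition on the gradient noise and provide a more detailed analysis.
Paper reference translation (metadata) (2021-08-25T21:30:27Z)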
$\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator [95.71091446753414]
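We propose using a nearest-neighbor-based $\gamma$-divergence estimator as the data discrepancy measure. Our method achieves higher robustness than existing discrepancy measures.
Paper reference translation (metadata) (2020-06-13T06:09:27Z)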
On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration [115.1954841020189]
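We study the asymptotic and non-asymptotic properties of stochastic approximation procedures with Polyak-Ruppert averaging. We prove a central limit theorem (CLT) for the averaged iterates with a fixed step size as the number of iterations goes to infinity.
Paper reference translation (metadata) (2020-04-09T17:54:18Z)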
Statistical Inference for Model Parameters in Stochastic Gradient Descent [45.29532403359099]
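Stochastic gradient descent (SGD) is widely used for statistical estimation with large-scale data because of its computational and memory efficiency. We study the problem of statistical inference for the true model parameters based on SGD when the population loss function is strongly convex and satisfies certain conditions.
Paper reference translation (metadata) (2016-10-27T07:04:21Z)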

The related papers list is generated automatically from the titles and abstracts of papers on this site.
