Fugu-MT Paper Translation (Abstract): Unifying On- and Off-Policy Variance Reduction Methods

Paper summary: Unifying On- and Off-Policy Variance Reduction Methods

arxiv url: http://arxiv.org/abs/2603.08370v1
Date: Mon, 09 Mar 2026 13:32:39 GMT
Status: Translation complete
Last updated in system: 2026-03-10 15:13:16.092147
Title: Unifying On- and Off-Policy Variance Reduction Methods
Title (reference translation): Unifying On- and Off-Policy Variance Reduction
Authors: Olivier Jeunen
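Abstract summary: We show that the standard online Difference-in-Means estimator is mathematically identical to an off-policy Inverse Propensity Scoring estimator equipped with an optimal additive control variate. Extending this unification, we show that widespread regression adjustment methods are structurally equivalent to Doubly Robust estimation.
Reference score (independently computed attention score): 8.291484471359633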
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Continuous and efficient experimentation is key to the practical success of user-facing applications on the web, both through online A/B-tests and off-policy evaluation. Despite their shared objective -- estimating the incremental value of a treatment -- these domains often operate in isolation, utilising distinct terminologies and statistical toolkits. This paper bridges that divide by establishing a formal equivalence between their canonical variance reduction methods. We prove that the standard online Difference-in-Means estimator is mathematically identical to an off-policy Inverse Propensity Scoring estimator equipped with an optimal (variance-minimising) additive control variate. Extending this unification, we demonstrate that widespread regression adjustment methods (such as CUPED, CUPAC, and ML-RATE) are structurally equivalent to Doubly Robust estimation. This unified view extends our understanding of commonly used approaches, and can guide practitioners and researchers working on either class of problems.
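Abstract (reference translation): Continuous and efficient experimentation, through both online A/B tests and off-policy evaluation, is key to the practical success of user-facing applications on the web. Despite their shared objective (estimating the incremental value of a treatment), these domains often operate in isolation and use distinct terminology and statistical toolkits. This paper bridges that divide by establishing a formal equivalence between their canonical variance reduction methods. We prove that the standard online Difference-in-Means estimator is mathematically identical to an off-policy Inverse Propensity Scoring estimator equipped with an optimal (variance-minimising) additive control variate. Extending this unification, we show that widespread regression adjustment methods such as CUPED, CUPAC, and ML-RATE are structurally equivalent to Doubly Robust estimation. This unified view extends the understanding of commonly used approaches and can guide practitioners and researchers working on either class of problems.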
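
Illustrative sketch (not from the paper): the abstract does not spell out its construction, so the following only shows, under assumptions made here, the kind of identity it describes. The simulated A/B test, the function names, and the choice of per-arm sample means as the additive control variates are all hypothetical; whether this matches the paper's optimal control variate is not confirmed by the abstract. With that choice, an IPS estimate with control variates collapses to the Difference-in-Means estimate.

```python
import random


def difference_in_means(treated, outcomes):
    """Plain A/B estimate: mean outcome of treated units minus mean of control units."""
    y1 = [y for a, y in zip(treated, outcomes) if a]
    y0 = [y for a, y in zip(treated, outcomes) if not a]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


def ips_with_control_variate(treated, outcomes, p, beta1, beta0):
    """IPS treatment-effect estimate with per-arm additive control variates.

    Each arm's value is estimated as beta + mean(w * (y - beta)), where w is the
    inverse-propensity weight of that arm and p is the (assumed known)
    probability of assigning treatment.
    """
    n = len(outcomes)
    v1 = beta1 + sum((y - beta1) / p for a, y in zip(treated, outcomes) if a) / n
    v0 = beta0 + sum((y - beta0) / (1 - p) for a, y in zip(treated, outcomes) if not a) / n
    return v1 - v0


# Hypothetical simulated experiment: 30% of units treated, true effect 0.5.
rng = random.Random(0)
p = 0.3
treated = [rng.random() < p for _ in range(1000)]
outcomes = [rng.gauss(1.0 + 0.5 * a, 1.0) for a in treated]

# Assumption: use each arm's sample mean as its control variate.
y1 = [y for a, y in zip(treated, outcomes) if a]
y0 = [y for a, y in zip(treated, outcomes) if not a]
beta1 = sum(y1) / len(y1)
beta0 = sum(y0) / len(y0)

dim = difference_in_means(treated, outcomes)
ips_cv = ips_with_control_variate(treated, outcomes, p, beta1, beta0)
ips_plain = ips_with_control_variate(treated, outcomes, p, 0.0, 0.0)

print("Difference-in-Means:       ", dim)
print("IPS with control variates: ", ips_cv)
print("IPS without control variate:", ips_plain)
```

The first two printed values agree up to floating-point error because the residuals `y - beta` sum to zero within each arm, so the importance-weighted correction term vanishes. Plain IPS generally differs, since the realised share of treated units rarely equals `p` exactly.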

Related papers