Fugu-MT Paper Translation (Abstract): An Introduction to Causal Reinforcement Learning

Paper abstract: An Introduction to Causal Reinforcement Learning

arxiv url: http://arxiv.org/abs/2606.24160v1
Date: Tue, 23 Jun 2026 05:28:33 GMT
Status: translation complete
Last updated in system: 2026-06-24 22:16:48.790793
Title: An Introduction to Causal Reinforcement Learning
Title (reference translation): Introduction to Causal Reinforcement Learning
Authors: Elias Bareinboim, Junzhe Zhang, Sanghack Lee
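Abstract summary: Causal inference and reinforcement learning operate over different aspects of the same building block, counterfactual relations. The paper puts different modes of learning, including online, off-policy, and causal calculus learning, under a unifying treatment. Specifically, it introduces and discusses through causal lenses generalized policy learning, where to intervene, imitation learning, and counterfactual learning.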
Reference score (independently computed attention score): 58.680653905480284
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Causal inference provides a set of principles and tools that allow one to combine data and knowledge about an environment to reason with questions of counterfactual nature, i.e., what would have happened had reality been different, even when no data of this unrealized reality is currently available. Reinforcement learning provides methods to learn a policy that optimizes a specific measure (e.g., reward, regret) when the agent is deployed in an environment and pursues an exploratory, trial-and-error approach. These two disciplines have evolved independently and with virtually no interaction between them. We note that they operate over different aspects of the same building block, counterfactual relations, which makes them umbilically connected. Based on these observations, novel learning opportunities arise when this connection is explicitly acknowledged and mathematized. To realize this potential, we note that any environment where the RL agent is deployed can be decomposed as a collection of autonomous mechanisms with different causal invariances, parsimoniously modeled as a structural causal model; any standard RL setting implicitly encodes such a model. This formalization allows us to put under a unifying treatment different modes of learning, including online, off-policy, and causal calculus learning, which appear unrelated in the literature. However, these modalities are not exhaustive: we introduce several natural and pervasive classes of learning settings that entail novel dimensions of analysis. Specifically, we introduce and discuss through causal lenses generalized policy learning, where to intervene, imitation learning, and counterfactual learning. These tasks lead to a broader view of counterfactual learning and suggest great potential for studying causal inference and reinforcement learning side by side, which we call causal reinforcement learning (CRL).
