Fugu-MT Paper Translation (Abstract): Boosted Distributional Reinforcement Learning: Analysis and Healthcare Applications

Paper summary: Boosted Distributional Reinforcement Learning: Analysis and Healthcare Applications

arxiv url: http://arxiv.org/abs/2604.04334v2
Date: Fri, 10 Apr 2026 04:01:22 GMT
Status: Translation complete
Last updated in system: 2026-04-13 13:51:27.570563
Title: Boosted Distributional Reinforcement Learning: Analysis and Healthcare Applications
Title (reference translation): Boosting Distributional Reinforcement Learning: Analysis and Healthcare Applications
Authors: Zequn Chen, Wesley J. Marrero
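Abstract summary: This paper proposes a distributional reinforcement learning algorithm for optimizing decisions in complex domains such as robotics and healthcare. The algorithm is applied to hypertension management in a large subset of the US adult population by categorizing individuals into cardiovascular disease risk groups.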
Reference score (independently computed attention score): 0.8348593305367524
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Researchers and practitioners are increasingly considering reinforcement learning to optimize decisions in complex domains like robotics and healthcare. To date, these efforts have largely utilized expectation-based learning. However, relying on expectation-focused objectives may be insufficient for making consistent decisions in highly uncertain situations involving multiple heterogeneous groups. While distributional reinforcement learning algorithms have been introduced to model the full distributions of outcomes, they can yield large discrepancies in realized benefits among comparable agents. This challenge is particularly acute in healthcare settings, where physicians (controllers) must manage multiple patients (subordinate agents) with uncertain disease progression and heterogeneous treatment responses. We propose a Boosted Distributional Reinforcement Learning (BDRL) algorithm that optimizes agent-specific outcome distributions while enforcing comparability among similar agents and analyze its convergence. To further stabilize learning, we incorporate a post-update projection step formulated as a constrained convex optimization problem, which efficiently aligns individual outcomes with a high-performing reference within a specified tolerance. We apply our algorithm to manage hypertension in a large subset of the US adult population by categorizing individuals into cardiovascular disease risk groups. Our approach modifies treatment plans for median and vulnerable patients by mimicking the behavior of high-performing references in each risk group. Furthermore, we find that BDRL improves the number and consistency of quality-adjusted life years compared with reinforcement learning baselines.
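Abstract (reference translation): Researchers and practitioners are considering reinforcement learning to optimize decisions in complex domains such as robotics and healthcare. To date, these efforts have largely relied on expectation-based learning. However, relying on expectation-focused objectives may be insufficient for making consistent decisions in highly uncertain situations involving multiple heterogeneous groups. Distributional reinforcement learning algorithms have been introduced to model the full distribution of outcomes, but they can produce large discrepancies in realized benefits among comparable agents. This challenge is especially acute in healthcare settings, where physicians (controllers) must manage multiple patients (subordinate agents) with uncertain disease progression and heterogeneous treatment responses. We propose a Boosted Distributional Reinforcement Learning (BDRL) algorithm that optimizes agent-specific outcome distributions while enforcing comparability among similar agents, and we analyze its convergence. To further stabilize learning, we incorporate a post-update projection step formulated as a constrained convex optimization problem. We apply the algorithm to hypertension management in a large subset of the US adult population by categorizing individuals into cardiovascular disease risk groups. Our approach modifies treatment plans for median and vulnerable patients by mimicking the behavior of the high-performing reference in each risk group. Furthermore, we find that BDRL improves the number and consistency of quality-adjusted life years compared with reinforcement learning baselines.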


Related papers