Fugu-MT Paper Translation (Abstract): Reinforcement Learning for Scalable and Trustworthy Intelligent Systems

Paper overview: Reinforcement Learning for Scalable and Trustworthy Intelligent Systems

arxiv url: http://arxiv.org/abs/2605.08378v1
Date: Fri, 08 May 2026 18:36:25 GMT
Status: Translation complete
Last updated in system: 2026-05-12 23:28:49.598494
Title: Reinforcement Learning for Scalable and Trustworthy Intelligent Systems
Authors: Guangchen Lan
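Abstract summary: Reinforcement learning has become a powerful paradigm for improving the capability of intelligent systems. This dissertation argues that the next generation of intelligent systems will require both efficient optimization and trustworthy behavior.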
Reference score (independently computed attention score): 2.1172256884504588
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reinforcement learning has become a powerful paradigm for improving the capability of intelligent systems, but its practical deployment faces two central challenges. First, reinforcement learning must scale efficiently in distributed environments where communication bandwidth is limited and computation is heterogeneous across agents. Second, as reinforcement learning is increasingly used in post-training large language models and autonomous agents, the optimized policies must also be aligned with human preferences and satisfy safety requirements such as privacy-aware information disclosure. This dissertation addresses both challenges through four complementary contributions spanning federated optimization, preference alignment, and contextual safety. The first part of the dissertation studies scalable reinforcement learning in federated settings. The second part of the dissertation studies trustworthy reinforcement learning for large language models. Together, these contributions advance reinforcement learning along two complementary dimensions. On the one hand, they make reinforcement learning more scalable through communication-efficient and asynchronous federated optimization. On the other hand, they make reinforcement learning more trustworthy by improving alignment with human preferences and by reducing contextually inappropriate information disclosure in language-based intelligent systems. As a whole, this dissertation argues that the next generation of intelligent systems will require both efficient optimization and trustworthy behavior, and that reinforcement learning provides a unifying framework for addressing both goals.

