Fugu-MT Paper Translation (Abstract): LLM-in-Sandbox Elicits General Agentic Intelligence

Paper overview: LLM-in-Sandbox Elicits General Agentic Intelligence

arxiv url: http://arxiv.org/abs/2601.16206v1
Date: Thu, 22 Jan 2026 18:57:09 GMT
Status: Translation complete
Last updated in system: 2026-01-23 21:37:20.699929
Title: LLM-in-Sandbox Elicits General Agentic Intelligence
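Title (reference translation): LLM-in-Sandbox Elicits General Agentic Intelligence
Authors: Daixuan Cheng, Shaohan Huang, Yuxian Gu, Huatong Song, Guoxin Chen, Li Dong, Wayne Xin Zhao, Ji-Rong Wen, Furu Wei
Abstract summary: We introduce LLM-in-Sandbox, which lets LLMs explore inside a code sandbox (a virtual computer) to elicit general intelligence in non-code domains. We demonstrate generalization capabilities for leveraging the code sandbox on non-code tasks. Experiments show that LLM-in-Sandbox, in both training-free and post-trained settings, generalizes robustly across mathematics, physics, chemistry, biomedicine, long-context understanding, and instruction following.
Reference score (independently computed attention score): 142.7174116109795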
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce LLM-in-Sandbox, enabling LLMs to explore within a code sandbox (i.e., a virtual computer), to elicit general intelligence in non-code domains. We first demonstrate that strong LLMs, without additional training, exhibit generalization capabilities to leverage the code sandbox for non-code tasks. For example, LLMs spontaneously access external resources to acquire new knowledge, leverage the file system to handle long contexts, and execute scripts to satisfy formatting requirements. We further show that these agentic capabilities can be enhanced through LLM-in-Sandbox Reinforcement Learning (LLM-in-Sandbox-RL), which uses only non-agentic data to train models for sandbox exploration. Experiments demonstrate that LLM-in-Sandbox, in both training-free and post-trained settings, achieves robust generalization spanning mathematics, physics, chemistry, biomedicine, long-context understanding, and instruction following. Finally, we analyze LLM-in-Sandbox's efficiency from computational and system perspectives, and open-source it as a Python package to facilitate real-world deployment.
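Abstract (reference translation): We introduce LLM-in-Sandbox, which lets LLMs explore inside a code sandbox (a virtual computer) to elicit general intelligence in non-code domains. We first show that strong LLMs, without additional training, generalize to using the code sandbox for non-code tasks. For example, LLMs spontaneously access external resources to acquire new knowledge, use the file system to handle long contexts, and run scripts to meet formatting requirements. We further show that these agentic capabilities can be enhanced by LLM-in-Sandbox Reinforcement Learning (LLM-in-Sandbox-RL), which uses only non-agentic data to train models for sandbox exploration. Experiments show that LLM-in-Sandbox, in both training-free and post-trained settings, generalizes robustly across mathematics, physics, chemistry, biomedicine, long-context understanding, and instruction following. Finally, we analyze the efficiency of LLM-in-Sandbox from computational and system perspectives and open-source it as a Python package to ease real-world deployment.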
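
The abstract mentions that models "leverage the file system to handle long contexts" and "execute scripts". The following is a minimal sketch of that idea only, not the authors' package or method: every name here (`run_in_sandbox`, `stub_policy`, `solve`, `context.txt`, `step.py`) is hypothetical, the "model" is a hard-coded stub, the document content is toy data, and a temporary directory is used as a stand-in for a sandbox even though it is not a real isolation boundary.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_sandbox(workdir: Path, script: str, timeout: float = 10.0) -> str:
    """Write `script` into the working directory, run it there, return its stdout."""
    script_path = workdir / "step.py"
    script_path.write_text(script, encoding="utf-8")
    result = subprocess.run(
        [sys.executable, script_path.name],
        cwd=workdir,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout.strip()


def stub_policy(task: str) -> str:
    """Hypothetical stand-in for an LLM: it answers with a script, not with text.

    A real model would generate this script from `task`; here it is fixed.
    """
    return (
        "from pathlib import Path\n"
        "for line in Path('context.txt').read_text().splitlines():\n"
        "    if 'secret token' in line:\n"
        "        print(line)\n"
    )


def solve(task: str, long_context: str) -> str:
    """Store the long context on disk and let the policy search it with a script."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        # The long context goes to a file instead of into the prompt.
        (workdir / "context.txt").write_text(long_context, encoding="utf-8")
        script = stub_policy(task)
        return run_in_sandbox(workdir, script)


# Toy document: one relevant line buried between many filler lines.
filler = "\n".join(f"filler line {i}" for i in range(10_000))
document = filler + "\nsecret token: sandbox-42\n" + filler
answer = solve("Find the secret token in the document.", document)
print(answer)
```

The point of the sketch is the division of labor: the policy only sees the task and decides what to run, while the bulk of the text stays in the file system and is processed by the executed script.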

Related papers