Fugu-MT Paper Translation (Abstract): Can Large Language Models Implement Agent-Based Models? An ODD-based Replication Study

Paper overview: Can Large Language Models Implement Agent-Based Models? An ODD-based Replication Study

arxiv url: http://arxiv.org/abs/2602.10140v1
Date: Sun, 08 Feb 2026 19:56:20 GMT
Status: Translation complete
Last updated in system: 2026-02-12 21:44:01.168228
Title: Can Large Language Models Implement Agent-Based Models? An ODD-based Replication Study
Authors: Nuno Fachada, Daniel Fernandes, Carlos M. Fernandes, João P. Matos-Carvalho,
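Abstract summary: Large language models (LLMs) can now synthesize non-trivial executable code from textual descriptions. Can LLMs reliably implement agent-based models from standardized specifications in a way that supports replication, verification, and validation? The study evaluates 17 contemporary LLMs on a controlled ODD-to-code translation task.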
Reference score (independently computed attention measure): 0.6821122205224714
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large language models (LLMs) can now synthesize non-trivial executable code from textual descriptions, raising an important question: can LLMs reliably implement agent-based models from standardized specifications in a way that supports replication, verification, and validation? We address this question by evaluating 17 contemporary LLMs on a controlled ODD-to-code translation task, using the PPHPC predator-prey model as a fully specified reference. Generated Python implementations are assessed through staged executability checks, model-independent statistical comparison against a validated NetLogo baseline, and quantitative measures of runtime efficiency and maintainability. Results show that behaviorally faithful implementations are achievable but not guaranteed, and that executability alone is insufficient for scientific use. GPT-4.1 consistently produces statistically valid and efficient implementations, with Claude 3.7 Sonnet performing well but less reliably. Overall, the findings clarify both the promise and current limitations of LLMs as model engineering tools, with implications for reproducible agent-based and environmental modelling.

