Fugu-MT Paper Translation (Abstract): Benchmarking LLMs for Community Governance Simulation with Life-history Narratives

Paper overview: Benchmarking LLMs for Community Governance Simulation with Life-history Narratives

arxiv url: http://arxiv.org/abs/2605.23783v1
Date: Fri, 22 May 2026 15:48:49 GMT
Status: Translation complete
System update date: 2026-05-25 17:29:20.419951
Title: Benchmarking LLMs for Community Governance Simulation with Life-history Narratives
Title (reference translation): Benchmarking LLMs for Community Governance Simulation Using Life-History Narratives
Authors: Xu Chen, Yuanzi Li, Lei Wang, Nan Lu, Yang Wang, Anding Wang, Lei Shi, Xiaoxing Fu, Ji-Rong Wen
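Abstract summary: Large language models (LLMs) offer a scalable, low-cost way to simulate human attitudes and behaviors. This paper proposes an integrated research framework covering dataset, benchmark, algorithm, and system. The system integrates curriculum-LoRA into a closed-loop policy-evaluation pipeline.
Reference score (independently computed attention score): 46.86050402684712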
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Effective community governance hinges on understanding what specific residents think and need. Recent work has used large language models (LLMs) to simulate human respondents, offering a scalable, reproducible way to study human attitudes and behaviors at low cost. However, these studies typically prompt the model with just a few demographic variables (age, gender, income), simulating only general role types. This is insufficient for community governance, where decisions depend on the views of specific residents. We bridge this gap with an integrated research framework covering dataset, benchmark, algorithm, and system. The dataset comprises approximately 1.2 million characters of first-person narrative collected through two-hour semi-structured interviews with each of 92 residents in an urban community, organized around nine community-governance domains. The benchmark probes 18 mainstream LLMs across four prompting strategies and shows that adding rich life-history profiles meaningfully raises fidelity above the no-profile baseline, but this gain comes with more input tokens per call from the longer prompts they require. The algorithm, curriculum-LoRA, is a parameter-efficient personalization framework that, by closing this fidelity-cost gap, matches the strongest baseline's fidelity at roughly 10x lower per-call cost and Pareto-dominates every configuration tested. The system integrates curriculum-LoRA into a closed-loop policy-evaluation pipeline. Together, these results bring individual-level LLM-based resident simulation within reach of resource-constrained local administrations, enabling community-governance decisions to be systematically pre-evaluated in silico before real-world deployment.
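Abstract (reference translation): Effective community governance depends on understanding what specific residents think and need. Recent work has used large language models (LLMs) to simulate human respondents, providing a scalable and reproducible way to study human attitudes and behaviors at low cost. However, these studies typically prompt the model with only a few demographic variables (age, gender, income) and so simulate only general role types. This is insufficient for community governance, where decisions depend on the views of specific residents. The paper fills this gap with an integrated research framework covering dataset, benchmark, algorithm, and system. The dataset consists of approximately 1.2 million characters of first-person narrative collected through two-hour semi-structured interviews with each of 92 residents of an urban community, organized around nine community-governance domains. The benchmark examines 18 mainstream LLMs across four prompting strategies and shows that adding rich life-history profiles raises fidelity above the no-profile baseline, but that this gain requires longer prompts and therefore more input tokens per call. The algorithm, curriculum-LoRA, is a parameter-efficient personalization framework that closes this fidelity-cost gap: it matches the strongest baseline's fidelity at roughly 10x lower per-call cost and Pareto-dominates every configuration tested. The system integrates curriculum-LoRA into a closed-loop policy-evaluation pipeline. Together, these results bring individual-level LLM-based resident simulation within reach of resource-constrained local administrations, making it possible to systematically pre-evaluate community-governance decisions in silico before real-world deployment.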
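
The abstract's claim that one configuration "Pareto-dominates" the others can be made concrete with a small check over two axes, fidelity (higher is better) and per-call cost (lower is better). The sketch below is only an illustration of that definition: the `Config` class, the configuration names, and all numbers are hypothetical placeholders, not values or methods taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    fidelity: float  # higher is better
    cost: float      # per-call cost, lower is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.fidelity >= b.fidelity and a.cost <= b.cost
    strictly_better = a.fidelity > b.fidelity or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(configs: list[Config]) -> list[Config]:
    """Return the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Placeholder numbers for illustration only; NOT results from the paper.
configs = [
    Config("no_profile", fidelity=0.50, cost=1.0),
    Config("full_profile", fidelity=0.70, cost=10.0),
    Config("adapter", fidelity=0.70, cost=1.0),
]
front = pareto_front(configs)
print([c.name for c in front])
```

With these placeholder values, the hypothetical `adapter` entry matches the fidelity of `full_profile` at a lower cost and beats `no_profile` on fidelity at equal cost, so it is the only configuration left on the front.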
