Fugu-MT Paper Translation (Abstract): Prompt Engineering for Scale Development in Generative Psychometrics

Paper summary: Prompt Engineering for Scale Development in Generative Psychometrics

arxiv url: http://arxiv.org/abs/2603.15909v1
Date: Mon, 16 Mar 2026 20:55:17 GMT
Status: Translation complete
Last updated in system: 2026-03-18 17:42:06.98532
Title: Prompt Engineering for Scale Development in Generative Psychometrics
Authors: Lara Lee Russell-Lasalandra, Hudson Golino
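Abstract summary: This Monte Carlo simulation examines how prompt engineering strategies shape the quality of personality assessment items generated by large language models (LLMs). Item pools targeting the Big Five traits were generated using multiple prompting designs. Prompt design had a substantial influence on both pre- and post-reduction item quality.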
Reference score (attention score computed by this site): 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: This Monte Carlo simulation examines how prompt engineering strategies shape the quality of large language model (LLM)-generated personality assessment items within the AI-GENIE framework for generative psychometrics. Item pools targeting the Big Five traits were generated using multiple prompting designs (zero-shot, few-shot, persona-based, and adaptive), model temperatures, and LLMs, then evaluated and reduced using network psychometric methods. Across all conditions, AI-GENIE reliably improved structural validity following reduction, with the magnitude of its incremental contribution inversely related to the quality of the incoming item pool. Prompt design exerted a substantial influence on both pre- and post-reduction item quality. Adaptive prompting consistently outperformed non-adaptive strategies by sharply reducing semantic redundancy, elevating pre-reduction structural validity, and preserving a substantially larger item pool, particularly when paired with newer, higher-capacity models. These gains were robust across temperature settings for most models, indicating that adaptive prompting mitigates common trade-offs between creativity and psychometric coherence. An exception was observed for the GPT-4o model at high temperatures, suggesting model-specific sensitivity to adaptive constraints at elevated stochasticity. Overall, the findings demonstrate that adaptive prompting is the strongest approach in this context, and that its benefits scale with model capability, motivating continued investigation of model-prompt interactions in generative psychometric pipelines.
