Fugu-MT Paper Translation (Summary): Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?

Paper summary: Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?

arxiv url: http://arxiv.org/abs/2605.16996v1
Date: Sat, 16 May 2026 13:44:06 GMT
Last updated in system: 2026-05-19 17:57:47.411005
Title: Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?
Authors: Prateek Rajput, Yewei Song, Iyiola E. Olatunji, Jacques Klein, Tegawendé F. Bissyandé
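Abstract summary: We induce personality in large language models by fine-tuning them on long-form essays. We then evaluate the stability and fidelity of the induced personality using the IPIP-NEO questionnaire.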
Reference score (attention score computed by this site): 11.462572308067033
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Can large language models reliably express a human-like personality, or are they merely mimicking surface cues without a stable underlying profile? To investigate this, we induce personality in LLMs by fine-tuning them on the long-form essays, where each essay is associated with a target Big Five personality profile. We then evaluate the stability and fidelity of the induced personality using the IPIP-NEO questionnaire. Specifically, we ask: (i) does post-training (SFT, DPO, ORPO) stabilize questionnaire scores under prompt rephrasings, and (ii) can it induce target Big Five profiles from unguided essays? Our results demonstrate that fine-tuning consistently reduces variance in questionnaire responses across five models, directly mitigating the evaluation fragility reported in pre-trained models. However, this newfound stability reveals a more fundamental limitation: accuracy on the full five-dimensional profile remains near chance, even when single-trait scores improve. This indicates that unguided essays lack the cues needed for faithful personality expression. We therefore argue for scenario-grounded datasets or interactive elicitation that accumulates test-aligned evidence over time.
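The two quantities contrasted in the abstract, the spread of questionnaire scores across prompt rephrasings and the accuracy on the full five-dimensional profile versus single traits, can be illustrated with a small sketch. The abstract does not give the paper's exact metrics, so everything here is an assumption made for illustration: traits are binarized to high/low labels, spread is measured as a population standard deviation, and all names and toy numbers are hypothetical.

```python
from statistics import pstdev

# Big Five traits; binarizing each to high (1) / low (0) is an assumption.
TRAITS = ("O", "C", "E", "A", "N")


def score_spread(scores_by_rephrasing):
    """Population std. dev. of one trait score across prompt rephrasings."""
    return pstdev(scores_by_rephrasing)


def trait_accuracy(predicted, target):
    """Per-trait fraction of profiles whose predicted label matches the target."""
    n = len(target)
    return {
        trait: sum(p[i] == g[i] for p, g in zip(predicted, target)) / n
        for i, trait in enumerate(TRAITS)
    }


def profile_accuracy(predicted, target):
    """Fraction of profiles where all five trait labels match at once."""
    return sum(p == g for p, g in zip(predicted, target)) / len(target)


# Hypothetical scores for one trait under four rephrasings of the same item.
base = [3.0, 4.5, 2.0, 4.0]    # before fine-tuning (toy values)
tuned = [3.4, 3.5, 3.3, 3.4]   # after fine-tuning (toy values)

# Hypothetical target and predicted binary profiles in (O, C, E, A, N) order.
target = [(1, 0, 1, 1, 0), (0, 0, 1, 0, 1), (1, 1, 0, 1, 0), (0, 1, 0, 0, 1)]
predicted = [(1, 0, 1, 1, 1), (0, 0, 1, 0, 1), (1, 0, 0, 1, 0), (1, 1, 0, 0, 1)]

# With binary labels, guessing each trait independently matches the whole
# profile with probability 0.5 ** 5 = 1/32.
chance_full_profile = 0.5 ** len(TRAITS)

print("spread before/after:", score_spread(base), score_spread(tuned))
print("per-trait accuracy:", trait_accuracy(predicted, target))
print("full-profile accuracy:", profile_accuracy(predicted, target))
print("full-profile chance level:", chance_full_profile)
```

Under these assumptions, full-profile accuracy can stay low even when each trait looks reasonable on its own: if errors were independent and every trait were 75% accurate, only about 0.75^5, roughly 24%, of profiles would match on all five traits.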

Related papers

Can LLMs Infer Personality from Real World Conversations? [5.705775078773656]
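Large language models (LLMs) offer a promising approach to scalable personality assessment from open-ended language. Three state-of-the-art LLMs were tested using zero-shot prompting for BFI-10 item prediction and both zero-shot and chain-of-thought prompting for Big Five trait inference. All models showed high reliability, but construct validity was limited.
Reference translation (metadata) (2025-07-18T20:22:47Z)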
Rediscovering the Latent Dimensions of Personality with Large Language Models as Trait Descriptors [4.814107439144414]
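We propose a novel approach to uncovering latent personality dimensions in large language models (LLMs). Experiments show that LLMs "discover" core personality traits such as extraversion, agreeableness, conscientiousness, neuroticism, and openness without relying on direct questionnaire inputs. Using the extracted principal components, personality can be assessed along the Big Five dimensions, improving average personality prediction accuracy by up to 5% over fine-tuned models.
Reference translation (metadata) (2024-09-16T00:24:40Z)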
PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
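We propose a novel personality detection method called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner. Experiments show that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
Reference translation (metadata) (2023-10-31T08:23:33Z)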
Editing Personality for Large Language Models [73.59001811199823]
This paper introduces an innovative task focused on editing the personality traits of Large Language Models (LLMs). We construct PersonalityEdit, a new benchmark dataset to address this task.
Reference translation (metadata) (2023-10-03T16:02:36Z)
Revisiting the Reliability of Psychological Scales on Large Language Models [62.57981196992073]
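This study aims to determine the reliability of applying personality assessments to large language models. Analysis of 2,500 settings per model, including GPT-3.5, GPT-4, Gemini-Pro, and LLaMA-3.1, reveals that various LLMs show consistency in their responses to the Big Five Inventory.
Reference translation (metadata) (2023-05-31T15:03:28Z)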
PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits [30.770525830385637]
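This study investigates the behavior of large language models (LLMs) based on the Big Five personality model. Results show that the self-reported BFI scores of LLM personas are consistent with their designated personality types. Human evaluation shows that humans can perceive some personality traits with an accuracy of up to 80%.
Reference translation (metadata) (2023-05-04T04:58:00Z)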

The related papers list is generated automatically from the titles and abstracts of papers on this site.
