Fugu-MT Paper Translation (Abstract): LLMs Contain Multitudes: How Deployment Context Reshapes Model-Level Preferences and Values

Paper abstract: LLMs Contain Multitudes: How Deployment Context Reshapes Model-Level Preferences and Values

arxiv url: http://arxiv.org/abs/2606.13944v1
Date: Thu, 11 Jun 2026 22:09:16 GMT
Status: Translation complete
Last updated in system: 2026-06-15 16:00:42.665726
Title: LLMs Contain Multitudes: How Deployment Context Reshapes Model-Level Preferences and Values
Authors: Filip Trhlik, Aoife O'Flynn, Angela Yu, Arduin Findeis, Paula Buttery
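Abstract summary: Large language models (LLMs) are increasingly characterised in recent evaluation work as having stable, model-level preference and value systems. We test this directly across two established pairwise paradigms: ranking country preferences and eliciting utility judgements.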
Reference score (independently computed attention score): 3.52299670434098
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) are increasingly characterised in recent evaluation work as having stable, model-level preference and value systems. However, accompanying robustness checks are limited to incidental prompt perturbations such as syntax variation and option reordering. This leaves open whether the measured properties survive when the surrounding task context changes, as it does in most real deployments. We test this directly across two established pairwise paradigms: ranking country preferences and eliciting utility judgements. In both, we make the deployment context -- the high-level task the model is performing while making concrete value-dependent choices -- our controlled variable, varied across framings such as writing a Reddit post or a news article. Across five LLMs and over 1.2M pairwise decisions, deployment context produces variation far larger than prompt paraphrasing and temperature controls. In country preference rankings over 15 countries, context induces widespread, statistically significant rank shifts; the aggregate Global North favouritism reported in prior work is itself context-dependent, with each model's bias shifting systematically across contexts. In utility elicitation over 50 outcomes, broad cross-category ordering is preserved, but fine-grained rankings within domains vary substantially, and cardinal exchange rates between outcomes (e.g. how many lives in one region equal one in another) shift by a factor of 2.47 at the median. Reported model-level preferences and utilities are therefore better understood as context-conditioned measurements than fixed model-level properties: safety guarantees obtained under one framing provide limited assurance in another.
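The abstract's comparison of rankings and exchange rates across deployment contexts can be illustrated with a minimal sketch. This is not the authors' method: it assumes a simple win-rate ranking over pairwise decisions, and the item labels (`A`, `B`, `C`), the toy decisions, the example rates, and the helper names are all hypothetical. Only the two context names follow the framings mentioned in the abstract.

```python
from collections import defaultdict


def win_rate_ranking(decisions):
    """Rank items by the fraction of pairwise comparisons they win.

    decisions: iterable of (winner, loser) tuples from one context.
    Returns items ordered best first; ties are broken alphabetically.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for winner, loser in decisions:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    return sorted(games, key=lambda item: (-wins[item] / games[item], item))


def rank_shifts(ranking_a, ranking_b):
    """Position change of each item when moving from ranking_a to ranking_b.

    A positive value means the item is ranked lower (worse) in ranking_b.
    """
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    return {item: pos_b[item] - i for i, item in enumerate(ranking_a)}


def exchange_rate_shift(rate_a, rate_b):
    """Fold factor (>= 1) between two positive exchange rates."""
    return max(rate_a, rate_b) / min(rate_a, rate_b)


# Hypothetical toy decisions for two deployment contexts.
reddit_post = [("A", "B"), ("A", "C"), ("B", "C")]
news_article = [("B", "A"), ("C", "A"), ("B", "C")]

reddit_rank = win_rate_ranking(reddit_post)    # ['A', 'B', 'C']
news_rank = win_rate_ranking(news_article)     # ['B', 'C', 'A']
shifts = rank_shifts(reddit_rank, news_rank)   # {'A': 2, 'B': -1, 'C': -1}
fold = exchange_rate_shift(2.0, 5.0)           # 2.5
```

A fold factor is used here so that the shift is symmetric in the two contexts. Under that reading, the abstract's reported median factor of 2.47 would mean that a typical exchange rate more than doubles or halves between contexts.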
