Fugu-MT Paper Translation (Abstract): Evidence-based Distributional Alignment for Large Language Models

Paper overview: Evidence-based Distributional Alignment for Large Language Models

arxiv url: http://arxiv.org/abs/2603.13305v1
Date: Tue, 03 Mar 2026 03:34:06 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:42.287363
Title: Evidence-based Distributional Alignment for Large Language Models
Title (reference translation): Evidence-Based Distributional Alignment for Large Language Models
Authors: Viet-Thanh Pham, Lizhen Qu, Zhuang Li, Gholamreza Haffari
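Abstract summary: We propose Evi-DA, an evidence-based alignment technique that improves the fidelity and robustness of LLM-based distribution estimation. Given a target country, Evi-DA retrieves related World Values Survey items and their answer distributions, predicts a coarse Welzel value signature for each option, and infers the country-conditioned answer distribution in a structured format.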
Reference score (independently computed attention score): 58.65469623911573
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Distributional alignment enables large language models (LLMs) to predict how a target population distributes its responses across answer options, rather than collapsing disagreement into a single consensus answer. However, existing LLM-based distribution prediction is often unstable and degrades under cultural and domain shift. Token score-based estimates can change with minor option wording or formatting, response sampling-based estimates are expensive and sensitive to prompts and decoding settings, and directly generated distributions are frequently miscalibrated. We propose Evi-DA, an evidence-based alignment technique that improves the fidelity and robustness of LLM-based distribution estimation under domain and cultural shift. Given a target country and a multiple-choice question, Evi-DA retrieves related World Values Survey items and their answer distributions, predicts a coarse Welzel value signature for each option, and infers the country-conditioned answer distribution in a structured format. We train the LLMs using a two-stage pipeline, where reinforcement learning optimizes survey-derived rewards that encourage accurate intermediate value predictions, faithful final distributions, well-formed structured outputs, and reduced cultural bias. Across in-domain and out-of-domain benchmarks and multiple open-source backbones, Evi-DA reduces Jensen-Shannon divergence between predicted and gold distributions relative to strong baselines, with average relative improvements of up to 44%.
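Abstract (reference translation): Distributional alignment lets large language models (LLMs) predict how a target population spreads its responses across answer options, instead of collapsing disagreement into a single consensus answer. However, existing LLM-based distribution prediction is often unstable and degrades under cultural and domain shift. Token score-based estimates can change with minor changes in option wording or formatting, response sampling-based estimates are expensive and sensitive to prompts and decoding settings, and directly generated distributions are frequently miscalibrated. We propose Evi-DA, an evidence-based alignment technique that improves the fidelity and robustness of LLM-based distribution estimation under domain and cultural shift. Given a target country and a multiple-choice question, Evi-DA retrieves related World Values Survey items and their answer distributions, predicts a coarse Welzel value signature for each option, and infers the country-conditioned answer distribution in a structured format. The LLMs are trained with a two-stage pipeline in which reinforcement learning optimizes survey-derived rewards that encourage accurate intermediate value predictions, faithful final distributions, well-formed structured outputs, and reduced cultural bias. Across in-domain and out-of-domain benchmarks and multiple open-source backbones, Evi-DA reduces the Jensen-Shannon divergence between predicted and gold distributions relative to strong baselines, with average relative improvements of up to 44%.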
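
The abstract reports results as Jensen-Shannon divergence between predicted and gold answer distributions, together with a relative improvement over baselines. The following is a minimal Python sketch of how such a metric could be computed. It is not the authors' evaluation code: the function names, the base-2 logarithm, the normalization step, and the example numbers are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def normalize(weights: Sequence[float]) -> List[float]:
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    return [w / total for w in weights]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits (assumed base 2, so range is [0, 1])."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same answer options")
    p, q = normalize(p), normalize(q)
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def relative_improvement(baseline: float, method: float) -> float:
    """Fractional reduction of a lower-is-better score relative to a baseline."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (baseline - method) / baseline


if __name__ == "__main__":
    # Hypothetical four-option survey item; these numbers are made up.
    gold = [0.10, 0.25, 0.40, 0.25]
    predicted = [0.15, 0.20, 0.45, 0.20]
    print(f"JSD(predicted, gold) = {js_divergence(predicted, gold):.4f} bits")
```

A symmetric, bounded measure such as this one is convenient for comparing answer distributions because it stays finite even when one distribution assigns zero probability to an option that the other does not.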

Related papers