Fugu-MT Paper Translation (Summary): Correcting Selection Bias in Sparse User Feedback for Large Language Model Quality Estimation: A Multi-Agent Hierarchical Bayesian Approach

Paper summary: Correcting Selection Bias in Sparse User Feedback for Large Language Model Quality Estimation: A Multi-Agent Hierarchical Bayesian Approach

arxiv url: http://arxiv.org/abs/2605.12177v1
Date: Tue, 12 May 2026 14:22:06 GMT
Status: Translation complete
System updated on: 2026-05-13 21:48:56.917602
Title: Correcting Selection Bias in Sparse User Feedback for Large Language Model Quality Estimation: A Multi-Agent Hierarchical Bayesian Approach
Title (reference translation): Correcting Selection Bias in Sparse User Feedback for Large Language Model Quality Estimation: A Multi-Hierarchical Bayesian Approach
Authors: Andrea Morandi, Mahesh Viswanathan
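Abstract summary: This work proposes a three-agent hierarchical Bayesian pipeline that does not require ground-truth labels on individual interactions. A mild prior on the feedback channel (typical positive-feedback rate and negative-to-positive ratio) keeps Hierarchical-Informed within 4-13 pp of $Q^\star$ as the bias ratio is swept. Without channel-side priors, every weak-prior variant misses $Q^\star$ by 22-33 pp.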
Reference score (independently computed attention score): 0.9558392439655014
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: [Abridged] Production LLM deployments receive feedback from a non-random fraction of users: thumbs sit mostly in the tails of the satisfaction distribution, and a naive average over them can land 40-50 percentage points away from true system quality. We treat this as a topic- and sentiment-stratified selection-bias problem and propose a three-agent hierarchical Bayesian pipeline that does not require ground-truth labels on individual interactions. A Topic Clustering Agent partitions the stream via UMAP + HDBSCAN over text embeddings; a Bias Modeling Agent fits a two-stage hierarchical Beta-Binomial under NUTS, inferring per-topic selection rates $s_c$ and quality $q_c$ with partial pooling; a Synthesis Agent reweights $q_c$ by true topic prevalence $\hat\pi_c = n_c/N$ to report a bias-corrected aggregate posterior $\bar Q = \sum_c \hat\pi_c q_c$ with credible interval, plus drift signals for online recalibration. Validation uses UltraFeedback (N=10,232 retained interactions, $C=18$ clusters, $Q^\star=0.6249$) with simulated topic- and sentiment-dependent selection biases. We compare five Bayesian variants against Naive and IPW baselines. A mild prior on the feedback channel (typical positive-feedback rate and negative-to-positive ratio, both readable from any production dashboard without labels) keeps Hierarchical-Informed within 4-13 pp of $Q^\star$ as the bias ratio sweeps from 1:1 to 30:1, with 95% credible intervals covering $Q^\star$ in 50/50 random-seed replicates at $\kappa_{\max}=10$. Without channel-side priors, every weak-prior variant misses $Q^\star$ by 22-33 pp: the per-cluster sufficient statistics admit a one-parameter family of equally good fits, and the prior on the bias channel (not on latent quality) is what breaks the degeneracy.
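Abstract (reference translation): [Abridged] Production LLM deployments receive feedback from a non-random minority of users: thumbs ratings sit mostly in the tails of the satisfaction distribution, so a naive average over them can land 40-50 percentage points from true system quality. We treat this as a topic- and sentiment-stratified selection-bias problem and propose a three-agent hierarchical Bayesian pipeline that does not require ground-truth labels on individual interactions. A Topic Clustering Agent partitions the stream using UMAP + HDBSCAN over text embeddings; a Bias Modeling Agent fits a two-stage hierarchical Beta-Binomial under NUTS, inferring per-topic selection rates $s_c$ and quality $q_c$ with partial pooling; a Synthesis Agent reweights $q_c$ by true topic prevalence $\hat\pi_c = n_c/N$ and reports a bias-corrected aggregate $\bar Q = \sum_c \hat\pi_c q_c$ together with drift signals for online recalibration. Validation uses UltraFeedback (N=10,232 retained interactions, $C=18$ clusters, $Q^\star=0.6249$) with simulated topic- and sentiment-dependent selection biases. Five Bayesian variants are compared against Naive and IPW baselines. With a mild prior on the feedback channel (typical positive-feedback rate and negative-to-positive ratio, both readable from a production dashboard without labels), Hierarchical-Informed stays within 4-13 pp of $Q^\star$ as the bias ratio sweeps from 1:1 to 30:1, and the 95% credible intervals cover $Q^\star$ in 50/50 random-seed replicates at $\kappa_{\max}=10$. Without channel-side priors, every weak-prior variant misses $Q^\star$ by 22-33 pp: the per-cluster sufficient statistics admit a one-parameter family of equally good fits, and it is the prior on the bias channel (not on latent quality) that breaks the degeneracy.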
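The selection-bias mechanism and the reweighting $\bar Q = \sum_c \hat\pi_c q_c$ described in the abstract can be illustrated with a small noise-free sketch. This is not the authors' Beta-Binomial/NUTS model: it is a point-estimate inversion under an assumed selection model in which a satisfied user leaves a thumbs-up with probability `s_pos` and a dissatisfied user leaves a thumbs-down with probability `kappa * s_pos`. The topic sizes, quality values, and all function names are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: toy numbers, not data or code from the paper.
# Assumed selection model (an assumption, not the paper's specification):
#   P(thumbs-up | satisfied)      = s_pos
#   P(thumbs-down | dissatisfied) = kappa * s_pos   (kappa = bias ratio)

def expected_thumbs(n, q, s_pos, kappa):
    """Expected (up, down) counts for a topic with n interactions and quality q."""
    up = n * q * s_pos
    down = n * (1.0 - q) * kappa * s_pos
    return up, down

def naive_quality(topics):
    """Pooled share of thumbs-up among all thumbs, ignoring selection."""
    up = sum(t["up"] for t in topics)
    down = sum(t["down"] for t in topics)
    return up / (up + down)

def topic_quality(up, down, kappa):
    """Invert the assumed selection model for q_c at a given kappa."""
    return kappa * up / (kappa * up + down)

def aggregate_quality(topics, kappa):
    """Q_bar = sum_c pi_c * q_c, with prevalence pi_c = n_c / N."""
    total = sum(t["n"] for t in topics)
    return sum(
        (t["n"] / total) * topic_quality(t["up"], t["down"], kappa)
        for t in topics
    )

# Hypothetical topics as (n_c, true q_c) pairs.
TRUE_TOPICS = [(5000, 0.70), (3000, 0.60), (2000, 0.40)]
KAPPA, S_POS = 10.0, 0.01

topics = []
for n, q in TRUE_TOPICS:
    up, down = expected_thumbs(n, q, S_POS, KAPPA)
    topics.append({"n": n, "up": up, "down": down})

true_q = sum(n * q for n, q in TRUE_TOPICS) / sum(n for n, _ in TRUE_TOPICS)
naive = naive_quality(topics)                # ignores selection entirely
corrected = aggregate_quality(topics, KAPPA)  # uses the true bias ratio
wrong_kappa = aggregate_quality(topics, 1.0)  # same counts, wrong bias ratio

print(f"true={true_q:.3f} naive={naive:.3f} "
      f"corrected={corrected:.3f} kappa=1: {wrong_kappa:.3f}")
```

In this toy setup the naive average lands near 0.14 against a true quality of 0.61, while the inversion recovers the true value only when `kappa` is correct. The same thumbs counts are fit exactly at any `kappa`, each implying a different `q_c`, which is one way to read the abstract's remark that information about the bias channel is what resolves the degeneracy.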

Related papers