Fugu-MT Paper Translation (Summary): FairQE: Multi-Agent Framework for Mitigating Gender Bias in Translation Quality Estimation

Paper summary: FairQE: Multi-Agent Framework for Mitigating Gender Bias in Translation Quality Estimation

arxiv url: http://arxiv.org/abs/2604.21420v1
Date: Thu, 23 Apr 2026 08:35:46 GMT
Status: Translation complete
Last updated in system: 2026-04-24 14:40:06.38729
Title: FairQE: Multi-Agent Framework for Mitigating Gender Bias in Translation Quality Estimation
Authors: Jinhee Jang, Juhwan Choi, Dongjin Lee, Seunguk Yu, Youngbin Kim
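Abstract summary: Quality Estimation (QE) aims to assess machine translation quality without reference translations. Recent studies have shown that existing QE models exhibit systematic gender bias. We propose FairQE, a fairness-aware QE framework that mitigates gender bias in both gender-ambiguous and gender-explicit scenarios.
Reference score (independently calculated attention score): 30.26650508806054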
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Quality Estimation (QE) aims to assess machine translation quality without reference translations, but recent studies have shown that existing QE models exhibit systematic gender bias. In particular, they tend to favor masculine realizations in gender-ambiguous contexts and may assign higher scores to gender-misaligned translations even when gender is explicitly specified. To address these issues, we propose FairQE, a multi-agent-based, fairness-aware QE framework that mitigates gender bias in both gender-ambiguous and gender-explicit scenarios. FairQE detects gender cues, generates gender-flipped translation variants, and combines conventional QE scores with LLM-based bias-mitigating reasoning through a dynamic bias-aware aggregation mechanism. This design preserves the strengths of existing QE models while calibrating their gender-related biases in a plug-and-play manner. Extensive experiments across multiple gender bias evaluation settings demonstrate that FairQE consistently improves gender fairness over strong QE baselines. Moreover, under MQM-based meta-evaluation following the WMT 2023 Metrics Shared Task, FairQE achieves competitive or improved general QE performance. These results show that gender bias in QE can be effectively mitigated without sacrificing evaluation accuracy, enabling fairer and more reliable translation evaluation.
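The abstract describes combining conventional QE scores with LLM-based reasoning through a dynamic bias-aware aggregation mechanism, but does not give the formula. The following is only a minimal sketch of one way such an aggregation could look, not the authors' method. Everything in it is an assumption made for illustration: the `ScoredTranslation` container, the `bias_gap` and `aggregate` functions, the `sensitivity` parameter, the [0, 1] score range, and the idea of using the QE score gap between a translation and its gender-flipped variant as the bias signal.

```python
from dataclasses import dataclass


@dataclass
class ScoredTranslation:
    """Hypothetical container; field names are illustrative, not from the paper."""
    text: str
    qe_score: float   # conventional QE model score, assumed to lie in [0, 1]
    llm_score: float  # LLM-based judgment score, assumed to lie in [0, 1]


def bias_gap(original: ScoredTranslation, flipped: ScoredTranslation) -> float:
    """Absolute QE score difference between a translation and its gender-flipped variant.

    Assumption: in a gender-ambiguous context both variants are equally valid,
    so a large gap hints that the QE model is reacting to gender alone.
    """
    return abs(original.qe_score - flipped.qe_score)


def aggregate(
    original: ScoredTranslation,
    flipped: ScoredTranslation,
    sensitivity: float = 5.0,  # hypothetical knob: how fast trust shifts to the LLM score
) -> float:
    """Blend the QE score and the LLM score, weighting the LLM more as the gap grows."""
    gap = bias_gap(original, flipped)
    llm_weight = min(1.0, sensitivity * gap)  # 0 when no gap, capped at 1
    return (1.0 - llm_weight) * original.qe_score + llm_weight * original.llm_score


if __name__ == "__main__":
    # Made-up scores, purely to exercise the functions.
    masculine = ScoredTranslation("masculine variant", qe_score=0.8, llm_score=0.6)
    feminine = ScoredTranslation("feminine variant", qe_score=0.7, llm_score=0.6)
    print(aggregate(masculine, feminine))
```

When the two variants receive the same QE score, this sketch returns the conventional QE score unchanged, which mirrors the abstract's stated goal of preserving the strengths of existing QE models while calibrating only their gender-related bias.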
