Fugu-MT Paper Translation (Abstract): Gender Disambiguation in Machine Translation: Diagnostic Evaluation in Decoder-Only Architectures

Paper summary: Gender Disambiguation in Machine Translation: Diagnostic Evaluation in Decoder-Only Architectures

arxiv url: http://arxiv.org/abs/2603.17952v1
Date: Wed, 18 Mar 2026 17:26:36 GMT
Status: Translation complete
Last updated in system: 2026-03-19 18:32:57.850677
Title: Gender Disambiguation in Machine Translation: Diagnostic Evaluation in Decoder-Only Architectures
Authors: Chiara Manna, Hosein Mohebbi, Afra Alishahi, Frédéric Blain, Eva Vanmassenhove
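Abstract summary: We introduce a novel measure, "Prior Bias", that captures a model's default gender assumptions. We show that, despite their scale and state-of-the-art status, decoder-only models do not generally outperform encoder-decoder architectures on gender-specific metrics.
Reference score (attention metric computed independently by this site): 5.038281415151629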
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While Large Language Models achieve state-of-the-art results across a wide range of NLP tasks, they remain prone to systematic biases. Among these, gender bias is particularly salient in MT, due to systematic differences across languages in whether and how gender is marked. As a result, translation often requires disambiguating implicit source signals into explicit gender-marked forms. In this context, standard benchmarks may capture broad disparities but fail to reflect the full complexity of gender bias in modern MT. In this paper, we extend recent frameworks on bias evaluation by: (i) introducing a novel measure coined "Prior Bias", capturing a model's default gender assumptions, and (ii) applying the framework to decoder-only MT models. Our results show that, despite their scale and state-of-the-art status, decoder-only models do not generally outperform encoder-decoder architectures on gender-specific metrics; however, post-training (e.g., instruction tuning) not only improves contextual awareness but also reduces the masculine Prior Bias.

Related papers

Are We Paying Attention to Her? Investigating Gender Disambiguation and Attention in Machine Translation [4.881426374773398]
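We propose a new evaluation metric called Minimal Pair Accuracy (MPA). MPA focuses on whether a model adapts its gender choice across minimal pairs. MPA shows that, in anti-stereotypical cases, NMT models are more likely to take masculine gender cues into account.
Reference translation of the paper (metadata) (2025-05-13T13:17:23Z)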
Assumed Identities: Quantifying Gender Bias in Machine Translation of Gender-Ambiguous Occupational Terms [12.568906647547815]
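GRAPE is a probability-based metric for evaluating gender bias. GAMBIT is an English benchmark dataset containing gender-ambiguous occupational terms. Using GRAPE, we evaluate several MT systems and examine whether their gendered translations into Greek and French align with or diverge from societal stereotypes.
Reference translation of the paper (metadata) (2025-03-06T12:16:14Z)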
GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
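Large language models (LLMs) have shown remarkable capabilities in natural language generation, but they have also been observed to amplify societal biases. GenderCARE is a comprehensive framework that encompasses innovative criteria, bias assessment, reduction techniques, and evaluation metrics.
Reference translation of the paper (metadata) (2024-08-22T15:35:46Z)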
Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
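Existing evaluations of gender bias in machine translation focus mainly on male and female genders. We present AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), a benchmark. We propose a gender bias evaluation method based on the Emotional Attitude Score (EAS).
Reference translation of the paper (metadata) (2024-07-23T08:13:51Z)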
A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation [35.44115368160656]
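We investigate whether machine translation models exhibit gender bias. We find that IFT models default to male-inflected translations, even disregarding female occupational stereotypes. We propose a bias mitigation solution that is easy to implement and effective.
Reference translation of the paper (metadata) (2023-10-18T17:36:55Z)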
Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns [53.62845317039185]
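Bias-measuring datasets play a critical role in detecting biased behavior in language models. We propose a novel method for collecting diverse, natural, and minimal text pairs via counterfactual generation. We show that four pre-trained language models are considerably more inconsistent across different gender groups than within each group.
Reference translation of the paper (metadata) (2023-02-11T12:11:03Z)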
Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms [68.85295025020942]
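We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes. GSR is the first measure specifically tailored to information retrieval that can quantify representational harms.
Reference translation of the paper (metadata) (2020-09-02T20:45:04Z)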
Multi-Dimensional Gender Bias Classification [67.65551687580552]
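Machine learning models can inadvertently learn socially undesirable patterns when trained on gender-biased text. We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions. Using this fine-grained framework, we automatically annotate eight large-scale datasets with gender information.
Reference translation of the paper (metadata) (2020-05-01T21:23:20Z)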

The related papers list is generated automatically from the titles and abstracts of papers on this site.

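The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from the use of this site (including all information and translations).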