Fugu-MT Paper Translation (Summary): Unmasking the Mask -- Evaluating Social Biases in Masked Language Models

Paper summary: Unmasking the Mask -- Evaluating Social Biases in Masked Language Models

arxiv url: http://arxiv.org/abs/2104.07496v1
Date: Thu, 15 Apr 2021 14:40:42 GMT
Status: Translation complete
Last updated in system: 2021-04-16 21:49:53.040128
Title: Unmasking the Mask -- Evaluating Social Biases in Masked Language Models
Title (reference translation): Unmasking the Mask -- Evaluating Social Biases in Masked Language Models
Authors: Masahiro Kaneko and Danushka Bollegala
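Abstract summary: Masked Language Models (MLMs) show superior performance in numerous downstream NLP tasks when used as text encoders. We propose All Unmasked Likelihood (AUL), a bias evaluation measure that predicts all tokens in a test case. We also propose AUL with attention weights (AULA), which evaluates tokens based on their importance in a sentence.
Reference score (attention score computed independently by this site): 28.378270372391498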
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Masked Language Models (MLMs) have shown superior performance in numerous downstream NLP tasks when used as text encoders. Unfortunately, MLMs also demonstrate significantly worrying levels of social biases. We show that the previously proposed evaluation metrics for quantifying the social biases in MLMs are problematic for the following reasons: (1) the prediction accuracy of the masked tokens itself tends to be low in some MLMs, which raises questions regarding the reliability of evaluation metrics that use the (pseudo) likelihood of the predicted tokens, (2) the correlation between the prediction accuracy of the mask and the performance in downstream NLP tasks is not taken into consideration, and (3) high-frequency words in the training data are masked more often, introducing noise due to this selection bias in the test cases. To overcome the above-mentioned disfluencies, we propose All Unmasked Likelihood (AUL), a bias evaluation measure that predicts all tokens in a test case given the MLM embedding of the unmasked input. We find that AUL accurately detects different types of biases in MLMs. We also propose AUL with attention weights (AULA) to evaluate tokens based on their importance in a sentence. However, unlike AUL and AULA, previously proposed bias evaluation measures for MLMs systematically overestimate the measured biases and are heavily influenced by the unmasked tokens in the context.
Abstract (reference translation): Masked Language Models (MLMs) have shown superior performance in numerous downstream NLP tasks when used as text encoders. Unfortunately, MLMs also exhibit worrying levels of social bias. We show that the previously proposed evaluation metrics for quantifying the social biases in MLMs are problematic for the following reasons: (1) the prediction accuracy of the masked tokens itself tends to be low in some MLMs, which raises questions regarding the reliability of evaluation metrics that use the (pseudo) likelihood of the predicted tokens, (2) the correlation between the prediction accuracy of the mask and the performance in downstream NLP tasks is not taken into consideration, and (3) high-frequency words in the training data are masked more often, introducing noise due to this selection bias in the test cases. To overcome the above-mentioned inconsistencies, we propose All Unmasked Likelihood (AUL), a bias evaluation measure that predicts all tokens in a test case given the MLM embedding of the unmasked input. AUL accurately detects different types of biases in MLMs. We also propose AUL with attention weights (AULA), which evaluates tokens based on their importance in a sentence. However, unlike AUL and AULA, previously proposed bias evaluation measures for MLMs systematically overestimate the measured biases and are heavily influenced by the unmasked tokens in the context.
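
The scoring idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract only states that AUL predicts all tokens from the unmasked input and that AULA weights tokens by attention. The averaging of per-token log-probabilities, the exact form of the attention weighting, the pairwise percentage in bias_score, and all function names are assumptions made for illustration. In practice the log-probabilities would come from an MLM's output distribution at each position of the unmasked sentence; here they are passed in as plain numbers.

```python
from typing import Sequence, Tuple


def aul(token_log_probs: Sequence[float]) -> float:
    """Assumed AUL form: mean log-probability over all tokens of a sentence,
    each predicted from the unmasked input."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return sum(token_log_probs) / len(token_log_probs)


def aula(token_log_probs: Sequence[float],
         attention_weights: Sequence[float]) -> float:
    """Assumed AULA form: like aul(), but each token's log-probability is
    scaled by an attention-derived importance weight before averaging."""
    if len(token_log_probs) != len(attention_weights):
        raise ValueError("one attention weight is required per token")
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    weighted = sum(a * lp for a, lp in zip(attention_weights, token_log_probs))
    return weighted / len(token_log_probs)


def bias_score(pair_scores: Sequence[Tuple[float, float]]) -> float:
    """Hypothetical aggregate: percentage of test pairs in which the
    stereotypical sentence scores higher than its counterpart.
    Each pair is (stereotypical_score, anti_stereotypical_score)."""
    if not pair_scores:
        raise ValueError("need at least one sentence pair")
    preferred = sum(1 for stereo, anti in pair_scores if stereo > anti)
    return 100.0 * preferred / len(pair_scores)


if __name__ == "__main__":
    # Made-up log-probabilities for two sentence pairs (illustration only).
    pairs = [
        (aul([-1.0, -3.0]), aul([-2.0, -4.0])),
        (aul([-3.0, -3.0]), aul([-2.0, -3.0])),
    ]
    print(bias_score(pairs))
```

Under these assumptions, a score near 50 would mean the model shows no systematic preference for either sentence in a pair, while values further from 50 would indicate a preference in one direction.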

Related papers

Exploring Gradient-Guided Masked Language Model to Detect Textual Adversarial Attacks [50.53590930588431]
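Adversarial examples pose serious threats to natural language processing systems. Recent studies suggest that adversarial texts deviate from the manifold of normal texts, whereas masked language models can approximate the manifold of normal data. We first introduce Masked Language Model-based Detection (MLMD), which leverages the mask and unmask operations of the masked language modeling (MLM) objective.
Paper reference translation (metadata) (2025-04-08T14:10:57Z)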
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge [84.34545223897578]
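Although LLM-as-a-Judge excels in many domains, potential issues remain unresolved, undermining its reliability and the scope of its utility. We propose an automated bias quantification framework that quantifies and analyzes each type of bias in LLM-as-a-Judge. Our work highlights the need for stakeholders to address these issues and urges caution in LLM-as-a-Judge applications.
Paper reference translation (metadata) (2024-10-03T17:53:30Z)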
MQM-APE: Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators [53.91199933655421]
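Large language models (LLMs) show great potential as judges for quality evaluation of machine translation (MT). To improve the quality of error annotations predicted by LLM evaluators, we introduce MQM-APE, a universal and training-free framework.
Paper reference translation (metadata) (2024-09-22T06:43:40Z)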
Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
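Large language models (LLMs) can reach and even exceed human-level accuracy on various benchmarks, but overconfidence in incorrect responses remains a well-documented failure mode. We propose a framework for measuring LLM uncertainty.
Paper reference translation (metadata) (2024-06-05T16:35:30Z)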
Towards Probabilistically-Sound Beam Search with Masked Language Models [0.0]
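Beam search with masked language models (MLMs) is challenging in part because joint probability distributions over sequences are not readily available. Estimating such distributions has important domain-specific applications such as ancient text restoration and protein engineering. Here we propose a probabilistically sound method for beam search over sequences.
Paper reference translation (metadata) (2024-02-22T23:36:26Z)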
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality [0.0]
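Social and political scientists often aim to discover and measure distinct biases in text data representations (embeddings). This paper evaluates the social biases encoded by transformers trained with the masked language modeling objective. We find that the proposed measures estimate the relative preference for biased sentences across transformers more accurately.
Paper reference translation (metadata) (2024-02-21T17:33:13Z)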
Which Syntactic Capabilities Are Statistically Learned by Masked Language Models for Code? [51.29970742152668]
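We highlight that relying on accuracy-based measurements may lead to an overestimation of model capabilities. To address these issues, we introduce a technique called SyntaxEval in Syntactic Capabilities.
Paper reference translation (metadata) (2024-01-03T02:44:02Z)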
Self-Evaluation Improves Selective Generation in Large Language Models [54.003992911447696]
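We reformulate open-ended generation tasks as token-level prediction tasks. We instruct an LLM to self-evaluate its answers. We benchmark a range of scoring methods based on self-evaluation.
Paper reference translation (metadata) (2023-12-14T19:09:22Z)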
Constructing Holistic Measures for Social Biases in Masked Language Models [17.45153670825904]
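Masked Language Models (MLMs) have been successful in many natural language processing tasks. Real-world stereotype biases are likely to be reflected in MLMs because they learn from large text corpora. We propose two evaluation metrics, Kullback-Leibler Divergence Score (KLDivS) and Jensen-Shannon Divergence Score (JSDivS), to evaluate social biases.
Paper reference translation (metadata) (2023-05-12T23:09:06Z)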
Inconsistencies in Masked Language Models [20.320583166619528]
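A Masked Language Model (MLM) can provide distributions of tokens at the masked positions in a sequence. The distributions corresponding to different masking patterns can show considerable inconsistencies. We propose an inference-time strategy for MLMs called Ensemble of Conditionals.
Paper reference translation (metadata) (2022-12-30T22:53:25Z)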
Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks [33.044775876807826]
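We investigate the relationship between task-agnostic intrinsic and task-specific social bias evaluation measures for masked language models (MLMs). We find that only a weak correlation exists between these two types of evaluation measures.
Paper reference translation (metadata) (2022-10-06T14:08:57Z)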
Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model [57.77981008219654]
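The Masked Language Model (MLM) framework has been widely adopted for self-supervised language pre-training. We propose a masking method that splits a text sequence into multiple non-overlapping segments.
Paper reference translation (metadata) (2020-10-12T21:28:14Z)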

The related papers list is generated automatically from the titles and abstracts of papers on this site.

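This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from the use of this site (including all information and translations).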