Fugu-MT Paper Translation (Summary): Locating Demographic Bias at the Attention-Head Level in CLIP's Vision Encoder

Paper summary: Locating Demographic Bias at the Attention-Head Level in CLIP's Vision Encoder

arxiv url: http://arxiv.org/abs/2603.11793v1
Date: Thu, 12 Mar 2026 10:54:26 GMT
Status: Translation complete
Last updated in system: 2026-03-13 14:46:26.027529
Title: Locating Demographic Bias at the Attention-Head Level in CLIP's Vision Encoder
Authors: Alaa Yasser, Kittipat Phunjanna, Marcos Escudero Viñolo, Catarina Barata, Jenny Benois-Pineau,
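Abstract summary: This paper proposes a mechanistic fairness audit that combines residual-stream decomposition, zero-shot Concept Activation Vectors, and bias-augmented TextSpan analysis. The pipeline is applied to the CLIP ViT-L-14 encoder on 42 profession classes of the FACET benchmark to audit gender and age bias.
Reference score (attention measure computed independently by this site): 5.240228994459652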
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Standard fairness audits of foundation models quantify that a model is biased, but not where inside the network the bias resides. We propose a mechanistic fairness audit that combines projected residual-stream decomposition, zero-shot Concept Activation Vectors, and bias-augmented TextSpan analysis to locate demographic bias at the level of individual attention heads in vision transformers. As a feasibility case study, we apply this pipeline to the CLIP ViT-L-14 encoder on 42 profession classes of the FACET benchmark, auditing both gender and age bias. For gender, the pipeline identifies four terminal-layer heads whose ablation reduces global bias (Cramér's V: 0.381 -> 0.362) while marginally improving accuracy (+0.42%); a layer-matched random control confirms that this effect is specific to the identified heads. A single head in the final layer contributes to the majority of the reduction in the most stereotyped classes, and class-level analysis shows that corrected predictions shift toward the correct occupation. For age, the same pipeline identifies candidate heads, but ablation produces weaker and less consistent effects, suggesting that age bias is encoded more diffusely than gender bias in this model. These results provide preliminary evidence that head-level bias localisation is feasible for discriminative vision encoders and that the degree of localisability may vary across protected attributes.
Keywords: Bias, CLIP, Mechanistic Interpretability, Vision Transformer, Fairness
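The abstract reports global bias as Cramér's V, which measures the strength of association between two categorical variables on a scale from 0 (independent) to 1 (fully associated). The sketch below shows the standard, uncorrected statistic computed from a contingency table of predicted class by demographic group. The function name `cramers_v`, the table layout, and the toy counts are illustrative assumptions; the abstract does not say how the table is built or whether a bias-corrected variant is used.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for an r x c table of counts.

    Assumes rows are predicted classes and columns are demographic groups
    (a hypothetical layout), and that every row and column total is nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-squared: compare observed counts with the counts
    # expected if prediction and group were independent.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by sample size and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Toy counts (invented for illustration, not from the paper):
# three predicted professions x two gender groups.
toy_counts = [[40, 10], [15, 35], [25, 25]]
print(cramers_v(toy_counts))
```

Computing the statistic once on the original predictions and once after ablating the selected heads would give a before/after comparison of the kind quoted in the abstract (0.381 -> 0.362).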

Related papers