Fugu-MT Paper Translation (Summary): Is Fairness Truly Fair? Towards Reliable Lipschitz Fairness in Multi-Task Learning via Fixed-$δ$ Alignment

Paper summary: Is Fairness Truly Fair? Towards Reliable Lipschitz Fairness in Multi-Task Learning via Fixed-$δ$ Alignment

arxiv url: http://arxiv.org/abs/2606.10632v1
Date: Tue, 09 Jun 2026 09:36:54 GMT
Status: Translation complete
Last updated in system: 2026-06-10 15:40:58.422772
Title: Is Fairness Truly Fair? Towards Reliable Lipschitz Fairness in Multi-Task Learning via Fixed-$δ$ Alignment
Title (reference translation): Is Fairness Truly Fair? Toward Reliable Lipschitz Fairness in Multi-Task Learning via Fixed-$δ$ Alignment
Authors: Junbo Ding, Xin Zang, Chenchen Pan, Donghao Song, Jiaxin Zhu, Danhuai Guo
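Abstract summary: Lipschitz-style individual fairness formalizes the idea that semantically similar examples should receive similar predictions. This paper identifies threshold confounding: when the auditing tolerance is derived from each model's own representation distances, different algorithms are compared under different semantic thresholds. The authors propose ReLiF, a reliability-aware framework that separates evaluation-time fixed-$δ$ auditing from training-time controlled regularization.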
Reference score (independently computed attention score): 0.32792728203318305
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Lipschitz-style individual fairness formalizes the idea that semantically similar examples should receive similar predictions, but its evaluation in multi-task learning (MTL) can be confounded by method-induced representation scales. This paper identifies threshold confounding: when the auditing tolerance is derived from each model's own representation distances, different algorithms are compared under different semantic thresholds. A threshold-drift analysis further shows how Bias rankings can change and identifies sufficient conditions for ranking preservation. We propose ReLiF, a reliability-aware framework that separates evaluation-time fixed-$δ$ auditing from training-time controlled regularization. ReLiF uses a shared reference tolerance for comparable auditing and a violation-rate feedback controller to keep the Lipschitz surrogate active without letting it dominate stochastic training. This work also develops supporting analysis for threshold drift, reference-tolerance selection, and the relationship between the huberized training surrogate and its unsmoothed positive-margin counterpart. Experiments on clinical time-series benchmarks and NYUv2 (NYU Depth V2) dense prediction show that fixed-$δ$ auditing exposes utility-fairness trade-offs that method-dependent thresholds can obscure. On NYUv2 with a ResNet50 backbone, ReLiF achieves competitive utility while substantially reducing aligned bias under shared fixed thresholds. On clinical benchmarks, ReLiF yields controlled fairness-regularized trade-offs, while fixed-$δ$ auditing reveals that task-balancing baselines can sometimes achieve lower bias and that genuine utility-fairness trade-offs persist. These results support fixed-$δ$ auditing as a semantically consistent protocol for evaluating Lipschitz fairness in MTL.
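Abstract (reference translation): Lipschitz-style individual fairness formalizes the idea that semantically similar examples should receive similar predictions, but when it is evaluated in multi-task learning (MTL), the result can be confounded by representation scales that the training method itself induces. This paper identifies threshold confounding: if the auditing tolerance is derived from each model's own representation distances, different algorithms end up being compared under different semantic thresholds. A threshold-drift analysis further shows how Bias rankings can change and gives sufficient conditions under which the ranking is preserved. The paper proposes ReLiF, a reliability-aware framework that separates evaluation-time fixed-$δ$ auditing from training-time controlled regularization. ReLiF uses a shared reference tolerance so that audits are comparable, and a violation-rate feedback controller that keeps the Lipschitz surrogate active without letting it dominate stochastic training. The work also develops supporting analysis of threshold drift, of reference-tolerance selection, and of the relationship between the huberized training surrogate and its unsmoothed positive-margin counterpart. Experiments on clinical time-series benchmarks and on NYUv2 (NYU Depth V2) dense prediction show that fixed-$δ$ auditing exposes utility-fairness trade-offs that method-dependent thresholds can obscure. On NYUv2 with a ResNet50 backbone, ReLiF achieves competitive utility while substantially reducing aligned bias under shared fixed thresholds. On the clinical benchmarks, ReLiF yields controlled, fairness-regularized trade-offs, while fixed-$δ$ auditing reveals that task-balancing baselines can sometimes achieve lower bias and that genuine utility-fairness trade-offs persist. These results support fixed-$δ$ auditing as a semantically consistent protocol for evaluating Lipschitz fairness in MTL.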
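
The threshold-confounding effect described in the abstract can be illustrated with a small toy example. The sketch below is not the paper's protocol: the violation-rate definition, the median-based tolerance, and the names `pairwise_distances`, `violation_rate`, `reps_a`, and `reps_b` are all assumptions made for illustration. It only shows that two hypothetical models with identical predictions can receive different audit scores when each tolerance is derived from that model's own representation distances, and the same score under a shared fixed tolerance.

```python
from itertools import combinations
from statistics import median


def pairwise_distances(points):
    """Euclidean distances between all unordered pairs of points."""
    return [
        sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
        for p, q in combinations(points, 2)
    ]


def violation_rate(preds, delta):
    """Fraction of pairs whose prediction gap exceeds the tolerance delta.

    This is an assumed, simplified audit metric, not the paper's definition.
    """
    gaps = [abs(a - b) for a, b in combinations(preds, 2)]
    return sum(g > delta for g in gaps) / len(gaps)


# Two hypothetical models that make identical predictions. Model B's
# representation is model A's scaled by 3 (a method-induced scale change).
preds = [0.10, 0.20, 0.60, 0.90]
reps_a = [(0.0, 0.0), (0.1, 0.0), (0.5, 0.0), (0.8, 0.0)]
reps_b = [(3 * x, 3 * y) for x, y in reps_a]

# Method-dependent tolerance: median of each model's own pairwise distances.
delta_a = median(pairwise_distances(reps_a))
delta_b = median(pairwise_distances(reps_b))

# Shared reference tolerance (assumption: taken from model A here).
delta_ref = delta_a

own_a = violation_rate(preds, delta_a)
own_b = violation_rate(preds, delta_b)
fixed_a = violation_rate(preds, delta_ref)
fixed_b = violation_rate(preds, delta_ref)

print("own thresholds:  ", own_a, own_b)
print("shared threshold:", fixed_a, fixed_b)
```

Under its own threshold, model B looks fairer only because its larger representation scale inflates the tolerance; with the shared tolerance the two models are scored identically, as their identical predictions would suggest.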

Related papers