Fugu-MT Paper Translation (Abstract): Offline Evaluation Measures of Fairness in Recommender Systems

Paper overview: Offline Evaluation Measures of Fairness in Recommender Systems

arxiv url: http://arxiv.org/abs/2604.25032v1
Date: Mon, 27 Apr 2026 22:28:24 GMT
Status: Translation complete
Last updated in system: 2026-04-29 16:49:17.61848
Title: Offline Evaluation Measures of Fairness in Recommender Systems
Authors: Theresia Veronika Rampisela
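Abstract summary: This thesis presents a series of papers that assess and overcome various theoretical, empirical, and conceptual limitations of existing recommender system fairness evaluation measures. First, it performs theoretical and empirical analyses of the measures, exposing flaws that limit their interpretability, expressiveness, or applicability. Finally, it recommends guidelines for appropriate measure usage, enabling more precise selection of fairness evaluation measures in practical scenarios.
Reference score (independently computed attention score): 0.6345523830122167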
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The evaluation of recommender system fairness has become increasingly important, especially with recent legislation that emphasises the development of fair and responsible artificial intelligence. This has led to the emergence of various fairness evaluation measures, which quantify fairness based on different definitions. However, many such measures are simply proposed and used without further analysis of their robustness. As a result, there is insufficient understanding and awareness of the measures' limitations. Among other issues, it is not known what kind of model outputs produce the fairest or unfairest score, how the measure scores are empirically distributed, and whether there are cases where the measures cannot be computed (e.g., due to division by zero). These issues make the measure scores difficult to interpret and cause confusion about which measure(s) should be used for a specific case. This thesis presents a series of papers that assess and overcome various theoretical, empirical, and conceptual limitations of existing recommender system fairness evaluation measures. We investigate a wide range of offline evaluation measures for different fairness notions, divided based on the evaluation subjects (users and items) and for different evaluation granularities (groups of subjects and individual subjects). Firstly, we perform theoretical and empirical analyses of the measures, exposing flaws that limit their interpretability, expressiveness, or applicability. Secondly, we contribute novel evaluation approaches and measures that overcome these limitations. Finally, considering the measures' limitations, we recommend guidelines for appropriate measure usage, thereby allowing for more precise selection of fairness evaluation measures in practical scenarios. Overall, this thesis contributes to advancing the state of the art in offline evaluation of fairness in recommender systems.
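
As a rough illustration of the division-by-zero case mentioned in the abstract, the sketch below uses a hypothetical ratio-based group score. The function `exposure_ratio` and its inputs are assumptions made for this example and are not measures taken from the thesis. It shows how a score can be well defined for most model outputs yet undefined for one particular output.

```python
from typing import Optional


def exposure_ratio(exposure_a: float, exposure_b: float) -> Optional[float]:
    """Hypothetical group-fairness score: group A's exposure divided by group B's.

    A value of 1.0 means both groups receive equal exposure. Returns None when
    the score is undefined because group B received no exposure at all.
    """
    if exposure_b == 0:
        # Division by zero: the measure cannot be computed for this output.
        return None
    return exposure_a / exposure_b


print(exposure_ratio(30.0, 30.0))  # 1.0
print(exposure_ratio(10.0, 40.0))  # 0.25
print(exposure_ratio(10.0, 0.0))   # None
```

Returning `None` rather than raising an error is only one possible design choice; it makes the undefined case explicit so that it is not silently mixed with valid scores.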
