Fugu-MT Paper Translation (Abstract): Fair Data Representation for Machine Learning at the Pareto Frontier

Paper abstract: Fair Data Representation for Machine Learning at the Pareto Frontier

arxiv url: http://arxiv.org/abs/2201.00292v1
Date: Sun, 2 Jan 2022 05:05:26 GMT
Status: Translation complete
Last updated in system: 2022-01-04 13:54:38.050129
Title: Fair Data Representation for Machine Learning at the Pareto Frontier
Title (reference translation): Fair data representation for machine learning at the Pareto frontier
Authors: Shizhou Xu, Thomas Strohmer
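Abstract summary: This paper proposes a pre-processing algorithm for fair data representation for use with L2-objective supervised learning algorithms. It shows that the Wasserstein geodesics from the learning outcome marginals to the barycenter characterize the Pareto frontier between L2-loss and total Wasserstein distance among the learning outcome marginals.
Reference score (attention score computed independently by this site): 1.5229257192293197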
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As machine learning powered decision making is playing an increasingly important role in our daily lives, it is imperative to strive for fairness of the underlying data processing and algorithms. We propose a pre-processing algorithm for fair data representation via which L2-objective supervised learning algorithms result in an estimation of the Pareto frontier between prediction error and statistical disparity. In particular, the present work applies the optimal positive definite affine transport maps to approach the post-processing Wasserstein barycenter characterization of the optimal fair L2-objective supervised learning via a pre-processing data deformation. We call the resulting data Wasserstein pseudo-barycenter. Furthermore, we show that the Wasserstein geodesics from the learning outcome marginals to the barycenter characterize the Pareto frontier between L2-loss and total Wasserstein distance among learning outcome marginals. Thereby, an application of McCann interpolation generalizes the pseudo-barycenter to a family of data representations via which L2-objective supervised learning algorithms result in the Pareto frontier. Numerical simulations underscore the advantages of the proposed data representation: (1) the pre-processing step is compositive with arbitrary L2-objective supervised learning methods and unseen data; (2) the fair representation protects data privacy by preventing the training machine from direct or indirect access to the sensitive information of the data; (3) the optimal affine map results in efficient computation of fair supervised learning on high-dimensional data; (4) experimental results shed light on the fairness of L2-objective unsupervised learning via the proposed fair data representation.
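Abstract (reference translation): As machine-learning-based decision making plays an increasingly important role in daily life, it is essential to pursue fairness in the underlying data processing and algorithms. This paper proposes a pre-processing algorithm for fair data representation with which L2-objective supervised learning algorithms estimate the Pareto frontier between prediction error and statistical disparity. In particular, this work uses optimal positive definite affine transport maps to approach, via a pre-processing data deformation, the post-processing Wasserstein barycenter characterization of optimal fair L2-objective supervised learning. The resulting data is called the Wasserstein pseudo-barycenter. Furthermore, the Wasserstein geodesics from the learning outcome marginals to the barycenter characterize the Pareto frontier between L2-loss and total Wasserstein distance among the learning outcome marginals. Consequently, applying McCann interpolation generalizes the pseudo-barycenter to a family of data representations with which L2-objective supervised learning algorithms yield the Pareto frontier. Numerical simulations underscore the advantages of the proposed data representation: (1) the pre-processing step is compositive with arbitrary L2-objective supervised learning methods and unseen data; (2) the fair representation protects data privacy by preventing the training machine from direct or indirect access to the sensitive information of the data; (3) the optimal affine map results in efficient computation of fair supervised learning on high-dimensional data; (4) experimental results shed light on the fairness of L2-objective unsupervised learning via the proposed fair data representation.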
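
The abstract refers to optimal positive definite affine transport maps and to McCann interpolation. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes that a data group is summarized only by its empirical mean and covariance, so the closed-form optimal map between Gaussians applies. The function names (`spd_sqrt`, `affine_transport`, `mccann_interpolate`) and the target statistics `m_tgt` and `S_tgt` are hypothetical and chosen for illustration; in the paper's setting the target would be a (pseudo-)barycenter rather than a hand-picked distribution.

```python
import numpy as np

def spd_sqrt(S):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

def affine_transport(m_src, S_src, m_tgt, S_tgt):
    """Return (A, b) such that T(x) = A x + b maps N(m_src, S_src) to N(m_tgt, S_tgt).

    A = S_src^{-1/2} (S_src^{1/2} S_tgt S_src^{1/2})^{1/2} S_src^{-1/2}
    is symmetric positive definite and satisfies A S_src A = S_tgt.
    """
    R = spd_sqrt(S_src)
    R_inv = np.linalg.inv(R)
    A = R_inv @ spd_sqrt(R @ S_tgt @ R) @ R_inv
    b = m_tgt - A @ m_src
    return A, b

def mccann_interpolate(X, A, b, t):
    """Move each row x of X a fraction t along the segment from x to T(x)."""
    return (1.0 - t) * X + t * (X @ A.T + b)

# Illustrative data: one group of samples (assumed, not from the paper).
rng = np.random.default_rng(0)
X_src = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 2.0]], size=500)
m_src = X_src.mean(axis=0)
S_src = np.cov(X_src, rowvar=False)

# Hypothetical target statistics standing in for a barycenter.
m_tgt = np.array([1.0, -1.0])
S_tgt = np.array([[2.0, -0.3], [-0.3, 0.5]])

A, b = affine_transport(m_src, S_src, m_tgt, S_tgt)
X_half = mccann_interpolate(X_src, A, b, 0.5)  # halfway along the geodesic
X_full = mccann_interpolate(X_src, A, b, 1.0)  # fully transported samples
```

With `t = 0` the data is unchanged and with `t = 1` the transported samples have exactly the target mean and covariance; intermediate values of `t` give the family of interpolated representations that the abstract associates with the trade-off between prediction error and disparity.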

Related papers

Sliced-Wasserstein Distance-based Data Selection [0.0]
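This paper proposes a new unsupervised anomaly detection method based on the sliced-Wasserstein distance. The filtering technique is of interest for decision-making pipelines that deploy machine learning models in critical domains. We describe the filtering patterns of the proposed method on synthetic datasets and provide numerical benchmarks for training data selection.
Paper reference translation (metadata) (2025-04-17T13:07:26Z)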
Targeted Learning for Data Fairness [52.59573714151884]
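We extend fairness inference by evaluating the fairness of the data-generating process itself. We estimate demographic parity, equal opportunity, and conditional mutual information. To validate the proposed method, we run several simulations and apply it to real data.
Paper reference translation (metadata) (2025-02-06T18:51:28Z)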
Capturing the Temporal Dependence of Training Data Influence [100.91355498124527]
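We formalize the notion of trajectory-specific leave-one-out (LOO) influence, which quantifies the effect of removing a data point during training. We propose data value embedding, a new technique that enables efficient approximation of trajectory-specific LOO. Because data value embedding captures the ordering of the training data, it offers valuable insights into model training dynamics.
Paper reference translation (metadata) (2024-12-12T18:28:55Z)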
LAVA: Data Valuation without Pre-Specified Learning Algorithms [20.578106028270607]
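We introduce a new framework that can value training data in a way that is agnostic to the downstream learning algorithm. We develop a proxy for the validation performance associated with a training set, based on a non-conventional class-wise Wasserstein distance between the training and validation sets. We show that the distance characterizes an upper bound on the validation performance for any given model under certain Lipschitz conditions.
Paper reference translation (metadata) (2023-04-28T19:05:16Z)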
An Operational Perspective to Fairness Interventions: Where and How to Intervene [9.833760837977222]
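We propose a comprehensive framework for evaluating and contextualizing fairness interventions. We demonstrate the framework with a case study on predictive parity. Achieving predictive parity without using group data is difficult.
Paper reference translation (metadata) (2023-02-03T07:04:33Z)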
Fair Representation Learning using Interpolation Enabled Disentanglement [9.043741281011304]
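We (a) ask whether fair disentangled representations can be learned while ensuring the utility of the learned representations for downstream tasks, and (b) provide theoretical insights into whether the proposed approach is both fair and accurate. To address the former, we propose FRIED, a fair representation learning method using interpolation enabled disentanglement.
Paper reference translation (metadata) (2021-07-31T17:32:12Z)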
Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
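Dataset bias is one cause of unfairness in machine learning. We examine whether models trained with uncertainty-based active learning (AL) are fair in their decisions with respect to protected classes. We also study the interaction between algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
Paper reference translation (metadata) (2021-04-14T14:20:22Z)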
Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
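We propose a novel, scalable method for learning doubly robust representations for counterfactual prediction. It yields robust and efficient counterfactual predictions for both individual and average treatment effects. The algorithm shows competitive performance with the state of the art on real-world and synthetic data.
Paper reference translation (metadata) (2020-10-15T16:39:26Z)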
Evaluating representations by the complexity of learning low-loss predictors [55.94170724668857]
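We consider the problem of evaluating representations of data for use in solving downstream tasks. We propose measuring the quality of a representation by the complexity of learning a predictor on top of the representation that achieves low loss on the task of interest.
Paper reference translation (metadata) (2020-09-15T22:06:58Z)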
Graph Embedding with Data Uncertainty [113.39838145450007]
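Spectral-based subspace learning is a common data pre-processing step in many machine learning pipelines. Most subspace learning methods do not account for measurement inaccuracies or artifacts that can lead to data with high uncertainty.
Paper reference translation (metadata) (2020-09-01T15:08:23Z)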
Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
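A distributionally robust optimization framework is considered for training parametric models. The aim is to provide trained models that are robust to adversarially manipulated input data. The proposed algorithms offer robustness with little overhead.
Paper reference translation (metadata) (2020-07-07T18:25:25Z)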
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
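We study how to incorporate a dataset collected offline (observational data) to improve sample efficiency in the online setting. We propose the deconfounded optimistic value iteration (DOVI) algorithm, which efficiently incorporates the observational data.
Paper reference translation (metadata) (2020-06-22T14:49:33Z)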

The related papers list is generated automatically from the titles and abstracts of the papers on this site.

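The operator of this site does not guarantee the quality of this site (including all information and translations) and assumes no responsibility for any consequences arising from its use.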