Fair for All: Best-effort Fairness Guarantees for Classification
- URL: http://arxiv.org/abs/2012.10216v4
- Date: Wed, 24 Feb 2021 10:54:11 GMT
- Title: Fair for All: Best-effort Fairness Guarantees for Classification
- Authors: Anilesh K. Krishnaswamy, Zhihao Jiang, Kangning Wang, Yu Cheng, and
Kamesh Munagala
- Abstract summary: Group-based notions of fairness try to equalize absolute measures of performance across known groups.
We propose a notion whose guarantee, on each group $g$ in a class $\mathcal{G}$, is relative to the performance of the best classifier on $g$.
We test our algorithms on real-world data sets, and present interesting comparative insights on their performance.
- Score: 11.818794470895837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Standard approaches to group-based notions of fairness, such as \emph{parity}
and \emph{equalized odds}, try to equalize absolute measures of performance
across known groups (based on race, gender, etc.). Consequently, a group that
is inherently harder to classify may hold back the performance on other groups;
and no guarantees can be provided for unforeseen groups. Instead, we propose a
fairness notion whose guarantee, on each group $g$ in a class $\mathcal{G}$, is
relative to the performance of the best classifier on $g$. We apply this notion
to broad classes of groups, in particular, where (a) $\mathcal{G}$ consists of
all possible groups (subsets) in the data, and (b) $\mathcal{G}$ is more
streamlined.
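Both settings can be read as instances of one template (a sketch in our own notation; the factor $\gamma$ and the shorthand $\mathrm{acc}$ do not appear in the paper): for every $g \in \mathcal{G}$, require
\[
\mathrm{acc}(h, g) \;\ge\; \gamma(g) \cdot \max_{h'} \mathrm{acc}(h', g),
\]
where $\gamma(g)$ scales with the relative size of $g$ in the first setting and is a size-independent constant close to $1$ in the second.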
For the first setting, which is akin to groups being completely unknown, we
devise the {\sc PF} (Proportional Fairness) classifier, which guarantees, on
any possible group $g$, an accuracy that is proportional to that of the optimal
classifier for $g$, scaled by the relative size of $g$ in the data set. Since
the class includes all possible groups, some of which may be too complex to be
relevant, the worst-case theoretical guarantees here must be proportionally
weaker for smaller subsets.
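As an illustration of the PF guarantee (a minimal audit sketch, not the authors' training algorithm; the function name, the proportionality constant `alpha`, and the assumption that the optimal per-group accuracy is supplied externally are all ours):

```python
import numpy as np

def pf_guarantee_holds(y_true, y_pred, group_mask, opt_group_acc, alpha=1.0):
    """Check the PF condition on one group: accuracy on g must be at
    least (|g| / n) * OPT(g), up to a proportionality constant alpha.

    y_true, y_pred : 1-D numpy arrays of labels and predictions.
    group_mask     : boolean numpy array selecting the members of g.
    opt_group_acc  : best achievable accuracy on g (assumed known or
                     estimated, e.g., by training a classifier on g alone).
    """
    n = len(y_true)
    g_size = int(group_mask.sum())
    if g_size == 0:
        return True  # the guarantee is vacuous for empty groups
    acc_on_g = float((y_pred[group_mask] == y_true[group_mask]).mean())
    return acc_on_g >= alpha * (g_size / n) * opt_group_acc
```

Since the bound is scaled by $|g|/n$, the check becomes weaker exactly for the small, possibly overly complex groups mentioned above.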
For the second setting, we devise the {\sc BeFair} (Best-effort Fair)
framework which seeks an accuracy, on every $g \in \mathcal{G}$, which
approximates that of the optimal classifier on $g$, independent of the size of
$g$. Aiming for such a guarantee results in a non-convex problem, and we design
novel techniques to get around this difficulty when $\mathcal{G}$ is the set of
linear hypotheses. We test our algorithms on real-world data sets, and present
interesting comparative insights on their performance.
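In the same spirit, the BeFair-style guarantee can be audited with a size-independent slack (again a hypothetical sketch: the slack `eps`, the dictionary-based interface, and the assumption that per-group optimal accuracies are known are ours, not the paper's formulation):

```python
import numpy as np

def befair_violations(y_true, y_pred, groups, opt_accs, eps=0.05):
    """Return the groups on which accuracy falls more than eps below the
    best achievable accuracy for that group, independent of group size.

    groups   : dict mapping group name -> boolean numpy mask.
    opt_accs : dict mapping group name -> best achievable accuracy on g.
    """
    violations = {}
    for name, mask in groups.items():
        if mask.sum() == 0:
            continue  # skip empty groups
        acc = float((y_pred[mask] == y_true[mask]).mean())
        if acc < opt_accs[name] - eps:  # bound does not depend on |g|
            violations[name] = (acc, opt_accs[name])
    return violations

# Hypothetical usage with two groups defined by boolean masks m1, m2:
# befair_violations(y, preds, {"g1": m1, "g2": m2}, {"g1": 0.9, "g2": 0.8})
```

Note that actually finding a classifier meeting such a bound is the non-convex problem the paper addresses; this sketch only verifies the condition after the fact.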
Related papers
- Relax and Merge: A Simple Yet Effective Framework for Solving Fair $k$-Means and $k$-sparse Wasserstein Barycenter Problems [8.74967598360817]
Given a dataset comprising several groups, the fairness constraint requires that each cluster should contain a proportion of points from each group.
We propose a novel ``Relax and Merge'' framework, where $\rho$ is the approximation ratio of an off-the-shelf vanilla $k$-means algorithm.
If equipped with a PTAS for $k$-means, our solution can achieve an approximation ratio of $(5+O(\epsilon))$ with only a slight violation of the fairness constraints.
arXiv Detail & Related papers (2024-11-02T02:50:12Z) - The Real Price of Bandit Information in Multiclass Classification [73.17969992976501]
We revisit the classical problem of multiclass classification with bandit feedback.
We present a new bandit classification algorithm that guarantees regret $\smash{\widetilde{O}}(|H| + \sqrt{T})$.
arXiv Detail & Related papers (2024-05-16T12:11:09Z) - Fair Active Ranking from Pairwise Preferences [6.102498508368527]
Given a set of $n$ items that belong to disjoint groups, our goal is to find an $(\epsilon, \delta)$-PACF-Ranking according to a fair objective function that we propose.
We present both group-blind and group-aware algorithms and analyze their sample complexity.
arXiv Detail & Related papers (2024-02-05T18:09:48Z) - A Unified Framework for Gradient-based Clustering of Distributed Data [51.904327888475606]
We develop a family of distributed clustering algorithms that work over networks of users.
DGC-$\mathcal{F}_\rho$ is specialized to popular clustering losses like $K$-means and Huber loss.
We show that consensus fixed points of DGC-$\mathcal{F}_\rho$ are equivalent to fixed points of gradient clustering over the full data.
arXiv Detail & Related papers (2024-02-02T10:44:42Z) - Agnostic Multi-Group Active Learning [24.97598179536084]
We consider a variant of this problem from the perspective of active learning, where the learner is endowed with the power to decide which examples are labeled from each distribution in the collection.
Our main challenge is that standard active learning techniques such as disagreement-based active learning do not directly apply to the multi-group learning objective.
We modify existing algorithms to provide a consistent active learning algorithm for an agnostic formulation of multi-group learning.
arXiv Detail & Related papers (2023-06-02T21:24:13Z) - Scalable Differentially Private Clustering via Hierarchically Separated
Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d^{3/2}\log n)\cdot \mathrm{OPT} + O(k d^2 \log^2 n / \epsilon^2)$, where $\epsilon$ is the privacy guarantee.
Although the worst-case guarantee is worse than that of state-of-the-art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z) - Metric-Fair Classifier Derandomization [6.269732593554894]
We study the problem of classifier derandomization in machine learning.
We show that the prior derandomization approach is almost maximally metric-unfair.
We devise a derandomization procedure that provides an appealing tradeoff between these two.
arXiv Detail & Related papers (2022-06-15T21:36:57Z) - Under-bagging Nearest Neighbors for Imbalanced Classification [63.026765294759876]
We propose an ensemble learning algorithm called \emph{under-bagging $k$-NN} for imbalanced classification problems.
arXiv Detail & Related papers (2021-09-01T14:10:38Z) - Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy or soft objective is a popular generalization of the well-known $k$-means problem.
We show that by making few queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z) - Multi-group Agnostic PAC Learnability [7.9649015115693444]
We study "multi-group agnostic PAC learnability"
We provide a characterization of the loss functions for which such a predictor is guaranteed to exist.
Our results unify and extend previous positive and negative results from the multi-group fairness literature.
arXiv Detail & Related papers (2021-05-20T18:43:36Z) - Robust Optimization for Fairness with Noisy Protected Groups [85.13255550021495]
We study the consequences of naively relying on noisy protected group labels.
We introduce two new approaches using robust optimization.
We show that the robust approaches achieve better true group fairness guarantees than the naive approach.
arXiv Detail & Related papers (2020-02-21T14:58:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.