Individual Fairness for $k$-Clustering
- URL: http://arxiv.org/abs/2002.06742v2
- Date: Mon, 21 Sep 2020 18:31:09 GMT
- Title: Individual Fairness for $k$-Clustering
- Authors: Sepideh Mahabadi and Ali Vakilian
- Abstract summary: Local search based algorithm for $k$-median and $k$-means.
We show how to get a bicriteria approximation for fair $k$-clustering.
- Score: 16.988236398003068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We give a local search based algorithm for $k$-median and $k$-means (and more
generally for any $k$-clustering with $\ell_p$ norm cost function) from the
perspective of individual fairness. More precisely, for a point $x$ in a point
set $P$ of size $n$, let $r(x)$ be the minimum radius such that the ball of
radius $r(x)$ centered at $x$ has at least $n/k$ points from $P$. Intuitively,
if a set of $k$ random points are chosen from $P$ as centers, every point $x\in
P$ expects to have a center within radius $r(x)$. An individually fair
clustering provides such a guarantee for every point $x\in P$. This notion of
fairness was introduced in [Jung et al., 2019] where they showed how to get an
approximately feasible $k$-clustering with respect to this fairness condition.
In this work, we show how to get a bicriteria approximation for fair
$k$-clustering: The $k$-median ($k$-means) cost of our solution is within a
constant factor of the cost of an optimal fair $k$-clustering, and our solution
approximately satisfies the fairness condition (also within a constant factor).
Further, we complement our theoretical bounds with empirical evaluation.
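To make the fairness notion concrete, here is a minimal sketch (not the paper's local search algorithm) that computes the radii $r(x)$ and evaluates an arbitrary set of $k$ centers, reporting both its $k$-median cost and the worst-case ratio $d(x, C)/r(x)$, which is at most $1$ for an exactly fair clustering and at most a constant under the bicriteria guarantee. It assumes Euclidean distances and NumPy; all function names are illustrative.

```python
# Minimal sketch of the individual-fairness notion (illustrative, not the paper's algorithm):
# r(x) is the smallest radius whose ball around x contains at least n/k points of P, and a
# set of centers C is alpha-approximately fair if every x has a center within alpha * r(x).
import math
import numpy as np

def fairness_radii(P: np.ndarray, k: int) -> np.ndarray:
    """r(x): distance from x to its ceil(n/k)-th nearest neighbor in P (counting x itself)."""
    n = len(P)
    m = math.ceil(n / k)                               # the ball around x must hold >= n/k points
    dists = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)  # n x n pairwise distances
    dists.sort(axis=1)                                 # row i: distances from P[i], ascending
    return dists[:, m - 1]                             # m-th smallest entry is r(P[i])

def fairness_violation(P: np.ndarray, centers: np.ndarray, k: int) -> float:
    """max_x d(x, C) / r(x): <= 1 means exactly fair, <= alpha means alpha-approximately fair."""
    r = fairness_radii(P, k)
    d_to_C = np.linalg.norm(P[:, None, :] - centers[None, :, :], axis=2).min(axis=1)
    return float(np.max(d_to_C / r))

def kmedian_cost(P: np.ndarray, centers: np.ndarray) -> float:
    """k-median objective: sum over points of the distance to the nearest center."""
    return float(np.linalg.norm(P[:, None, :] - centers[None, :, :], axis=2).min(axis=1).sum())

# Example: evaluate k = 5 arbitrary centers chosen from random data.
rng = np.random.default_rng(0)
P = rng.normal(size=(200, 2))
C = P[rng.choice(len(P), size=5, replace=False)]
print(kmedian_cost(P, C), fairness_violation(P, C, k=5))
```

The same radii $r(x)$ can be reused to audit any clustering, whether it comes from a fairness-aware algorithm or from a vanilla $k$-median/$k$-means baseline.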
Related papers
- Relax and Merge: A Simple Yet Effective Framework for Solving Fair $k$-Means and $k$-sparse Wasserstein Barycenter Problems [8.74967598360817]
Given a dataset comprising several groups, the fairness constraint requires that each cluster contain a proportion of points from each group (a sketch of this constraint appears after the list of related papers).
We propose a novel ``Relax and Merge'' framework, where $\rho$ is the approximation ratio of an off-the-shelf vanilla $k$-means algorithm.
If equipped with a PTAS for $k$-means, our solution can achieve an approximation ratio of $(5+O(\epsilon))$ with only a slight violation of the fairness constraints.
arXiv Detail & Related papers (2024-11-02T02:50:12Z) - A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.
A clustering is then called individually fair if it has centers within distance $\delta(x)$ of $x$ for each $x\in P$.
We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
arXiv Detail & Related papers (2024-02-09T19:01:48Z) - $L^1$ Estimation: On the Optimality of Linear Estimators [64.76492306585168]
This work shows that the only prior distribution on $X$ that induces linearity in the conditional median is Gaussian.
In particular, it is demonstrated that if the conditional distribution $P_{X|Y=y}$ is symmetric for all $y$, then $X$ must follow a Gaussian distribution.
arXiv Detail & Related papers (2023-09-17T01:45:13Z) - Approximation Algorithms for Fair Range Clustering [14.380145034918158]
This paper studies the fair range clustering problem in which the data points are from different demographic groups.
The goal is to pick $k$ centers with the minimum clustering cost such that each group is at least minimally represented in the centers set.
In particular, the fair range $\ell_p$-clustering captures fair range $k$-center, $k$-median and $k$-means as its special cases.
arXiv Detail & Related papers (2023-06-11T21:18:40Z) - TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial approximation.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z) - Improved Approximation Algorithms for Individually Fair Clustering [9.914246432182873]
We present an improved $(16^p + \varepsilon, 3)$-bicriteria approximation for the fair $k$-clustering with $\ell_p$-norm cost.
Our approach suggests a reduction from our individually fair clustering to a clustering with a group fairness requirement proposed by Kleindessner et al.
arXiv Detail & Related papers (2021-06-26T15:22:52Z) - Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean Estimation [58.24280149662003]
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset.
We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees.
arXiv Detail & Related papers (2021-06-16T03:34:14Z) - FPT Approximation for Socially Fair Clustering [0.38073142980733]
We are given a set of points $P$ in a metric space $\mathcal{X}$ with a distance function $d(\cdot,\cdot)$.
The goal of the socially fair $k$-median problem is to find a set $C subseteq F$ of $k$ centers that minimizes the maximum average cost over all the groups.
In this work, we design $(5+\varepsilon)$ and $(33+\varepsilon)$ approximation algorithms for the socially fair $k$-median and $k$-means problems, respectively.
arXiv Detail & Related papers (2021-06-12T11:53:18Z) - A New Notion of Individually Fair Clustering: $\alpha$-Equitable $k$-Center [31.936733991407134]
We introduce a novel definition of fairness for clustering problems.
In our model each point $j$ has a set of other points $\mathcal{S}_j$ that it perceives as similar to itself.
We provide efficient and easily implementable approximation algorithms for the $k$-center objective.
arXiv Detail & Related papers (2021-06-09T22:52:00Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to that of the best distribution in $\mathcal{Q}$.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements in the round complexity over previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
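As a complement to the group-fairness constraint mentioned in the ``Relax and Merge'' summary above, the sketch below checks whether every cluster's share of each demographic group lies within assumed per-group bounds. The bounds `beta` and `alpha` and all names here are illustrative; the cited papers define their own exact constraints.

```python
# Hedged sketch of a group-fairness check: each cluster's share of points from every
# demographic group must lie within assumed per-group bounds [beta[g], alpha[g]].
from collections import Counter

def is_group_fair(labels, groups, beta, alpha):
    """labels[i]: cluster of point i; groups[i]: demographic group of point i."""
    for c in set(labels):
        members = [g for lab, g in zip(labels, groups) if lab == c]
        counts = Counter(members)
        for g in set(groups):
            share = counts.get(g, 0) / len(members)
            if not (beta.get(g, 0.0) <= share <= alpha.get(g, 1.0)):
                return False
    return True

# Example: two clusters, two groups, each group's share required to lie in [0.3, 0.7].
labels = [0, 0, 0, 1, 1, 1]
groups = ["a", "b", "a", "b", "a", "b"]
print(is_group_fair(labels, groups, beta={"a": 0.3, "b": 0.3}, alpha={"a": 0.7, "b": 0.7}))
```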