Improved Approximation Algorithms for Individually Fair Clustering
- URL: http://arxiv.org/abs/2106.14043v1
- Date: Sat, 26 Jun 2021 15:22:52 GMT
- Title: Improved Approximation Algorithms for Individually Fair Clustering
- Authors: Ali Vakilian, Mustafa Yal\c{c}{\i}ner
- Abstract summary: We present an improved $(16p +varepsilon,3)$-bicriteria approximation for the fair $k$-clustering with $ell_p$-norm cost.
Our approach suggests a reduction from our individually fair clustering to a clustering with a group fairness requirement proposed by Kleindessner et al.
- Score: 9.914246432182873
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the $k$-clustering problem with $\ell_p$-norm cost, which
includes $k$-median, $k$-means and $k$-center cost functions, under an
individual notion of fairness proposed by Jung et al. [2020]: given a set of
points $P$ of size $n$, a set of $k$ centers induces a fair clustering if for
every point $v\in P$, $v$ can find a center among its $n/k$ closest neighbors.
Recently, Mahabadi and Vakilian [2020] showed how to get a
$(p^{O(p)},7)$-bicriteria approximation for the problem of fair $k$-clustering
with $\ell_p$-norm cost: every point finds a center within distance at most $7$
times its distance to its $(n/k)$-th closest neighbor and the $\ell_p$-norm
cost of the solution is at most $p^{O(p)}$ times the cost of an optimal fair
solution. In this work, for any $\varepsilon>0$, we present an improved $(16^p
+\varepsilon,3)$-bicriteria approximation for the fair $k$-clustering with
$\ell_p$-norm cost. To achieve our guarantees, we extend the framework of
[Charikar et al., 2002, Swamy, 2016] and devise a $16^p$-approximation
algorithm for the facility location with $\ell_p$-norm cost under matroid
constraint which might be of an independent interest. Besides, our approach
suggests a reduction from our individually fair clustering to a clustering with
a group fairness requirement proposed by Kleindessner et al. [2019], which is
essentially the median matroid problem [Krishnaswamy et al., 2011].
Related papers
- Relax and Merge: A Simple Yet Effective Framework for Solving Fair $k$-Means and $k$-sparse Wasserstein Barycenter Problems [8.74967598360817]
Given a dataset comprising several groups, the fairness constraint requires that each cluster should contain a proportion of points from each group.
We propose a novel Relax and Merge'' framework, where $rho$ is the approximate ratio of an off-the-shelf vanilla $k$-means algorithm.
If equipped with a PTAS of $k$-means, our solution can achieve an approximation ratio of $(5+O(epsilon))$ with only a slight violation of the fairness constraints.
arXiv Detail & Related papers (2024-11-02T02:50:12Z) - A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.
A clustering is then called individually fair if it has centers within distance $delta(x)$ of $x$ for each $xin P$.
We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
arXiv Detail & Related papers (2024-02-09T19:01:48Z) - Do you know what q-means? [50.045011844765185]
Clustering is one of the most important tools for analysis of large datasets.
We present an improved version of the "$q$-means" algorithm for clustering.
We also present a "dequantized" algorithm for $varepsilon which runs in $Obig(frack2varepsilon2(sqrtkd + log(Nd))big.
arXiv Detail & Related papers (2023-08-18T17:52:12Z) - Approximation Algorithms for Fair Range Clustering [14.380145034918158]
This paper studies the fair range clustering problem in which the data points are from different demographic groups.
The goal is to pick $k$ centers with the minimum clustering cost such that each group is at least minimally represented in the centers set.
In particular, the fair range $ell_p$-clustering captures fair range $k$-center, $k$-median and $k$-means as its special cases.
arXiv Detail & Related papers (2023-06-11T21:18:40Z) - Mind the gap: Achieving a super-Grover quantum speedup by jumping to the
end [114.3957763744719]
We present a quantum algorithm that has rigorous runtime guarantees for several families of binary optimization problems.
We show that the algorithm finds the optimal solution in time $O*(2(0.5-c)n)$ for an $n$-independent constant $c$.
We also show that for a large fraction of random instances from the $k$-spin model and for any fully satisfiable or slightly frustrated $k$-CSP formula, statement (a) is the case.
arXiv Detail & Related papers (2022-12-03T02:45:23Z) - Scalable Differentially Private Clustering via Hierarchically Separated
Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d3/2log n)cdot OPT + O(k d2 log2 n / epsilon2)$, where $epsilon$ is the privacy guarantee.
Although the worst-case guarantee is worse than that of state of the art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z) - Approximating Fair Clustering with Cascaded Norm Objectives [10.69111036810888]
We find a clustering which minimizes the $ell_q$-norm of the vector over $W$ of the $ell_p$-norms of the weighted distances of points in $P$ from the centers.
This generalizes various clustering problems, including Socially Fair $k$-Median and $k$-Means.
arXiv Detail & Related papers (2021-11-08T20:18:10Z) - Better Algorithms for Individually Fair $k$-Clustering [4.936817539835045]
Jung, Kannan, and Lutz [FORC 2020] introduced a clustering algorithm with provable (approximate) fairness and objective guarantees for the $ell_infty$ or $k$-Center objective. Mahabadi and Vakilian [ICML 2020] revisited this problem to give a local-search algorithm for all $ell_p$-norms.
We prove that by modifying known LP rounding techniques, one gets a worst-case guarantee on the objective which is much better than in MV20, and empirically, this objective is extremely close to the optimal.
arXiv Detail & Related papers (2021-06-23T04:16:46Z) - Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean
Estimation [58.24280149662003]
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset.
We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees.
arXiv Detail & Related papers (2021-06-16T03:34:14Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $mathcalQ$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z) - Individual Fairness for $k$-Clustering [16.988236398003068]
Local search based algorithm for $k$-median and $k$-means.
We show how to get a bicriteria approximation for fair $k$-clustering.
arXiv Detail & Related papers (2020-02-17T02:31:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.