Approximation Algorithms for Fair Range Clustering
- URL: http://arxiv.org/abs/2306.06778v2
- Date: Thu, 22 Jun 2023 05:19:54 GMT
- Title: Approximation Algorithms for Fair Range Clustering
- Authors: S\`edjro S. Hotegni and Sepideh Mahabadi and Ali Vakilian
- Abstract summary: This paper studies the fair range clustering problem in which the data points are from different demographic groups.
The goal is to pick $k$ centers with the minimum clustering cost such that each group is at least minimally represented in the centers set.
In particular, the fair range $ell_p$-clustering captures fair range $k$-center, $k$-median and $k$-means as its special cases.
- Score: 14.380145034918158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the fair range clustering problem in which the data points
are from different demographic groups and the goal is to pick $k$ centers with
the minimum clustering cost such that each group is at least minimally
represented in the centers set and no group dominates the centers set. More
precisely, given a set of $n$ points in a metric space $(P,d)$ where each point
belongs to one of the $\ell$ different demographics (i.e., $P = P_1 \uplus P_2
\uplus \cdots \uplus P_\ell$) and a set of $\ell$ intervals $[\alpha_1,
\beta_1], \cdots, [\alpha_\ell, \beta_\ell]$ on desired number of centers from
each group, the goal is to pick a set of $k$ centers $C$ with minimum
$\ell_p$-clustering cost (i.e., $(\sum_{v\in P} d(v,C)^p)^{1/p}$) such that for
each group $i\in \ell$, $|C\cap P_i| \in [\alpha_i, \beta_i]$. In particular,
the fair range $\ell_p$-clustering captures fair range $k$-center, $k$-median
and $k$-means as its special cases. In this work, we provide efficient constant
factor approximation algorithms for fair range $\ell_p$-clustering for all
values of $p\in [1,\infty)$.
Related papers
- LevAttention: Time, Space, and Streaming Efficient Algorithm for Heavy Attentions [54.54897832889028]
We show that for any $K$, there is a universal set" $U subset [n]$ of size independent of $n$, such that for any $Q$ and any row $i$, the large attention scores $A_i,j$ in row $i$ of $A$ all have $jin U$.
We empirically show the benefits of our scheme for vision transformers, showing how to train new models that use our universal set while training as well.
arXiv Detail & Related papers (2024-10-07T19:47:13Z) - A Polynomial-Time Approximation for Pairwise Fair $k$-Median Clustering [10.697784653113095]
We study pairwise fair clustering with $ell ge 2$ groups, where for every cluster $C$ and every group $i in [ell]$, the number of points in $C$ from group $i$ must be at most $t times the number of points in $C$ from any other group $j in [ell]$.
We show that our problem even when $ell=2$ is almost as hard as the popular uniform capacitated $k$-median, for which no-time algorithm with an approximation factor of $o
arXiv Detail & Related papers (2024-05-16T18:17:44Z) - A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.
A clustering is then called individually fair if it has centers within distance $delta(x)$ of $x$ for each $xin P$.
We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
arXiv Detail & Related papers (2024-02-09T19:01:48Z) - A Unified Framework for Gradient-based Clustering of Distributed Data [51.904327888475606]
We develop a family of distributed clustering algorithms that work over networks of users.
DGC-$mathcalF_rho$ is specialized to popular clustering losses like $K$-means and Huber loss.
We show that consensus fixed points of DGC-$mathcalF_rho$ are equivalent to fixed points of gradient clustering over the full data.
arXiv Detail & Related papers (2024-02-02T10:44:42Z) - Fair Representation Clustering with Several Protected Classes [13.53362222844008]
We study the problem of fair $k$-median where each cluster is required to have a fair representation of individuals from different groups.
We present an $O(log k)$-approximation algorithm that runs in time $nO(ell)$.
arXiv Detail & Related papers (2022-02-03T03:45:45Z) - Approximating Fair Clustering with Cascaded Norm Objectives [10.69111036810888]
We find a clustering which minimizes the $ell_q$-norm of the vector over $W$ of the $ell_p$-norms of the weighted distances of points in $P$ from the centers.
This generalizes various clustering problems, including Socially Fair $k$-Median and $k$-Means.
arXiv Detail & Related papers (2021-11-08T20:18:10Z) - FriendlyCore: Practical Differentially Private Aggregation [67.04951703461657]
We propose a simple and practical tool $mathsfFriendlyCore$ that takes a set of points $cal D$ from an unrestricted (pseudo) metric space as input.
When $cal D$ has effective diameter $r$, $mathsfFriendlyCore$ returns a "stable" subset $cal D_Gsubseteq cal D$ that includes all points.
$mathsfFriendlyCore$ can be used to preprocess the input before privately aggregating it, potentially simplifying the aggregation or boosting its accuracy
arXiv Detail & Related papers (2021-10-19T17:43:50Z) - Improved Approximation Algorithms for Individually Fair Clustering [9.914246432182873]
We present an improved $(16p +varepsilon,3)$-bicriteria approximation for the fair $k$-clustering with $ell_p$-norm cost.
Our approach suggests a reduction from our individually fair clustering to a clustering with a group fairness requirement proposed by Kleindessner et al.
arXiv Detail & Related papers (2021-06-26T15:22:52Z) - Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean
Estimation [58.24280149662003]
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset.
We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees.
arXiv Detail & Related papers (2021-06-16T03:34:14Z) - Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $KsubsetmathbbRdtimes n$, given access to $AinmathbbRdtimes n$.
We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
arXiv Detail & Related papers (2021-05-17T16:40:48Z) - Individual Fairness for $k$-Clustering [16.988236398003068]
Local search based algorithm for $k$-median and $k$-means.
We show how to get a bicriteria approximation for fair $k$-clustering.
arXiv Detail & Related papers (2020-02-17T02:31:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.