Approximation Algorithms for Socially Fair Clustering
- URL: http://arxiv.org/abs/2103.02512v1
- Date: Wed, 3 Mar 2021 16:36:21 GMT
- Title: Approximation Algorithms for Socially Fair Clustering
- Authors: Yury Makarychev and Ali Vakilian
- Abstract summary: We present an $(eO(p) fraclog ellloglogell)$-approximation algorithm for socially fair clustering with the $ell_p$-objective.
- Score: 12.63013063147516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an $(e^{O(p)} \frac{\log \ell}{\log\log\ell})$-approximation
algorithm for socially fair clustering with the $\ell_p$-objective. In this
problem, we are given a set of points in a metric space. Each point belongs to
one (or several) of $\ell$ groups. The goal is to find a $k$-medians,
$k$-means, or, more generally, $\ell_p$-clustering that is simultaneously good
for all of the groups. More precisely, we need to find a set of $k$ centers $C$
so as to minimize the maximum over all groups $j$ of $\sum_{u \text{ in group
}j} d(u,C)^p$. The socially fair clustering problem was independently proposed
by Abbasi, Bhaskara, and Venkatasubramanian [2021] and Ghadiri, Samadi, and
Vempala [2021]. Our algorithm improves and generalizes their
$O(\ell)$-approximation algorithms for the problem. The natural LP relaxation
for the problem has an integrality gap of $\Omega(\ell)$. In order to obtain
our result, we introduce a strengthened LP relaxation and show that it has an
integrality gap of $\Theta(\frac{\log \ell}{\log\log\ell})$ for a fixed $p$.
Additionally, we present a bicriteria approximation algorithm, which
generalizes the bicriteria approximation of Abbasi et al. [2021].
Related papers
- A Polynomial-Time Approximation for Pairwise Fair $k$-Median Clustering [10.697784653113095]
We study pairwise fair clustering with $ell ge 2$ groups, where for every cluster $C$ and every group $i in [ell]$, the number of points in $C$ from group $i$ must be at most $t times the number of points in $C$ from any other group $j in [ell]$.
We show that our problem even when $ell=2$ is almost as hard as the popular uniform capacitated $k$-median, for which no-time algorithm with an approximation factor of $o
arXiv Detail & Related papers (2024-05-16T18:17:44Z) - A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.
A clustering is then called individually fair if it has centers within distance $delta(x)$ of $x$ for each $xin P$.
We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
arXiv Detail & Related papers (2024-02-09T19:01:48Z) - Do you know what q-means? [50.045011844765185]
Clustering is one of the most important tools for analysis of large datasets.
We present an improved version of the "$q$-means" algorithm for clustering.
We also present a "dequantized" algorithm for $varepsilon which runs in $Obig(frack2varepsilon2(sqrtkd + log(Nd))big.
arXiv Detail & Related papers (2023-08-18T17:52:12Z) - Constant-Factor Approximation Algorithms for Socially Fair
$k$-Clustering [12.545909976356528]
We study approximation algorithms for the socially fair $(ell_p, k)$-clustering problem $m$ groups.
We present (1) a-time $(5+2sqrt6)pimation at most $k+m$ centers (2) a $(5+2sqrt6)pimation with $k$ centers in time $n2O(p)cdot m2$, and (3) a $(15+6sqrt6)p$ approximation with $k$ centers in
arXiv Detail & Related papers (2022-06-22T16:57:17Z) - Approximating Fair Clustering with Cascaded Norm Objectives [10.69111036810888]
We find a clustering which minimizes the $ell_q$-norm of the vector over $W$ of the $ell_p$-norms of the weighted distances of points in $P$ from the centers.
This generalizes various clustering problems, including Socially Fair $k$-Median and $k$-Means.
arXiv Detail & Related papers (2021-11-08T20:18:10Z) - Near-optimal Algorithms for Explainable k-Medians and k-Means [12.68470213641421]
We consider the problem of explainable $k$-medians and $k$-means introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian(ICML 2020)
Our goal is to find a emphoptimal decision tree that partitions data into $k clusters and minimizes the $k-medians or $k-means objective.
arXiv Detail & Related papers (2021-07-02T02:07:12Z) - Improved Approximation Algorithms for Individually Fair Clustering [9.914246432182873]
We present an improved $(16p +varepsilon,3)$-bicriteria approximation for the fair $k$-clustering with $ell_p$-norm cost.
Our approach suggests a reduction from our individually fair clustering to a clustering with a group fairness requirement proposed by Kleindessner et al.
arXiv Detail & Related papers (2021-06-26T15:22:52Z) - Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $KsubsetmathbbRdtimes n$, given access to $AinmathbbRdtimes n$.
We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
arXiv Detail & Related papers (2021-05-17T16:40:48Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $gamma in (0,1)$, and an approximation threshold $epsilon > 0$, we provide a model-free algorithm to learn an $epsilon$-optimal policy.
For small enough $epsilon$, we show an improved algorithm with sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - Tight Quantum Lower Bound for Approximate Counting with Quantum States [49.6558487240078]
We prove tight lower bounds for the following variant of the counting problem considered by Aaronson, Kothari, Kretschmer, and Thaler ( 2020)
The task is to distinguish whether an input set $xsubseteq [n]$ has size either $k$ or $k'=(1+varepsilon)k$.
arXiv Detail & Related papers (2020-02-17T10:53:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.