Constant-Factor Approximation Algorithms for Socially Fair
$k$-Clustering
- URL: http://arxiv.org/abs/2206.11210v1
- Date: Wed, 22 Jun 2022 16:57:17 GMT
- Title: Constant-Factor Approximation Algorithms for Socially Fair
$k$-Clustering
- Authors: Mehrdad Ghadiri, Mohit Singh, Santosh S. Vempala
- Abstract summary: We study approximation algorithms for the socially fair $(\ell_p, k)$-clustering problem with $m$ groups.
We present (1) a polynomial-time $(5+2\sqrt{6})^p$-approximation with at most $k+m$ centers, (2) a $(5+2\sqrt{6}+\epsilon)^p$-approximation with $k$ centers in time $n^{2^{O(p)}\cdot m^2}$, and (3) a $(15+6\sqrt{6})^p$-approximation with $k$ centers in time $k^m\cdot\text{poly}(n)$.
- Score: 12.545909976356528
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study approximation algorithms for the socially fair $(\ell_p,
k)$-clustering problem with $m$ groups, whose special cases include the
socially fair $k$-median ($p=1$) and socially fair $k$-means ($p=2$) problems.
We present (1) a polynomial-time $(5+2\sqrt{6})^p$-approximation with at most
$k+m$ centers, (2) a $(5+2\sqrt{6}+\epsilon)^p$-approximation with $k$ centers
in time $n^{2^{O(p)}\cdot m^2}$, and (3) a $(15+6\sqrt{6})^p$-approximation
with $k$ centers in time $k^{m}\cdot\text{poly}(n)$. The first result is
obtained via a refinement of the iterative rounding method using a sequence of
linear programs. The latter two results are obtained by converting a solution
with up to $k+m$ centers to one with $k$ centers using sparsification methods
for (2) and via an exhaustive search for (3). We also compare the performance
of our algorithms with existing bicriteria algorithms, as well as approximation
algorithms that use exactly $k$ centers, on benchmark datasets, and find that our
algorithms also outperform existing methods in practice.
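To make the objective behind these guarantees concrete, here is a small illustrative sketch (not code from the paper) that evaluates the socially fair $(\ell_p, k)$-clustering cost of a candidate center set as the worst group-wise $\ell_p$ cost; whether each group's cost is normalized by its size is a modeling choice, left as a flag below.

```python
import numpy as np

def socially_fair_cost(X, groups, centers, p=2, normalize=True):
    """Worst group-wise cost of a candidate center set.

    X         : (n, d) array of points
    groups    : length-n array of group labels (m distinct values)
    centers   : (k, d) array of candidate centers
    p         : exponent of the (l_p, k)-clustering objective (p=1 median, p=2 means)
    normalize : divide each group's cost by its size (assumption: whether the
                objective is size-normalized is a modeling choice, not fixed here)
    """
    # distance from every point to its nearest center, raised to the p-th power
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    point_cost = dists.min(axis=1) ** p

    group_costs = []
    for g in np.unique(groups):
        mask = (groups == g)
        cost = point_cost[mask].sum()
        if normalize:
            cost /= mask.sum()
        group_costs.append(cost)
    # the socially fair objective is the maximum cost over the m groups
    return max(group_costs)
```

An approximation guarantee such as $(5+2\sqrt{6})^p$ bounds this quantity for the returned centers against its optimum over all sets of $k$ centers (or $k+m$ centers in the bicriteria case).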
Related papers
- A Polynomial-Time Approximation for Pairwise Fair $k$-Median Clustering [10.697784653113095]
We study pairwise fair clustering with $\ell \ge 2$ groups, where for every cluster $C$ and every group $i \in [\ell]$, the number of points in $C$ from group $i$ must be at most $t$ times the number of points in $C$ from any other group $j \in [\ell]$.
We show that our problem, even when $\ell=2$, is almost as hard as the popular uniform capacitated $k$-median, for which no polynomial-time algorithm with an approximation factor of $o(\dots)$ is known.
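As a concrete reading of the constraint in this summary, a hypothetical checker (the representation of clusters and groups is chosen here only for illustration) would verify that, in every cluster, no group outnumbers any other group by more than a factor $t$:

```python
from collections import Counter

def is_pairwise_fair(cluster_labels, group_labels, t):
    """Check the pairwise fairness constraint: within every cluster C and for
    every pair of groups (i, j), the number of points of group i in C is at
    most t times the number of points of group j in C."""
    groups = set(group_labels)
    for c in set(cluster_labels):
        counts = Counter(g for cl, g in zip(cluster_labels, group_labels) if cl == c)
        for i in groups:
            for j in groups:
                if i != j and counts.get(i, 0) > t * counts.get(j, 0):
                    return False
    return True
```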
arXiv Detail & Related papers (2024-05-16T18:17:44Z)
- A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.
A clustering is then called individually fair if it has a center within distance $\delta(x)$ of $x$ for each $x \in P$.
We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
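The fairness condition stated above is easy to verify once the per-point radii $\delta(x)$ are available; the sketch below takes them as given input (in the Jung et al. formulation, $\delta(x)$ is, as far as we understand, the radius of the smallest ball around $x$ containing about $n/k$ points, but the check itself does not depend on that choice):

```python
import numpy as np

def is_individually_fair(X, centers, delta):
    """Return True if every point x in X has some center within distance delta(x).

    X       : (n, d) array of points
    centers : (k, d) array of centers
    delta   : length-n array of per-point radii (assumed precomputed)
    """
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return bool(np.all(dists.min(axis=1) <= delta))
```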
arXiv Detail & Related papers (2024-02-09T19:01:48Z)
- Diversity-aware clustering: Computational Complexity and Approximation
Algorithms [19.67390261007849]
We study diversity-aware clustering problems where the data points are associated with multiple attributes, resulting in intersecting groups.
A clustering solution needs to ensure that a minimum number of cluster centers are chosen from each group.
We present parameterized approximation algorithms with approximation ratios $1+\frac{2}{e}$, $1+\frac{8}{e}$, and $3$.
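A minimal feasibility check for the requirement that each group contributes at least a prescribed number of the chosen centers could look like the following; since groups may intersect, one center can count toward several groups (the data layout here is only an illustrative assumption):

```python
def meets_diversity_requirements(chosen_centers, group_members, lower_bounds):
    """Check that, for each group g, at least lower_bounds[g] of the chosen
    centers belong to group g.

    chosen_centers : iterable of selected center (point) ids
    group_members  : dict mapping group id -> set of point ids in that group
    lower_bounds   : dict mapping group id -> required minimum number of centers
    """
    chosen = set(chosen_centers)
    for g, required in lower_bounds.items():
        if len(chosen & group_members.get(g, set())) < required:
            return False
    return True
```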
arXiv Detail & Related papers (2024-01-10T19:01:05Z)
- Do you know what q-means? [50.045011844765185]
Clustering is one of the most important tools for analysis of large datasets.
We present an improved version of the "$q$-means" algorithm for clustering.
We also present a "dequantized" algorithm for $varepsilon which runs in $Obig(frack2varepsilon2(sqrtkd + log(Nd))big.
arXiv Detail & Related papers (2023-08-18T17:52:12Z)
- Near-Optimal Quantum Coreset Construction Algorithms for Clustering [15.513270929560088]
We give quantum algorithms that find coresets for $k$-clustering in $\mathbb{R}^d$ with $\tilde{O}(\sqrt{nk}d^{3/2})$ query complexity.
Our coreset reduces the input size from $n$ to $\mathrm{poly}(k\epsilon^{-1}d)$, so that existing $\alpha$-approximation algorithms for clustering can run on top of it.
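The point that an existing $\alpha$-approximation can be run on the coreset rather than the full input can be illustrated classically; the sketch below assumes a weighted coreset is already in hand and simply feeds it to scikit-learn's weighted k-means (the quantum coreset construction itself is not reproduced):

```python
from sklearn.cluster import KMeans

def cluster_on_coreset(coreset_points, coreset_weights, k):
    """Run an off-the-shelf k-means heuristic on a (points, weights) coreset
    instead of the full n-point dataset."""
    km = KMeans(n_clusters=k, n_init=10)
    km.fit(coreset_points, sample_weight=coreset_weights)
    return km.cluster_centers_
```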
arXiv Detail & Related papers (2023-06-05T12:22:46Z)
- Replicable Clustering [57.19013971737493]
We propose algorithms for the statistical $k$-medians, statistical $k$-means, and statistical $k$-centers problems by utilizing approximation routines for their counterparts in a black-box manner.
We also provide experiments on synthetic distributions in 2D, using the $k$-means++ implementation from sklearn as a black box, which validate our theoretical results.
arXiv Detail & Related papers (2023-02-20T23:29:43Z)
- TURF: A Two-factor, Universal, Robust, Fast Distribution Learning
Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z)
- Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean
Estimation [58.24280149662003]
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset.
We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees.
arXiv Detail & Related papers (2021-06-16T03:34:14Z)
- Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.