Related papers: Approximating Fair Clustering with Cascaded Norm Objectives

Approximating Fair Clustering with Cascaded Norm Objectives

URL: http://arxiv.org/abs/2111.04804v1
Date: Mon, 8 Nov 2021 20:18:10 GMT
Title: Approximating Fair Clustering with Cascaded Norm Objectives
Authors: Eden Chlamt\'a\v{c}, Yury Makarychev, Ali Vakilian
Abstract summary: We find a clustering which minimizes the $ell_q$-norm of the vector over $W$ of the $ell_p$-norms of the weighted distances of points in $P$ from the centers. This generalizes various clustering problems, including Socially Fair $k$-Median and $k$-Means.
Score: 10.69111036810888
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce the $(p,q)$-Fair Clustering problem. In this problem, we are given a set of points $P$ and a collection of different weight functions $W$. We would like to find a clustering which minimizes the $\ell_q$-norm of the vector over $W$ of the $\ell_p$-norms of the weighted distances of points in $P$ from the centers. This generalizes various clustering problems, including Socially Fair $k$-Median and $k$-Means, and is closely connected to other problems such as Densest $k$-Subgraph and Min $k$-Union. We utilize convex programming techniques to approximate the $(p,q)$-Fair Clustering problem for different values of $p$ and $q$. When $p\geq q$, we get an $O(k^{(p-q)/(2pq)})$, which nearly matches a $k^{\Omega((p-q)/(pq))}$ lower bound based on conjectured hardness of Min $k$-Union and other problems. When $q\geq p$, we get an approximation which is independent of the size of the input for bounded $p,q$, and also matches the recent $O((\log n/(\log\log n))^{1/p})$-approximation for $(p, \infty)$-Fair Clustering by Makarychev and Vakilian (COLT 2021).

Related papers

On Approximability of $\ell_2^2$ Min-Sum Clustering [17.403803548692498]
It is not known whether it is NP-hard to approximate $ell2$ min-sum $k$-clustering beyond a certain factor. We show that it is NP-hard to approximate the objective to a factor better than $1.056$. We then complement our hardness result by giving a nearly linear time parameterized PTAS for $ell2$ min-sum $k$-clustering running in time.
arXiv Detail & Related papers (2024-12-04T14:03:27Z)
Clustering to Minimize Cluster-Aware Norm Objectives [0.3481985817302898]
We seek to partition a given set $P$ of data points into $k$ clusters by finding a set $X$ of $k$ centers. The cost of a cluster, represented by a center $xin X$, is a monotone, symmetric norm $f$ (inner norm) of the vector of distances of points assigned to $x$. The goal is to minimize a norm $g$ (outer norm) of the vector of cluster costs.
arXiv Detail & Related papers (2024-10-31T16:33:40Z)
The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $Omega(frac1kcdot n2log|mathbbF|)$. As an application, we obtain an $Omega(frac1kcdot n2log|mathbbF|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z)
A Polynomial-Time Approximation for Pairwise Fair $k$-Median Clustering [10.697784653113095]
We study pairwise fair clustering with $ell ge 2$ groups, where for every cluster $C$ and every group $i in [ell]$, the number of points in $C$ from group $i$ must be at most $t times the number of points in $C$ from any other group $j in [ell]$. We show that our problem even when $ell=2$ is almost as hard as the popular uniform capacitated $k$-median, for which no-time algorithm with an approximation factor of $o
arXiv Detail & Related papers (2024-05-16T18:17:44Z)
A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al. A clustering is then called individually fair if it has centers within distance $delta(x)$ of $x$ for each $xin P$. We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
arXiv Detail & Related papers (2024-02-09T19:01:48Z)
Do you know what q-means? [50.045011844765185]
Clustering is one of the most important tools for analysis of large datasets. We present an improved version of the "$q$-means" algorithm for clustering. We also present a "dequantized" algorithm for $varepsilon which runs in $Obig(frack2varepsilon2(sqrtkd + log(Nd))big.
arXiv Detail & Related papers (2023-08-18T17:52:12Z)
$\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $ell_p$-regression problem in the coordinator model. For $p = 2$, i.e., least squares regression, we give the first optimal bound of $tildeTheta(sd2 + sd/epsilon)$ bits. For $p in (1,2)$,we obtain an $tildeO(sd2/epsilon + sd/mathrmpoly(epsilon)$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z)
Approximation Algorithms for Fair Range Clustering [14.380145034918158]
This paper studies the fair range clustering problem in which the data points are from different demographic groups. The goal is to pick $k$ centers with the minimum clustering cost such that each group is at least minimally represented in the centers set. In particular, the fair range $ell_p$-clustering captures fair range $k$-center, $k$-median and $k$-means as its special cases.
arXiv Detail & Related papers (2023-06-11T21:18:40Z)
Fair Representation Clustering with Several Protected Classes [13.53362222844008]
We study the problem of fair $k$-median where each cluster is required to have a fair representation of individuals from different groups. We present an $O(log k)$-approximation algorithm that runs in time $nO(ell)$.
arXiv Detail & Related papers (2022-02-03T03:45:45Z)
Local Correlation Clustering with Asymmetric Classification Errors [12.277755088736864]
In the Correlation Clustering problem, we are given a complete weighted graph $G$ with its edges labeled as "similar" and "dissimilar" For a clustering $mathcalC$ of graph $G$, a similar edge is in disagreement with $mathcalC$, if its endpoints belong to distinct clusters; and a dissimilar edge is in disagreement with $mathcalC$ if its endpoints belong to the same cluster. We produce a clustering that minimizes the $ell_p$ norm of the disagreements vector for $pgeq 1$
arXiv Detail & Related papers (2021-08-11T12:31:48Z)
Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $KsubsetmathbbRdtimes n$, given access to $AinmathbbRdtimes n$. We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
arXiv Detail & Related papers (2021-05-17T16:40:48Z)
An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of a given order $ellsqrtbinomdell (1+log n)ell-1,$ sum to at most $cellsqrtbinomdell (1+log n)ell-1,$ where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z)
Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $mathcalQ$ whose total variation distance to $p$ is comparable to the best such distribution. We show that the constraint of local differential privacy incurs an exponential increase in cost. Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)
Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity [94.37110094442136]
We study the problem of agnostic $Q$-learning with function approximation in deterministic systems. We show that if $delta = Oleft(rho/sqrtdim_Eright)$, then one can find the optimal policy using $Oleft(dim_Eright)$.
arXiv Detail & Related papers (2020-02-17T18:41:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.