A New Notion of Individually Fair Clustering: $\alpha$-Equitable
$k$-Center
- URL: http://arxiv.org/abs/2106.05423v1
- Date: Wed, 9 Jun 2021 22:52:00 GMT
- Title: A New Notion of Individually Fair Clustering: $\alpha$-Equitable
$k$-Center
- Authors: Darshan Chakrabarti, John P. Dickerson, Seyed A. Esmaeili, Aravind
Srinivasan, Leonidas Tsepenekas
- Abstract summary: We introduce a novel definition of fairness for clustering problems.
In our model each point $j$ has a set of other points $\mathcal{S}_j$ that it perceives as similar to itself.
We provide efficient and easily implementable approximation algorithms for the $k$-center objective.
- Score: 31.936733991407134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clustering is a fundamental problem in unsupervised machine learning, and
fair variants of it have recently received significant attention. In this work
we introduce a novel definition of fairness for clustering problems.
Specifically, in our model each point $j$ has a set of other points
$\mathcal{S}_j$ that it perceives as similar to itself, and it feels that it is
fairly treated if the quality of service it receives in the solution is
$\alpha$-close to that of the points in $\mathcal{S}_j$. We begin our study by
answering questions regarding the structure of the problem, namely for what
values of $\alpha$ the problem is well-defined, and what the behavior of the
Price of Fairness (PoF) for it is. For the well-defined region of $\alpha$, we
provide efficient and easily implementable approximation algorithms for the
$k$-center objective, which in certain cases also enjoy bounded PoF guarantees.
We finally complement our analysis with an extensive suite of experiments that
validates the effectiveness of our theoretical results.
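The fairness condition above can be made concrete with a small sketch. Under the natural reading of the abstract, point $j$'s service cost is its distance to the nearest open center, and $j$ is treated fairly if its cost is at most $\alpha$ times the cost of every point it perceives as similar. The function names and the exact inequality below are illustrative assumptions, not the paper's code:

```python
import math

def alpha_fairness_violations(points, centers, similar_sets, alpha):
    """Report pairs (j, i) where point j is treated alpha-unfairly
    relative to a point i in its similarity set S_j.

    points: list of (x, y) coordinates
    centers: indices into `points` chosen as open centers
    similar_sets: similar_sets[j] is the set of indices j perceives as similar
    alpha: fairness parameter, alpha >= 1

    Assumed condition (a sketch of the abstract's definition): j is fairly
    treated if cost(j) <= alpha * cost(i) for every i in similar_sets[j],
    where cost(p) is p's distance to its nearest center.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Service cost of each point: distance to the nearest open center.
    cost = [min(dist(p, points[c]) for c in centers) for p in points]

    violations = []
    for j, sim in enumerate(similar_sets):
        for i in sim:
            if cost[j] > alpha * cost[i]:
                violations.append((j, i))
    return violations
```

For example, with points at 0, 1, and 10 on a line, a single center at the first point, and mutual similarity between the first two points, the second point's cost (1) exceeds $\alpha$ times the first point's cost (0) for any finite $\alpha$, so it is flagged as unfairly treated.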
Related papers
- Relax and Merge: A Simple Yet Effective Framework for Solving Fair $k$-Means and $k$-sparse Wasserstein Barycenter Problems [8.74967598360817]
Given a dataset comprising several groups, the fairness constraint requires that each cluster should contain a proportion of points from each group.
We propose a novel "Relax and Merge" framework, where $\rho$ is the approximation ratio of an off-the-shelf vanilla $k$-means algorithm.
If equipped with a PTAS for $k$-means, our solution can achieve an approximation ratio of $(5+O(\epsilon))$ with only a slight violation of the fairness constraints.
arXiv Detail & Related papers (2024-11-02T02:50:12Z) - Optimal level set estimation for non-parametric tournament and crowdsourcing problems [49.75262185577198]
Motivated by crowdsourcing, we consider a problem where we partially observe the correctness of the answers of $n$ experts on $d$ questions.
In this paper, we assume that the matrix $M$ containing the probability that expert $i$ answers correctly to question $j$ is bi-isotonic up to a permutation of its rows and columns.
We construct an efficient-time algorithm that turns out to be minimax optimal for this classification problem.
arXiv Detail & Related papers (2024-08-27T18:28:31Z) - A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.
A clustering is then called individually fair if it has centers within distance $\delta(x)$ of $x$ for each $x \in P$.
We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
arXiv Detail & Related papers (2024-02-09T19:01:48Z) - Improved Learning-augmented Algorithms for k-means and k-medians
Clustering [8.04779839951237]
We consider the problem of clustering in the learning-augmented setting, where we are given a data set in $d$-dimensional Euclidean space.
We propose a deterministic $k$-means algorithm that produces centers with improved bound on clustering cost.
Our algorithm works even when the predictions are not very accurate, i.e. our bound holds for $\alpha$ up to $1/2$, an improvement over $\alpha$ being at most $1/7$ in the previous work.
arXiv Detail & Related papers (2022-10-31T03:00:11Z) - Sharper Rates and Flexible Framework for Nonconvex SGD with Client and
Data Sampling [64.31011847952006]
We revisit the problem of finding an approximately stationary point of the average of $n$ smooth and possibly nonconvex functions.
We generalize the proposed framework so that it can provably work with virtually any sampling mechanism.
We provide the most general and most accurate analysis of the optimal bound in the smooth nonconvex regime.
arXiv Detail & Related papers (2022-06-05T21:32:33Z) - Instance-optimality in optimal value estimation: Adaptivity via
variance-reduced Q-learning [99.34907092347733]
We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions.
Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy of any estimation procedure.
In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning.
arXiv Detail & Related papers (2021-06-28T00:38:54Z) - Improved Approximation Algorithms for Individually Fair Clustering [9.914246432182873]
We present an improved $(16^p + \varepsilon, 3)$-bicriteria approximation for the fair $k$-clustering with $\ell_p$-norm cost.
Our approach suggests a reduction from our individually fair clustering to a clustering with a group fairness requirement proposed by Kleindessner et al.
arXiv Detail & Related papers (2021-06-26T15:22:52Z) - Better Algorithms for Individually Fair $k$-Clustering [4.936817539835045]
Jung, Kannan, and Lutz [FORC 2020] introduced a clustering algorithm with provable (approximate) fairness and objective guarantees for the $\ell_\infty$ or $k$-center objective. Mahabadi and Vakilian [ICML 2020] revisited this problem to give a local-search algorithm for all $\ell_p$-norms.
We prove that by modifying known LP rounding techniques, one gets a worst-case guarantee on the objective which is much better than in MV20, and empirically, this objective is extremely close to the optimal.
arXiv Detail & Related papers (2021-06-23T04:16:46Z) - Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy or soft objective is a popular generalization of the well-known $k$-means problem.
We show that by making few queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z) - Individual Fairness for $k$-Clustering [16.988236398003068]
We give a local-search based algorithm for $k$-median and $k$-means.
We show how to get a bicriteria approximation for fair $k$-clustering.
arXiv Detail & Related papers (2020-02-17T02:31:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.