Fair Clustering: Critique, Caveats, and Future Directions
- URL: http://arxiv.org/abs/2406.15960v1
- Date: Sat, 22 Jun 2024 23:34:53 GMT
- Title: Fair Clustering: Critique, Caveats, and Future Directions
- Authors: John Dickerson, Seyed A. Esmaeili, Jamie Morgenstern, Claire Jie Zhang,
- Abstract summary: Clustering is a fundamental problem in machine learning and operations research.
We take a critical view of fair clustering, identifying a collection of ignored issues.
- Score: 11.077625489695922
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clustering is a fundamental problem in machine learning and operations research. Therefore, given the fact that fairness considerations have become of paramount importance in algorithm design, fairness in clustering has received significant attention from the research community. The literature on fair clustering has resulted in a collection of interesting fairness notions and elaborate algorithms. In this paper, we take a critical view of fair clustering, identifying a collection of ignored issues such as the lack of a clear utility characterization and the difficulty in accounting for the downstream effects of a fair clustering algorithm in machine learning settings. In some cases, we demonstrate examples where the application of a fair clustering algorithm can have significant negative impacts on social welfare. We end by identifying a collection of steps that would lead towards more impactful research in fair clustering.
Related papers
- Fair Correlation Clustering in Forests [8.810926150873991]
A clustering is said to be fair, if each cluster has the same distribution of manifestations of a sensitive attribute as the whole input set.
This is motivated by various applications where the objects to be clustered have sensitive attributes that should not be over- or underrepresented.
We consider restricted graph classes which allow us to characterize the distributions of sensitive attributes for which this form of fairness is tractable.
arXiv Detail & Related papers (2023-02-22T11:27:06Z) - Cluster-level Group Representativity Fairness in $k$-means Clustering [3.420467786581458]
Clustering algorithms could generate clusters such that different groups are disadvantaged within different clusters.
We develop a clustering algorithm, building upon the centroid clustering paradigm pioneered by classical algorithms.
We show that our method is effective in enhancing cluster-level group representativity fairness significantly at low impact on cluster coherence.
arXiv Detail & Related papers (2022-12-29T22:02:28Z) - Differentially-Private Clustering of Easy Instances [67.04951703461657]
In differentially private clustering, the goal is to identify $k$ cluster centers without disclosing information on individual data points.
We provide implementable differentially private clustering algorithms that provide utility when the data is "easy"
We propose a framework that allows us to apply non-private clustering algorithms to the easy instances and privately combine the results.
arXiv Detail & Related papers (2021-12-29T08:13:56Z) - Fairness Degrading Adversarial Attacks Against Clustering Algorithms [35.40427659749882]
We propose a fairness degrading attack algorithm for k-median clustering.
We find that the addition of the generated adversarial samples can lead to significantly lower fairness values.
arXiv Detail & Related papers (2021-10-22T19:10:27Z) - You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z) - MultiFair: Multi-Group Fairness in Machine Learning [52.24956510371455]
We study multi-group fairness in machine learning (MultiFair)
We propose a generic end-to-end algorithmic framework to solve it.
Our proposed framework is generalizable to many different settings.
arXiv Detail & Related papers (2021-05-24T02:30:22Z) - Graph Contrastive Clustering [131.67881457114316]
We propose a novel graph contrastive learning framework, which is then applied to the clustering task and we come up with the Graph Constrastive Clustering(GCC) method.
Specifically, on the one hand, the graph Laplacian based contrastive loss is proposed to learn more discriminative and clustering-friendly features.
On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.
arXiv Detail & Related papers (2021-04-03T15:32:49Z) - Learning to Generate Fair Clusters from Demonstrations [27.423983748614198]
We show how to identify the intended fairness constraint for a problem based on limited demonstrations from an expert.
We present an algorithm to identify the fairness metric from demonstrations and generate clusters using existing off-the-shelf clustering techniques.
We investigate how to generate interpretable solutions using our approach.
arXiv Detail & Related papers (2021-02-08T03:09:33Z) - Through the Data Management Lens: Experimental Analysis and Evaluation
of Fair Classification [75.49600684537117]
Data management research is showing an increasing presence and interest in topics related to data and algorithmic fairness.
We contribute a broad analysis of 13 fair classification approaches and additional variants, over their correctness, fairness, efficiency, scalability, and stability.
Our analysis highlights novel insights on the impact of different metrics and high-level approach characteristics on different aspects of performance.
arXiv Detail & Related papers (2021-01-18T22:55:40Z) - Whither Fair Clustering? [3.4925763160992402]
We argue that the state-of-the-art in fair clustering has been quite parochial in outlook.
We argue that widening the normative principles to target for, characterizing shortfalls where the target cannot be achieved fully, and making use of knowledge of downstream processes can significantly widen the scope of research in fair clustering research.
arXiv Detail & Related papers (2020-07-08T19:41:25Z) - Fair Hierarchical Clustering [92.03780518164108]
We define a notion of fairness that mitigates over-representation in traditional clustering.
We show that our algorithms can find a fair hierarchical clustering, with only a negligible loss in the objective.
arXiv Detail & Related papers (2020-06-18T01:05:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.