Related papers: Cluster-level Group Representativity Fairness in $k$-means Clustering

URL: http://arxiv.org/abs/2212.14467v1
Date: Thu, 29 Dec 2022 22:02:28 GMT
Title: Cluster-level Group Representativity Fairness in $k$-means Clustering
Authors: Stanley Simoes, Deepak P, Muiris MacCarthaigh
Abstract summary: Clustering algorithms could generate clusters such that different groups are disadvantaged within different clusters. We develop a clustering algorithm, building upon the centroid clustering paradigm pioneered by classical algorithms. We show that our method is effective in enhancing cluster-level group representativity fairness significantly at low impact on cluster coherence.
Score: 3.420467786581458
License: http://creativecommons.org/licenses/by/4.0/
Abstract: There has been much interest recently in developing fair clustering algorithms that seek to do justice to the representation of groups defined along sensitive attributes such as race and gender. We observe that clustering algorithms could generate clusters such that different groups are disadvantaged within different clusters. We develop a clustering algorithm, building upon the centroid clustering paradigm pioneered by classical algorithms such as $k$-means, where we focus on mitigating the unfairness experienced by the most-disadvantaged group within each cluster. Our method uses an iterative optimisation paradigm whereby an initial cluster assignment is modified by reassigning objects to clusters such that the worst-off sensitive group within each cluster is benefitted. We demonstrate the effectiveness of our method through extensive empirical evaluations over a novel evaluation metric on real-world datasets. Specifically, we show that our method is effective in enhancing cluster-level group representativity fairness significantly at low impact on cluster coherence.
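
The notion of a "worst-off sensitive group within each cluster" can be made concrete with a small sketch. The code below is illustrative only and does not reproduce the paper's objective or its reassignment procedure. It assumes a group's disadvantage in a cluster is the mean squared distance of that group's members to the cluster centroid. The function names `sq_dist`, `centroid`, and `worst_off_group_costs` and the toy data are hypothetical.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def worst_off_group_costs(points, groups, labels):
    """Map each cluster label to (worst_group, cost).

    ASSUMPTION: a group's cost within a cluster is the mean squared
    distance of its members to the cluster centroid; the worst-off
    group is the one with the highest such cost.
    """
    members = defaultdict(list)
    for point, group, label in zip(points, groups, labels):
        members[label].append((point, group))

    result = {}
    for label, items in members.items():
        mu = centroid([point for point, _ in items])
        dists = defaultdict(list)
        for point, group in items:
            dists[group].append(sq_dist(point, mu))
        costs = {g: sum(d) / len(d) for g, d in dists.items()}
        worst = max(costs, key=costs.get)
        result[label] = (worst, costs[worst])
    return result


# Toy data: two clusters, two sensitive groups "a" and "b".
points = [(0, 0), (1, 0), (4, 0), (10, 0), (10, 0), (10, 3)]
groups = ["a", "a", "b", "a", "a", "b"]
labels = [0, 0, 0, 1, 1, 1]
costs = worst_off_group_costs(points, groups, labels)
```

In this toy data, group "b" sits farthest from the centroid in both clusters, so it is the worst-off group in each. A fairness-aware reassignment step in the spirit of the abstract would then try to move objects so that these per-cluster maxima shrink.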

Related papers

Fair Clustering via Alignment [3.5845787949988592]
Algorithmic fairness in clustering aims to balance proportions of instances assigned to each cluster with respect to a given sensitive attribute. We propose a new fair clustering algorithm based on a novel decomposition of the fair $K$-means clustering objective function.
arXiv Detail & Related papers (2025-05-14T04:29:09Z)
From A-to-Z Review of Clustering Validation Indices [4.08908337437878]
We review and evaluate the performance of internal and external clustering validation indices on the most common clustering algorithms. We suggest a classification framework for examining the functionality of both internal and external clustering validation measures.
arXiv Detail & Related papers (2024-07-18T13:52:02Z)
A3S: A General Active Clustering Method with Pairwise Constraints [66.74627463101837]
A3S features strategic active clustering adjustment on the initial cluster result, which is obtained by an adaptive clustering algorithm. In extensive experiments across diverse real-world datasets, A3S achieves desired results with significantly fewer human queries.
arXiv Detail & Related papers (2024-07-14T13:37:03Z)
Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering. In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework. In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels. We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
DivClust: Controlling Diversity in Deep Clustering [47.85350249697335]
DivClust produces consensus clustering solutions that consistently outperform single-clustering baselines. Our method effectively controls diversity across frameworks and datasets with very small additional computational cost.
arXiv Detail & Related papers (2023-04-03T14:45:43Z)
Socially Fair Center-based and Linear Subspace Clustering [8.355270405285909]
Center-based clustering and linear subspace clustering are popular techniques to partition real-world data into smaller clusters. Different clustering cost per point for different sensitive groups can lead to fairness-related harms. We propose a unified framework to solve socially fair center-based clustering and linear subspace clustering.
arXiv Detail & Related papers (2022-08-22T07:10:17Z)
Fair Labeled Clustering [28.297893914525517]
We consider the downstream application of clustering and how group fairness should be ensured for such a setting. We provide algorithms for such problems and show that in contrast to their NP-hard counterparts in group fair clustering, they permit efficient solutions. We also consider a well-motivated alternative setting where the decision-maker is free to assign labels to the clusters regardless of the centers' positions in the metric space.
arXiv Detail & Related papers (2022-05-28T07:07:12Z)
You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation. We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one. By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
Deep Fair Discriminative Clustering [24.237000220172906]
We study a general notion of group-level fairness for binary and multi-state protected status variables (PSVs). We propose a refinement learning algorithm to combine the clustering goal with the fairness objective to learn fair clusters adaptively. Our framework shows promising results for novel clustering tasks including flexible fairness constraints, multi-state PSVs and predictive clustering.
arXiv Detail & Related papers (2021-05-28T23:50:48Z)
Graph Contrastive Clustering [131.67881457114316]
We propose a novel graph contrastive learning framework and apply it to the clustering task, yielding the Graph Contrastive Clustering (GCC) method. Specifically, on the one hand, a graph Laplacian based contrastive loss is proposed to learn more discriminative and clustering-friendly features. On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.
arXiv Detail & Related papers (2021-04-03T15:32:49Z)
Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed. We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.