Robust Fair Clustering with Group Membership Uncertainty Sets
- URL: http://arxiv.org/abs/2406.00599v1
- Date: Sun, 2 Jun 2024 03:11:31 GMT
- Title: Robust Fair Clustering with Group Membership Uncertainty Sets
- Authors: Sharmila Duppala, Juan Luque, John P. Dickerson, Seyed A. Esmaeili,
- Abstract summary: We study the canonical fair clustering problem where each cluster is constrained to have close to population-level representation of each group.
We introduce a simple and interpretable family of error models that require a small number of parameters to be given by the decision maker.
We then present an algorithm for fair clustering with provable robustness guarantees.
- Score: 31.29773979737976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the canonical fair clustering problem where each cluster is constrained to have close to population-level representation of each group. Despite significant attention, the salient issue of having incomplete knowledge about the group membership of each point has been superficially addressed. In this paper, we consider a setting where errors exist in the assigned group memberships. We introduce a simple and interpretable family of error models that require a small number of parameters to be given by the decision maker. We then present an algorithm for fair clustering with provable robustness guarantees. Our framework enables the decision maker to trade off between the robustness and the clustering quality. Unlike previous work, our algorithms are backed by worst-case theoretical guarantees. Finally, we empirically verify the performance of our algorithm on real world datasets and show its superior performance over existing baselines.
Related papers
- Fuzzy K-Means Clustering without Cluster Centroids [79.19713746387337]
Fuzzy K-Means clustering is a critical computation technique in unsupervised data analysis.
This paper proposes a novel Fuzzy K-Means clustering algorithm that entirely eliminates the reliance on cluster centroids.
arXiv Detail & Related papers (2024-04-07T12:25:03Z) - Cluster-level Group Representativity Fairness in $k$-means Clustering [3.420467786581458]
Clustering algorithms could generate clusters such that different groups are disadvantaged within different clusters.
We develop a clustering algorithm, building upon the centroid clustering paradigm pioneered by classical algorithms.
We show that our method is effective in enhancing cluster-level group representativity fairness significantly at low impact on cluster coherence.
arXiv Detail & Related papers (2022-12-29T22:02:28Z) - Improved Approximation for Fair Correlation Clustering [4.629694186457133]
Correlation clustering is a ubiquitous paradigm in unsupervised machine learning where addressing unfairness is a major challenge.
Motivated by this, we study Fair Correlation Clustering where the data points may belong to different protected groups.
Our paper significantly generalizes and improves on the quality guarantees of previous work of Ahmadi et al. and Ahmadian et al.
arXiv Detail & Related papers (2022-06-09T03:07:57Z) - Fair Labeled Clustering [28.297893914525517]
We consider the downstream application of clustering and how group fairness should be ensured for such a setting.
We provide algorithms for such problems and show that in contrast to their NP-hard counterparts in group fair clustering, they permit efficient solutions.
We also consider a well-motivated alternative setting where the decision-maker is free to assign labels to the clusters regardless of the centers' positions in the metric space.
arXiv Detail & Related papers (2022-05-28T07:07:12Z) - Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
spurious correlations between input samples and the target labels wrongly direct the neural network predictions.
We propose an algorithm that optimize for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z) - Focus on the Common Good: Group Distributional Robustness Follows [47.62596240492509]
This paper proposes a new and simple algorithm that explicitly encourages learning of features that are shared across various groups.
While Group-DRO focuses on groups with worst regularized loss, focusing instead, on groups that enable better performance even on other groups, could lead to learning of shared/common features.
arXiv Detail & Related papers (2021-10-06T09:47:41Z) - Feature-based Individual Fairness in k-Clustering [14.847868952138795]
We consider the problem of clustering a set of points while ensuring fairness constraints.
We introduce a new notion of individual fairness in k-clustering based on features that are not necessarily used for clustering.
arXiv Detail & Related papers (2021-09-09T20:42:02Z) - You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z) - Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z) - Probabilistic Fair Clustering [31.628993679745292]
Prior work in fair clustering assumes complete knowledge of group membership.
We present clustering algorithms in this more general setting with approximation ratio guarantees.
We also address the problem of "metric membership", where different groups have a notion of order and distance.
arXiv Detail & Related papers (2020-06-19T01:34:21Z) - Learning to Cluster Faces via Confidence and Connectivity Estimation [136.5291151775236]
We propose a fully learnable clustering framework without requiring a large number of overlapped subgraphs.
Our method significantly improves clustering accuracy and thus performance of the recognition models trained on top, yet it is an order of magnitude more efficient than existing supervised methods.
arXiv Detail & Related papers (2020-04-01T13:39:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.