Doubly Constrained Fair Clustering
- URL: http://arxiv.org/abs/2305.19475v1
- Date: Wed, 31 May 2023 01:04:55 GMT
- Title: Doubly Constrained Fair Clustering
- Authors: John Dickerson, Seyed A. Esmaeili, Jamie Morgenstern, Claire Jie Zhang
- Abstract summary: Group Fairness (GF) and Diversity in Center Selection (DS) are the two most prominent demographic representation fairness notions in clustering.
We show that given a constant approximation algorithm for one constraint (GF or DS only) we can obtain a constant approximation solution that satisfies both constraints simultaneously.
We show that both GF and DS are incompatible with a collection of other distance-based fairness notions.
- Score: 11.11449872883085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The remarkable attention which fair clustering has received in the last few
years has resulted in a significant number of different notions of fairness.
Despite the fact that these notions are well-justified, they are often
motivated and studied in a disjoint manner where one fairness desideratum is
considered exclusively in isolation from the others. This leaves the
understanding of the relations between different fairness notions as an
important open problem in fair clustering. In this paper, we take the first
step in this direction. Specifically, we consider the two most prominent
demographic representation fairness notions in clustering: (1) Group Fairness
(GF), where the different demographic groups are supposed to have close to
population-level representation in each cluster and (2) Diversity in Center
Selection (DS), where the selected centers are supposed to have close to
population-level representation of each group. We show that given a constant
approximation algorithm for one constraint (GF or DS only) we can obtain a
constant approximation solution that satisfies both constraints simultaneously.
Interestingly, we prove that any given solution that satisfies the GF
constraint can always be post-processed at a bounded degradation to the
clustering cost to additionally satisfy the DS constraint while the reverse is
not true. Furthermore, we show that both GF and DS are incompatible (having an
empty feasibility set in the worst case) with a collection of other
distance-based fairness notions. Finally, we carry out experiments to validate our
theoretical findings.
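The GF and DS notions above can be made concrete with a minimal illustrative sketch. This is not the paper's algorithm: the paper's formal constraints use per-group upper and lower proportion bounds, which are approximated here by a single additive tolerance `tol` (an assumption for illustration). GF checks each cluster's group proportions against the population-level proportions; DS applies the same check to the groups of the selected centers.

```python
from collections import Counter

def proportions(labels):
    """Fraction of each group among the given labels."""
    counts = Counter(labels)
    total = len(labels)
    return {g: c / total for g, c in counts.items()}

def satisfies_gf(groups, assignment, tol):
    """Group Fairness: every cluster's group proportions lie within
    +/- tol of the population-level proportions (additive relaxation
    of the paper's upper/lower proportion bounds)."""
    pop = proportions(groups)
    clusters = {}
    for g, c in zip(groups, assignment):
        clusters.setdefault(c, []).append(g)
    return all(
        abs(proportions(members).get(g, 0.0) - p) <= tol
        for members in clusters.values()
        for g, p in pop.items()
    )

def satisfies_ds(groups, center_groups, tol):
    """Diversity in Center Selection: the chosen centers' group
    proportions lie within +/- tol of the population proportions."""
    pop = proportions(groups)
    cen = proportions(center_groups)
    return all(abs(cen.get(g, 0.0) - p) <= tol for g, p in pop.items())

# Toy example: two groups split 50/50 in the population; each cluster
# ends up 2/3 vs 1/3, so GF holds at tol=0.2 but fails at tol=0.1.
groups = ["a", "a", "b", "b", "a", "b"]
assignment = [0, 0, 0, 1, 1, 1]
print(satisfies_gf(groups, assignment, 0.2))   # True
print(satisfies_gf(groups, assignment, 0.1))   # False
print(satisfies_ds(groups, ["a", "b"], 0.0))   # True: centers are 50/50
```

The example shows why the two notions are independent constraints: a clustering can satisfy GF while its center set violates DS, and vice versa, which is what makes the paper's doubly constrained result nontrivial.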
Related papers
- The Fairness-Quality Trade-off in Clustering [2.797813058568041]
We introduce novel algorithms for tracing the complete trade-off curve between quality and fairness in clustering problems.
We deal with all objectives for fairness and quality in two general classes encompassing most of the special cases addressed in previous work.
We also present a new, arguably most natural, objective: minimizing the sum, over all clusters, of the imbalance between the two groups in each cluster.
arXiv Detail & Related papers (2024-08-19T13:57:15Z)
- Towards Distribution-Agnostic Generalized Category Discovery [51.52673017664908]
Data imbalance and open-ended distribution are intrinsic characteristics of the real visual world.
We propose a Self-Balanced Co-Advice contrastive framework (BaCon)
BaCon consists of a contrastive-learning branch and a pseudo-labeling branch, working collaboratively to provide interactive supervision to resolve the DA-GCD task.
arXiv Detail & Related papers (2023-10-02T17:39:58Z)
- Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at a cluster-level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z)
- DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
Our model jointly optimizes two fairness criteria - group fairness and counterfactual fairness.
arXiv Detail & Related papers (2023-03-15T07:13:54Z)
- Fair Correlation Clustering in Forests [8.810926150873991]
A clustering is said to be fair if each cluster has the same distribution of manifestations of a sensitive attribute as the whole input set.
This is motivated by various applications where the objects to be clustered have sensitive attributes that should not be over- or underrepresented.
We consider restricted graph classes which allow us to characterize the distributions of sensitive attributes for which this form of fairness is tractable.
arXiv Detail & Related papers (2023-02-22T11:27:06Z)
- How Robust is Your Fairness? Evaluating and Sustaining Fairness under Unseen Distribution Shifts [107.72786199113183]
We propose a novel fairness learning method termed CUrvature MAtching (CUMA)
CUMA achieves robust fairness generalizable to unseen domains with unknown distributional shifts.
We evaluate our method on three popular fairness datasets.
arXiv Detail & Related papers (2022-07-04T02:37:50Z)
- Fair Labeled Clustering [28.297893914525517]
We consider the downstream application of clustering and how group fairness should be ensured for such a setting.
We provide algorithms for such problems and show that in contrast to their NP-hard counterparts in group fair clustering, they permit efficient solutions.
We also consider a well-motivated alternative setting where the decision-maker is free to assign labels to the clusters regardless of the centers' positions in the metric space.
arXiv Detail & Related papers (2022-05-28T07:07:12Z)
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data assigned to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, the resulting model (TCC) is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
- A Boundary Based Out-of-Distribution Classifier for Generalized Zero-Shot Learning [83.1490247844899]
Generalized Zero-Shot Learning (GZSL) is a challenging topic that has promising prospects in many realistic scenarios.
We propose a boundary based Out-of-Distribution (OOD) classifier which classifies the unseen and seen domains by only using seen samples for training.
We extensively validate our approach on five popular benchmark datasets including AWA1, AWA2, CUB, FLO and SUN.
arXiv Detail & Related papers (2020-08-09T11:27:19Z)
- A Pairwise Fair and Community-preserving Approach to k-Center Clustering [34.386585230600716]
Clustering is a foundational problem in machine learning with numerous applications.
We define two new types of fairness in the clustering setting, pairwise fairness and community preservation.
arXiv Detail & Related papers (2020-07-14T22:32:27Z)
- Distributional Individual Fairness in Clustering [7.303841123034983]
We introduce a framework for assigning individuals, embedded in a metric space, to probability distributions over a bounded number of cluster centers.
We provide an algorithm for clustering with $p$-norm objective and individual fairness constraints with provable approximation guarantee.
arXiv Detail & Related papers (2020-06-22T20:02:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.