Fair Correlation Clustering
- URL: http://arxiv.org/abs/2002.02274v2
- Date: Mon, 2 Mar 2020 22:27:51 GMT
- Title: Fair Correlation Clustering
- Authors: Sara Ahmadian, Alessandro Epasto, Ravi Kumar, Mohammad Mahdian
- Abstract summary: We obtain approximation algorithms for correlation clustering under several important types of fairness constraints.
We show that fair solutions to correlation clustering can be obtained with limited increase in cost compared to the state-of-the-art (unfair) algorithms.
- Score: 92.15492066925977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study correlation clustering under fairness constraints.
Fair variants of $k$-median and $k$-center clustering have been studied
recently, and approximation algorithms using a notion called fairlet
decomposition have been proposed. We obtain approximation algorithms for fair
correlation clustering under several important types of fairness constraints.
Our results hinge on obtaining a fairlet decomposition for correlation
clustering by introducing a novel combinatorial optimization problem. We define
a fairlet decomposition with cost similar to the $k$-median cost and this
allows us to obtain approximation algorithms for a wide range of fairness
constraints.
We complement our theoretical results with an in-depth analysis of our
algorithms on real graphs where we show that fair solutions to correlation
clustering can be obtained with limited increase in cost compared to the
state-of-the-art (unfair) algorithms.
Related papers
- A Semidefinite Relaxation Approach for Fair Graph Clustering [1.03590082373586]
This study introduces fair graph clustering within the framework of the disparate impact doctrine.
We employ a semidefinite relaxation approach to approximate the underlying optimization problem.
arXiv Detail & Related papers (2024-10-19T22:51:24Z) - Optimal Group Fair Classifiers from Linear Post-Processing [10.615965454674901]
We propose a post-processing algorithm for fair classification that mitigates model bias under a unified family of group fairness criteria.
It achieves fairness by re-calibrating the output score of the given base model with a "fairness cost" -- a linear combination of the (predicted) group memberships.
arXiv Detail & Related papers (2024-05-07T05:58:44Z) - Near-Optimal Algorithms for Constrained k-Center Clustering with Instance-level Background Knowledge [12.808663917871888]
We build on widely adopted $k$-center clustering and model its input background knowledge as must-link (ML) and cannot-link (CL) constraint sets.
We arrive at the first efficient approximation algorithm for constrained $k$-center with the best possible ratio of 2.
arXiv Detail & Related papers (2024-01-23T07:16:32Z) - Differentially-Private Hierarchical Clustering with Provable
Approximation Guarantees [79.59010418610625]
We study differentially private approximation algorithms for hierarchical clustering.
We show strong lower bounds for the problem: that any $epsilon$-DP algorithm must exhibit $O(|V|2/ epsilon)$-additive error for an input dataset.
We propose a private $1+o(1)$ approximation algorithm which also recovers the blocks exactly.
arXiv Detail & Related papers (2023-01-31T19:14:30Z) - Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z) - Rethinking Clustering-Based Pseudo-Labeling for Unsupervised
Meta-Learning [146.11600461034746]
Method for unsupervised meta-learning, CACTUs, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z) - Gradient Based Clustering [72.15857783681658]
We propose a general approach for distance based clustering, using the gradient of the cost function that measures clustering quality.
The approach is an iterative two step procedure (alternating between cluster assignment and cluster center updates) and is applicable to a wide range of functions.
arXiv Detail & Related papers (2022-02-01T19:31:15Z) - Differentially-Private Clustering of Easy Instances [67.04951703461657]
In differentially private clustering, the goal is to identify $k$ cluster centers without disclosing information on individual data points.
We provide implementable differentially private clustering algorithms that provide utility when the data is "easy"
We propose a framework that allows us to apply non-private clustering algorithms to the easy instances and privately combine the results.
arXiv Detail & Related papers (2021-12-29T08:13:56Z) - Faster Deterministic Approximation Algorithms for Correlation Clustering
and Cluster Deletion [5.584060970507507]
Correlation clustering is a framework for partitioning datasets based on pairwise similarity and dissimilarity scores.
In this paper we prove new relationships between correlation clustering problems and edge labeling problems related to the principle of strong partitioningic closure.
We develop new approximation algorithms for correlation clustering that have deterministic constant factor approximation guarantees and avoid the canonical linear programming relaxation.
arXiv Detail & Related papers (2021-11-20T22:47:19Z) - A Maximal Correlation Approach to Imposing Fairness in Machine Learning [25.773384159810234]
We explore the problem of algorithmic fairness, taking an information-theoretic view.
The maximal correlation framework is introduced for expressing fairness constraints and shown to be capable of being used to derive regularizers that enforce independence and separation-based fairness criteria.
We show that these algorithms provide smooth performance-fairness tradeoff curves and perform competitively with state-of-the-art methods on both discrete datasets (COMPAS, Adult) and continuous datasets (Communities and Crimes)
arXiv Detail & Related papers (2020-12-30T18:15:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.