Spectral Normalized-Cut Graph Partitioning with Fairness Constraints
- URL: http://arxiv.org/abs/2307.12065v1
- Date: Sat, 22 Jul 2023 12:20:46 GMT
- Title: Spectral Normalized-Cut Graph Partitioning with Fairness Constraints
- Authors: Jia Li, Yanhao Wang, Arpit Merchant
- Abstract summary: Normalized-cut graph partitioning aims to divide the set of nodes in a graph into $k$ disjoint clusters to minimize the fraction of the total edges between any cluster and all other clusters.
We consider a fair variant of the partitioning problem wherein nodes are characterized by a categorical sensitive attribute indicating membership to different demographic groups.
Our goal is to ensure that each group is approximately proportionally represented in each cluster while minimizing the normalized cut value.
- Score: 18.835004555146575
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Normalized-cut graph partitioning aims to divide the set of nodes in a graph
into $k$ disjoint clusters to minimize the fraction of the total edges between
any cluster and all other clusters. In this paper, we consider a fair variant
of the partitioning problem wherein nodes are characterized by a categorical
sensitive attribute (e.g., gender or race) indicating membership to different
demographic groups. Our goal is to ensure that each group is approximately
proportionally represented in each cluster while minimizing the normalized cut
value. To resolve this problem, we propose a two-phase spectral algorithm
called FNM. In the first phase, we add an augmented Lagrangian term based on
our fairness criteria to the objective function for obtaining a fairer spectral
node embedding. Then, in the second phase, we design a rounding scheme to
produce $k$ clusters from the fair embedding that effectively trades off
fairness and partition quality. Through comprehensive experiments on nine
benchmark datasets, we demonstrate the superior performance of FNM compared
with three baseline methods.
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z) - A Unified Framework for Gradient-based Clustering of Distributed Data [51.904327888475606]
We develop a family of distributed clustering algorithms that work over networks of users.
DGC-$mathcalF_rho$ is specialized to popular clustering losses like $K$-means and Huber loss.
We show that consensus fixed points of DGC-$mathcalF_rho$ are equivalent to fixed points of gradient clustering over the full data.
arXiv Detail & Related papers (2024-02-02T10:44:42Z) - A Unified Framework for Fair Spectral Clustering With Effective Graph
Learning [12.343382413705394]
We consider the problem of spectral clustering under group fairness constraints.
In practice, the graph is usually unknown, and we need to construct the underlying graph from potentially noisy data.
We propose a novel graph construction method with a node-adaptive graph filter to learn graphs from noisy data.
arXiv Detail & Related papers (2023-11-23T01:43:00Z) - Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z) - One-step Bipartite Graph Cut: A Normalized Formulation and Its
Application to Scalable Subspace Clustering [56.81492360414741]
We show how to enforce a one-step normalized cut for bipartite graphs, especially with linear-time complexity.
In this paper, we first characterize a novel one-step bipartite graph cut criterion with normalized constraints, and theoretically prove its equivalence to a trace problem.
We extend this cut criterion to a scalable subspace clustering approach, where adaptive anchor learning, bipartite graph learning, and one-step normalized bipartite graph partitioning are simultaneously modeled.
arXiv Detail & Related papers (2023-05-12T11:27:20Z) - Multi-View Clustering via Semi-non-negative Tensor Factorization [120.87318230985653]
We develop a novel multi-view clustering based on semi-non-negative tensor factorization (Semi-NTF)
Our model directly considers the between-view relationship and exploits the between-view complementary information.
In addition, we provide an optimization algorithm for the proposed method and prove mathematically that the algorithm always converges to the stationary KKT point.
arXiv Detail & Related papers (2023-03-29T14:54:19Z) - Total Variation Graph Neural Networks [5.571369922847262]
Recently proposed Graph Neural Networks (GNNs) are trained with an unsupervised minimum cut objective.
We propose a GNN model that computes cluster assignments by optimizing a tighter relaxation of the minimum cut.
arXiv Detail & Related papers (2022-11-11T14:13:14Z) - Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z) - A Performance Guarantee for Spectral Clustering [0.6445605125467572]
We study the central question: when is spectral clustering able to find the global solution to the minimum ratio cut problem?
We develop a deterministic two-to-infinity norm bound for the the invariant subspace of the graph Laplacian that corresponds to the $k$ smallest eigenvalues.
By combining these two results we give a condition under which spectral clustering is guaranteed to output the global solution to the minimum ratio cut problem.
arXiv Detail & Related papers (2020-07-10T22:03:43Z) - Efficient Clustering for Stretched Mixtures: Landscape and Optimality [4.2111286819721485]
This paper considers a canonical clustering problem where one receives unlabeled samples drawn from a balanced mixture of two elliptical distributions.
We show that the non-optimal clustering function exhibits desirable geometric properties when the sample size exceeds some constant statistical objectives.
arXiv Detail & Related papers (2020-03-22T17:57:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.