Fairness Degrading Adversarial Attacks Against Clustering Algorithms
- URL: http://arxiv.org/abs/2110.12020v1
- Date: Fri, 22 Oct 2021 19:10:27 GMT
- Title: Fairness Degrading Adversarial Attacks Against Clustering Algorithms
- Authors: Anshuman Chhabra, Adish Singla, Prasant Mohapatra
- Abstract summary: We propose a fairness degrading attack algorithm for k-median clustering.
We find that the addition of the generated adversarial samples can lead to significantly lower fairness values.
- Score: 35.40427659749882
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clustering algorithms are ubiquitous in modern data science pipelines, and
are utilized in numerous fields ranging from biology to facility location. Due
to their widespread use, especially in societal resource allocation problems,
recent research has aimed at making clustering algorithms fair, with great
success. Furthermore, it has also been shown that clustering algorithms, much
like other machine learning algorithms, are susceptible to adversarial attacks
where a malicious entity seeks to subvert the performance of the learning
algorithm. However, despite these known vulnerabilities, there has been no
research undertaken that investigates fairness degrading adversarial attacks
for clustering. We seek to bridge this gap by formulating a generalized attack
optimization problem aimed at worsening the group-level fairness of
centroid-based clustering algorithms. As a first step, we propose a fairness
degrading attack algorithm for k-median clustering that operates under a
whitebox threat model -- where the clustering algorithm, fairness notion, and
the input dataset are known to the adversary. We provide empirical results as
well as theoretical analysis for our simple attack algorithm, and find that the
addition of the generated adversarial samples can lead to significantly lower
fairness values. In this manner, we aim to motivate fairness degrading
adversarial attacks as a direction for future research in fair clustering.
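The abstract describes degrading the group-level fairness of centroid-based clustering under a whitebox threat model. As a rough illustration only (this is not the paper's attack algorithm), the sketch below computes the common "balance" fairness notion for a two-group k-median assignment; the function names and the example data are assumptions for illustration.

```python
import numpy as np

def kmedian_assign(points, centers):
    """Assign each point to its nearest center under the L1 (Manhattan)
    distance, as is standard for k-median clustering."""
    dists = np.abs(points[:, None, :] - centers[None, :, :]).sum(axis=2)
    return dists.argmin(axis=1)

def balance(assignments, groups, k):
    """Dataset 'balance': the minimum, over non-empty clusters, of the
    smaller group-ratio inside the cluster. 1.0 means every cluster is
    perfectly mixed; 0.0 means some cluster holds only one group."""
    per_cluster = []
    for c in range(k):
        g = groups[assignments == c]
        if g.size == 0:
            continue
        n0, n1 = int((g == 0).sum()), int((g == 1).sum())
        per_cluster.append(0.0 if n0 == 0 or n1 == 0
                           else min(n0 / n1, n1 / n0))
    return min(per_cluster)

# Two well-mixed clusters: balance is 1.0 before any attack.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
groups = np.array([0, 1, 0, 1])          # protected-group membership
centers = np.array([[0.0, 0.5], [10.0, 0.5]])
print(balance(kmedian_assign(points, centers), groups, k=2))  # → 1.0

# A whitebox adversary who knows the dataset, the fairness notion, and
# the clustering algorithm can search for injected points that shift the
# learned medians so that re-clustering separates the groups, driving
# this balance value toward 0.
```

The balance notion above is one standard group-level fairness measure for clustering; the paper's generalized attack optimization is formulated over such fairness values.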
Related papers
- GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship across real and generated samples.
Second, we design a self-supervised metric learning scheme to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - A Rapid Review of Clustering Algorithms [5.46715422237599]
Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data.
They play an important role in today's life, such as in marketing and e-commerce, healthcare, data organization and analysis, and social media.
We analyze existing clustering algorithms and classify mainstream algorithms across five different dimensions.
arXiv Detail & Related papers (2024-01-14T23:19:53Z) - Adversarial Training Should Be Cast as a Non-Zero-Sum Game [121.95628660889628]
The two-player zero-sum paradigm of adversarial training has not engendered sufficient levels of robustness.
We show that the surrogate-based relaxation commonly used in adversarial training algorithms voids all guarantees on robustness.
A novel non-zero-sum bilevel formulation of adversarial training yields a framework that matches and in some cases outperforms state-of-the-art attacks.
arXiv Detail & Related papers (2023-06-19T16:00:48Z) - Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for its limited performance is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z) - Adversarial Robustness of Streaming Algorithms through Importance Sampling [29.957317489789745]
We introduce adversarially robust streaming algorithms for central machine learning and algorithmic tasks.
Our results are based on a simple, but powerful, observation that many importance sampling-based algorithms give rise to adversarial robustness.
We empirically confirm the robustness of our algorithms on various adversarial attacks and demonstrate that by contrast, some common existing algorithms are not robust.
arXiv Detail & Related papers (2021-06-28T19:24:11Z) - Fair Clustering Using Antidote Data [35.40427659749882]
We propose an alternate approach to fairness in clustering where we augment the original dataset with a small number of data points, called antidote data.
Our algorithms achieve lower fairness costs and competitive clustering performance compared to other state-of-the-art fair clustering algorithms.
arXiv Detail & Related papers (2021-06-01T16:07:52Z) - Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z) - A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
arXiv Detail & Related papers (2020-09-09T18:19:31Z) - Fair Hierarchical Clustering [92.03780518164108]
We define a notion of fairness that mitigates over-representation in traditional clustering.
We show that our algorithms can find a fair hierarchical clustering, with only a negligible loss in the objective.
arXiv Detail & Related papers (2020-06-18T01:05:11Z) - Fair Algorithms for Hierarchical Agglomerative Clustering [17.66340013352806]
Hierarchical Agglomerative Clustering (HAC) algorithms are extensively utilized in modern data science.
It is imperative to ensure that these algorithms are fair -- even if the dataset contains biases against certain protected groups.
We propose fair algorithms for performing HAC that enforce fairness constraints.
arXiv Detail & Related papers (2020-05-07T01:41:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.