On the Robustness of Deep Clustering Models: Adversarial Attacks and
Defenses
- URL: http://arxiv.org/abs/2210.01940v1
- Date: Tue, 4 Oct 2022 22:32:02 GMT
- Title: On the Robustness of Deep Clustering Models: Adversarial Attacks and
Defenses
- Authors: Anshuman Chhabra, Ashwin Sekhari, Prasant Mohapatra
- Abstract summary: Clustering models constitute a class of unsupervised machine learning methods which are used in a number of application pipelines.
We propose a blackbox attack using Generative Adversarial Networks (GANs) where the adversary does not know which deep clustering model is being used.
We analyze our attack against multiple state-of-the-art deep clustering models and real-world datasets, and find that it is highly successful.
- Score: 14.951655356042947
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clustering models constitute a class of unsupervised machine learning methods
which are used in a number of application pipelines, and play a vital role in
modern data science. With recent advances in deep learning, deep clustering
models have emerged as the current state of the art over traditional
clustering approaches, especially for high-dimensional image datasets. While
traditional clustering approaches have been analyzed from a robustness
perspective, no prior work has investigated adversarial attacks and robustness
for deep clustering models in a principled manner. To bridge this gap, we
propose a blackbox attack using Generative Adversarial Networks (GANs) where
the adversary does not know which deep clustering model is being used, but can
query it for outputs. We analyze our attack against multiple state-of-the-art
deep clustering models and real-world datasets, and find that it is highly
successful. We then employ some natural unsupervised defense approaches, but
find that these are unable to mitigate our attack. Finally, we attack Face++, a
production-level face clustering API service, and find that we can
significantly reduce its performance as well. Through this work, we thus aim to
motivate the need for truly robust deep clustering models.
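To make the query-only threat model above concrete, the following PyTorch sketch pairs a small perturbation generator with a victim clustering model that is visible only through the cluster labels it returns. This is a minimal, hypothetical illustration rather than the authors' GAN architecture or training procedure; the names PerturbationGenerator, cluster_api, and eps are assumptions introduced for the example.

    import torch
    import torch.nn as nn

    class PerturbationGenerator(nn.Module):
        """Produces a small, bounded perturbation and returns the perturbed input."""
        def __init__(self, dim, eps=0.05):
            super().__init__()
            self.eps = eps
            self.net = nn.Sequential(
                nn.Linear(dim, 256), nn.ReLU(),
                nn.Linear(256, dim), nn.Tanh(),   # Tanh keeps raw outputs in [-1, 1]
            )

        def forward(self, x):
            flat = x.flatten(1)                   # (batch, dim)
            delta = self.eps * self.net(flat)     # scale perturbation to [-eps, eps]
            return (flat + delta).view_as(x).clamp(0.0, 1.0)

    @torch.no_grad()
    def attack_success_rate(cluster_api, generator, images):
        """Fraction of samples whose queried cluster assignment changes.

        cluster_api(batch) is assumed to return hard cluster labels and is
        the only access the attacker has to the victim deep clustering model.
        """
        clean_labels = cluster_api(images)
        adv_labels = cluster_api(generator(images))
        return (clean_labels != adv_labels).float().mean().item()

The paper's attack trains a GAN under exactly this kind of query access; the sketch covers only the query interface and the success measure (the fraction of flipped cluster assignments) that such an attack aims to maximize.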
Related papers
- MEAOD: Model Extraction Attack against Object Detectors [45.817537875368956]
Model extraction attacks allow attackers to build a substitute model whose functionality is comparable to that of the victim model.
We propose an effective attack method called MEAOD for object detection models.
We achieve extraction performance of over 70% under a 10k query budget.
arXiv Detail & Related papers (2023-12-22T13:28:50Z)
- OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks [17.584752814352502]
Evasion Attacks (EA) are used to test the robustness of trained neural networks by distorting input data.
We introduce a self-supervised, computationally economical method for generating adversarial examples.
Our experiments consistently demonstrate the method is effective across various models, unseen data categories, and even defended models.
arXiv Detail & Related papers (2023-10-05T17:34:47Z)
- A Deep Dive into Deep Cluster [0.2578242050187029]
DeepCluster is a simple and scalable unsupervised pretraining of visual representations.
We show that DeepCluster's convergence and performance depend on the interplay between the quality of the randomly initialized filters of the convolutional layer and the selected number of clusters.
arXiv Detail & Related papers (2022-07-24T22:55:09Z)
- DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method that encourages the substitute model to learn better and faster from the target model.
We introduce a task-driven, graph-based structural information learning constraint to improve the quality of the generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z)
- Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose substitute training from a novel perspective that focuses on designing the distribution of data used in the knowledge-stealing process.
Combining the proposed modules further boosts the consistency between the substitute model and the target model, which greatly improves the effectiveness of the adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Model Extraction Attacks on Graph Neural Networks: Taxonomy and Realization [40.37373934201329]
We investigate and develop model extraction attacks against GNN models.
We first formalise the threat modelling in the context of GNN model extraction.
We then present detailed methods which utilise the accessible knowledge in each threat to implement the attacks.
arXiv Detail & Related papers (2020-10-24T03:09:37Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- Orthogonal Deep Models As Defense Against Black-Box Attacks [71.23669614195195]
We study the inherent weakness of deep models in black-box settings where the attacker may develop the attack using a model similar to the targeted model.
We introduce a novel gradient regularization scheme that encourages the internal representation of a deep model to be orthogonal to that of another model.
We verify the effectiveness of our technique on a variety of large-scale models.
arXiv Detail & Related papers (2020-06-26T08:29:05Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models poses unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.