Exploring Adversarial Examples for Efficient Active Learning in Machine
Learning Classifiers
- URL: http://arxiv.org/abs/2109.10770v2
- Date: Thu, 23 Sep 2021 04:08:59 GMT
- Title: Exploring Adversarial Examples for Efficient Active Learning in Machine
Learning Classifiers
- Authors: Honggang Yu, Shihfeng Zeng, Teng Zhang, Ing-Chao Lin, Yier Jin
- Abstract summary: We first add particular perturbation to original training examples using adversarial attack methods.
We then investigate the connections between active learning and these particular training examples.
Results show that the established theoretical foundation will guide better active learning strategies based on adversarial examples.
- Score: 17.90617023533039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning researchers have long observed that model training is
more effective and efficient when the training samples are densely sampled around
the underlying decision boundary. While this observation has already been widely
applied in a range of machine learning security techniques, it still lacks a
theoretical analysis of its correctness. To address this challenge, we first add
particular perturbations to the original training examples using adversarial
attack methods, so that the generated examples lie approximately on the decision
boundary of the ML classifiers. We then investigate the connections between
active learning and these particular training examples. By analyzing various
representative classifiers, such as k-NN classifiers, kernel methods, and deep
neural networks, we establish a theoretical foundation for the observation. As a
result, our theoretical proofs support more efficient active learning methods
that leverage adversarial examples, in contrast to previous works where
adversarial examples are often used as destructive solutions. Experimental
results show that the established theoretical foundation can guide better active
learning strategies based on adversarial examples.
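To make the approach concrete, below is a minimal, self-contained sketch (not the authors' implementation) of an active learning loop driven by adversarial perturbations. It assumes a logistic-regression classifier on synthetic 2-D data, where the smallest perturbation that moves a point onto the decision boundary has the closed form |w·x + b| / ||w||; for the k-NN, kernel, and deep models analyzed in the paper, that perturbation would instead be estimated with an adversarial attack such as FGSM or PGD. All variable names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D binary classification data: two Gaussian blobs.
X = np.vstack([rng.normal(-1.5, 1.0, size=(200, 2)),
               rng.normal(1.5, 1.0, size=(200, 2))])
y = np.concatenate([np.zeros(200), np.ones(200)])

def train_logreg(X, y, lr=0.1, epochs=500):
    """Plain gradient-descent logistic regression; returns (weights, bias)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Start with a small labeled pool; everything else is the unlabeled pool.
labeled = rng.choice(len(X), size=10, replace=False)
unlabeled = np.setdiff1d(np.arange(len(X)), labeled)

for rnd in range(5):
    w, b = train_logreg(X[labeled], y[labeled])
    # For a linear classifier, the smallest perturbation that flips the
    # prediction of x has norm |w.x + b| / ||w||, i.e. the distance from x
    # to the current decision boundary.
    margin = np.abs(X[unlabeled] @ w + b) / np.linalg.norm(w)
    # Query labels for the points requiring the smallest adversarial
    # perturbation -- the examples lying closest to the boundary.
    query = unlabeled[np.argsort(margin)[:10]]
    labeled = np.concatenate([labeled, query])
    unlabeled = np.setdiff1d(unlabeled, query)
    acc = np.mean(((X @ w + b) > 0).astype(float) == y)
    print(f"round {rnd}: |labeled| = {len(labeled)}, clean accuracy = {acc:.3f}")
```

The points requiring the smallest perturbation are exactly those closest to the current boundary, which is the connection between adversarial examples and active learning that the paper formalizes for more general classifiers.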
Related papers
- Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data [38.44734564565478]
We provide a theoretical understanding of adversarial examples and adversarial training algorithms from the perspective of feature learning theory.
We show that the adversarial training method can provably strengthen robust feature learning and suppress non-robust feature learning.
arXiv Detail & Related papers (2024-10-11T03:59:49Z) - Robust Transferable Feature Extractors: Learning to Defend Pre-Trained
Networks Against White Box Adversaries [69.53730499849023]
We show that adversarial examples can be successfully transferred to another independently trained model to induce prediction errors.
We propose a deep learning-based pre-processing mechanism, which we refer to as a robust transferable feature extractor (RTFE).
arXiv Detail & Related papers (2022-09-14T21:09:34Z) - On the Properties of Adversarially-Trained CNNs [4.769747792846005]
Adversarial Training has proved to be an effective training paradigm to enforce robustness against adversarial examples in modern neural network architectures.
We describe surprising properties of adversarially-trained models, shedding light on mechanisms through which robustness against adversarial attacks is implemented.
arXiv Detail & Related papers (2022-03-17T11:11:52Z) - Deep Active Learning by Leveraging Training Dynamics [57.95155565319465]
We propose a theory-driven deep active learning method (dynamicAL) which selects samples to maximize training dynamics.
We show that dynamicAL not only outperforms other baselines consistently but also scales well on large deep learning models.
arXiv Detail & Related papers (2021-10-16T16:51:05Z) - TREATED: Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z) - Explainable Adversarial Attacks in Deep Neural Networks Using Activation
Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z) - Adversarial Examples for Unsupervised Machine Learning Models [71.81480647638529]
Adversarial examples causing evasive predictions are widely used to evaluate and improve the robustness of machine learning models.
We propose a framework of generating adversarial examples for unsupervised models and demonstrate novel applications to data augmentation.
arXiv Detail & Related papers (2021-03-02T17:47:58Z) - Using Cross-Loss Influence Functions to Explain Deep Network
Representations [1.7778609937758327]
We show that influence functions can be extended to handle mismatched training and testing settings.
Our result enables us to compute the influence of unsupervised and self-supervised training examples with respect to a supervised test objective.
arXiv Detail & Related papers (2020-12-03T03:43:26Z) - On the Transferability of Adversarial Attacks against Neural Text
Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z) - Improving Adversarial Robustness by Enforcing Local and Global
Compactness [19.8818435601131]
Adversarial training is the most successful method that consistently resists a wide range of attacks.
We propose the Adversary Divergence Reduction Network which enforces local/global compactness and the clustering assumption.
The experimental results demonstrate that augmenting adversarial training with our proposed components can further improve the robustness of the network.
arXiv Detail & Related papers (2020-07-10T00:43:06Z) - Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples; a minimal sketch of this baseline augmentation appears after this list.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
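For reference, the adversarial training that several of the entries above build on (augmenting the training data with adversarial examples) can be sketched as follows. This is a generic FGSM-style augmentation on a logistic-regression model, not the method of any listed paper (in particular not adversarial distributional training); the synthetic data, epsilon budget, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic binary data with overlapping Gaussian blobs.
X = np.vstack([rng.normal(-1.0, 1.0, size=(300, 2)),
               rng.normal(1.0, 1.0, size=(300, 2))])
y = np.concatenate([np.zeros(300), np.ones(300)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = np.zeros(2), 0.0
lr, eps = 0.1, 0.3  # eps is the L_inf perturbation budget (an assumed value)

for epoch in range(300):
    p = sigmoid(X @ w + b)
    # FGSM-style perturbation: for logistic loss, d(loss_i)/d(x_i) = (p_i - y_i) * w,
    # so stepping each input by eps in the sign of that gradient increases its loss.
    grad_x = np.outer(p - y, w)
    X_adv = X + eps * np.sign(grad_x)
    # Augment the clean examples with their adversarial counterparts, then update.
    X_aug = np.vstack([X, X_adv])
    y_aug = np.concatenate([y, y])
    p_aug = sigmoid(X_aug @ w + b)
    w -= lr * (X_aug.T @ (p_aug - y_aug)) / len(y_aug)
    b -= lr * np.mean(p_aug - y_aug)

clean_acc = np.mean((sigmoid(X @ w + b) > 0.5).astype(float) == y)
adv_acc = np.mean((sigmoid(X_adv @ w + b) > 0.5).astype(float) == y)
print(f"clean accuracy = {clean_acc:.3f}, accuracy on perturbed inputs = {adv_acc:.3f}")
```

Because the attack used for augmentation is fixed, robustness may not transfer to unseen attacks, which is the limitation the adversarial distributional training entry above addresses.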