One Neuron to Fool Them All
- URL: http://arxiv.org/abs/2003.09372v2
- Date: Tue, 9 Jun 2020 04:35:30 GMT
- Title: One Neuron to Fool Them All
- Authors: Anshuman Suri and David Evans
- Abstract summary: We evaluate the sensitivity of individual neurons in terms of how robust the model's output is to direct perturbations of that neuron's output.
Attacks using a loss function that targets just a single sensitive neuron find adversarial examples nearly as effectively as ones that target the full model.
- Score: 12.107259467873094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite vast research in adversarial examples, the root causes of model
susceptibility are not well understood. Instead of looking at attack-specific
robustness, we propose a notion that evaluates the sensitivity of individual
neurons in terms of how robust the model's output is to direct perturbations of
that neuron's output. Analyzing models from this perspective reveals
distinctive characteristics of standard as well as adversarially-trained robust
models, and leads to several curious results. In our experiments on CIFAR-10
and ImageNet, we find that attacks using a loss function that targets just a
single sensitive neuron find adversarial examples nearly as effectively as ones
that target the full model. We analyze the properties of these sensitive
neurons to propose a regularization term that can help a model achieve
robustness to a variety of different perturbation constraints while maintaining
accuracy on natural data distributions. Code for all our experiments is
available at https://github.com/iamgroot42/sauron .
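The abstract's single-neuron attack idea can be illustrated with a short PGD-style sketch (a minimal illustration assuming a generic PyTorch classifier, not the code from the linked repository): instead of the usual classification loss, the attack's loss is simply the activation of one chosen intermediate neuron.

```python
import torch

def single_neuron_attack(model, layer, neuron_idx, x, eps=8/255, alpha=2/255, steps=10):
    """PGD-style attack whose loss targets one intermediate neuron's activation."""
    captured = {}
    handle = layer.register_forward_hook(lambda m, i, o: captured.update(act=o))
    x_adv = x.clone().detach()
    try:
        for _ in range(steps):
            x_adv.requires_grad_(True)
            model(x_adv)                                        # forward pass fills captured["act"]
            neuron = captured["act"].flatten(1)[:, neuron_idx]  # the single targeted neuron
            loss = neuron.sum()                                 # loss depends on this neuron only
            grad, = torch.autograd.grad(loss, x_adv)
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()             # ascend on the neuron's output
                x_adv = x + (x_adv - x).clamp(-eps, eps)        # project back into the L-inf ball
                x_adv = x_adv.clamp(0, 1)
    finally:
        handle.remove()
    return x_adv.detach()
```

Per the abstract, if perturbations found this way fool the model nearly as often as attacks on the full classification loss, the targeted neuron is highly sensitive; the paper additionally derives a regularization term from such sensitive neurons, which this sketch does not cover.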
Related papers
- Modeling dynamic neural activity by combining naturalistic video stimuli and stimulus-independent latent factors [5.967290675400836]
We propose a probabilistic model that incorporates video inputs along with stimulus-independent latent factors to capture variability in neuronal responses.
After training and testing our model on mouse V1 neuronal responses, we found that it outperforms video-only models in terms of log-likelihood.
We find that the learned latent factors strongly correlate with mouse behavior, although the model was trained without behavior data.
arXiv Detail & Related papers (2024-10-21T16:01:39Z)
- The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness [13.120373493503772]
We prove a surprising result: even if the ground truth itself is robust to adversarial examples and the benignly overfitted model is benign in terms of the "standard" out-of-sample risk objective, the resulting model can still be harmful in terms of adversarial risk.
Our finding provides theoretical insight into a puzzling phenomenon observed in practice: the true target function (e.g., a human) is robust against adversarial attacks, while benignly overfitted neural networks lead to models that are not robust.
arXiv Detail & Related papers (2024-01-19T15:40:46Z)
- Neural Frailty Machine: Beyond proportional hazard assumption in neural survival regressions [30.018173329118184]
We present neural frailty machine (NFM), a powerful and flexible neural modeling framework for survival regressions.
Two concrete models are derived under the framework, extending neural proportional hazard models and nonparametric hazard regression models.
We conduct experimental evaluations over 6 benchmark datasets of different scales, showing that the proposed NFM models outperform state-of-the-art survival models in terms of predictive performance.
arXiv Detail & Related papers (2023-03-18T08:15:15Z)
- Improving Adversarial Transferability via Neuron Attribution-Based Attacks [35.02147088207232]
We propose the Neuron Attribution-based Attack (NAA), which conducts feature-level attacks with more accurate neuron importance estimations.
We derive an approximation scheme of neuron attribution to tremendously reduce the overhead.
Experiments confirm the superiority of our approach over state-of-the-art benchmarks (a rough sketch of the attribution idea appears below).
arXiv Detail & Related papers (2022-03-31T13:47:30Z)
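As a rough illustration of the neuron-attribution idea summarized above (not the authors' exact estimator), the sketch below scores each mid-layer neuron as (activation minus baseline activation) times the target logit's gradient, averaged over a few interpolation steps; `model`, `layer`, `baseline`, and `target_class` are assumed PyTorch objects chosen by the caller.

```python
import torch

def neuron_attribution(model, layer, x, baseline, target_class, n_steps=8):
    """Per-neuron attribution at `layer`: (activation - baseline) * mean path gradient."""
    captured = {}
    handle = layer.register_forward_hook(lambda m, i, o: captured.update(act=o))
    try:
        total_grad = None
        for k in range(1, n_steps + 1):
            # interpolate from the baseline image toward the real input
            x_k = (baseline + (k / n_steps) * (x - baseline)).detach().requires_grad_(True)
            logits = model(x_k)
            act = captured["act"]
            # gradient of the target logit w.r.t. the mid-layer activations
            grad = torch.autograd.grad(logits[:, target_class].sum(), act)[0]
            total_grad = grad if total_grad is None else total_grad + grad
        with torch.no_grad():
            model(x)
            act_x = captured["act"]        # activations on the real input
            model(baseline)
            act_b = captured["act"]        # activations on the baseline
        return ((act_x - act_b) * (total_grad / n_steps)).detach()
    finally:
        handle.remove()
```

In NAA, importance estimates of this kind drive a feature-level attack; the paper's actual approximation scheme differs in its details.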
- Few-shot Backdoor Defense Using Shapley Estimation [123.56934991060788]
We develop a new approach called Shapley Pruning (ShapPruning) to mitigate backdoor attacks on deep neural networks.
ShapPruning identifies the few infected neurons (under 1% of all neurons) while preserving the model's structure and accuracy.
Experiments demonstrate the effectiveness and robustness of our method against various attacks and tasks (a sampling-based sketch of Shapley estimation over neurons follows below).
arXiv Detail & Related papers (2021-12-30T02:27:03Z)
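A generic permutation-sampling estimator of per-neuron Shapley values (a standard Monte Carlo scheme, not necessarily ShapPruning's exact procedure); `value_fn` is a hypothetical callback that scores the model, e.g. its backdoor success rate, with only the neurons in `active` left unmasked.

```python
import random

def shapley_neurons(n_neurons, value_fn, n_permutations=50):
    """Monte Carlo Shapley values: average marginal contribution over random orderings."""
    shap = [0.0] * n_neurons
    for _ in range(n_permutations):
        order = list(range(n_neurons))
        random.shuffle(order)
        active = set()
        prev = value_fn(active)
        for j in order:
            active.add(j)
            cur = value_fn(active)
            shap[j] += (cur - prev) / n_permutations   # marginal contribution of neuron j
            prev = cur
    return shap
```

Neurons whose estimated contribution to the backdoor behavior is largest would be the natural pruning candidates.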
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
- The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called a structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights (a toy sketch of such joint perturbations appears below).
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
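A toy sketch of the joint-perturbation viewpoint (an illustration with assumed PyTorch objects, not the paper's formal analysis): measure how much the loss grows when the input and one chosen weight tensor are both nudged by a single signed-gradient step.

```python
import torch
import torch.nn.functional as F

def joint_perturbation_gap(model, weight, x, y, eps_x=8/255, eps_w=1e-3):
    """Loss increase under simultaneous small perturbations of the input and of `weight`."""
    x_req = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_req), y)
    gx, gw = torch.autograd.grad(loss, [x_req, weight])
    with torch.no_grad():
        x_pert = (x + eps_x * gx.sign()).clamp(0, 1)   # perturb the input
        weight.add_(eps_w * gw.sign())                 # perturb the chosen weight tensor
        perturbed_loss = F.cross_entropy(model(x_pert), y)
        weight.sub_(eps_w * gw.sign())                 # restore the original weights
    return (perturbed_loss - loss).item()
```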
- And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Supervised Autoencoders Learn Robust Joint Factor Models of Neural Activity [2.8402080392117752]
Neuroscience applications collect high-dimensional 'predictors' corresponding to brain activity in different regions along with behavioral outcomes.
Joint factor models for the predictors and outcomes are natural, but maximum likelihood estimates of these models can struggle in practice when there is model misspecification.
We propose an alternative inference strategy based on supervised autoencoders; rather than placing a probability distribution on the latent factors, we define them as an unknown function of the high-dimensional predictors (a minimal sketch follows below).
arXiv Detail & Related papers (2020-04-10T19:31:57Z)
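A minimal supervised-autoencoder sketch in PyTorch (an illustration of the general idea with assumed layer sizes, not the paper's model): the latent factors are a deterministic function of the predictors, trained jointly to reconstruct the predictors and to predict the behavioral outcome.

```python
import torch.nn as nn
import torch.nn.functional as F

class SupervisedAutoencoder(nn.Module):
    def __init__(self, n_predictors, n_factors, n_outcomes):
        super().__init__()
        # latent factors are a deterministic function of the high-dimensional predictors
        self.encoder = nn.Sequential(
            nn.Linear(n_predictors, 128), nn.ReLU(), nn.Linear(128, n_factors))
        self.decoder = nn.Linear(n_factors, n_predictors)      # reconstruct brain activity
        self.outcome_head = nn.Linear(n_factors, n_outcomes)   # predict behavior

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.outcome_head(z)

def supervised_ae_loss(x, y, recon, pred, alpha=1.0):
    # joint objective: reconstruct the predictors and fit the behavioral outcome
    return F.mse_loss(recon, x) + alpha * F.mse_loss(pred, y)
```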
This list is automatically generated from the titles and abstracts of the papers on this site.