Quantifying and Understanding Adversarial Examples in Discrete Input
Spaces
- URL: http://arxiv.org/abs/2112.06276v1
- Date: Sun, 12 Dec 2021 16:44:09 GMT
- Title: Quantifying and Understanding Adversarial Examples in Discrete Input
Spaces
- Authors: Volodymyr Kuleshov, Evgenii Nikishin, Shantanu Thakoor, Tingfung Lau,
Stefano Ermon
- Abstract summary: We formalize a notion of synonymous adversarial examples that applies in any discrete setting and describe a simple domain-agnostic algorithm to construct such examples.
Our work is a step towards a domain-agnostic treatment of discrete adversarial examples analogous to that of continuous inputs.
- Score: 70.18815080530801
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern classification algorithms are susceptible to adversarial
examples--perturbations to inputs that cause the algorithm to produce
undesirable behavior. In this work, we seek to understand and extend
adversarial examples across domains in which inputs are discrete, particularly
across new domains, such as computational biology. As a step towards this goal,
we formalize a notion of synonymous adversarial examples that applies in any
discrete setting and describe a simple domain-agnostic algorithm to construct
such examples. We apply this algorithm across multiple domains--including
sentiment analysis and DNA sequence classification--and find that it
consistently uncovers adversarial examples. We seek to understand their
prevalence theoretically and we attribute their existence to spurious token
correlations, a statistical phenomenon that is specific to discrete spaces. Our
work is a step towards a domain-agnostic treatment of discrete adversarial
examples analogous to that of continuous inputs.
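As a rough illustration of what such a domain-agnostic construction might look like (a sketch under stated assumptions, not the authors' exact algorithm), the snippet below implements a generic greedy synonym-substitution attack. Here `predict_proba` and `synonyms` are hypothetical user-supplied callables giving class probabilities and a per-token equivalence set (e.g., word synonyms for text, synonymous codons for DNA), and `max_swaps` bounds the number of substitutions so the perturbed input stays close to the original.

```python
# Illustrative sketch only: a generic greedy synonym-substitution attack on a
# discrete token sequence. `predict_proba` and `synonyms` are hypothetical
# placeholders supplied by the user for their own domain; this is not the
# paper's exact procedure.
from typing import Callable, List, Sequence


def greedy_synonymous_attack(
    tokens: Sequence[str],
    true_label: int,
    predict_proba: Callable[[List[str]], List[float]],  # class probabilities for a token list
    synonyms: Callable[[str], List[str]],                # domain-specific equivalence set for a token
    max_swaps: int = 5,
) -> List[str]:
    """Greedily swap tokens for 'synonyms' to reduce confidence in true_label."""
    current = list(tokens)
    for _ in range(max_swaps):
        base_score = predict_proba(current)[true_label]
        best_swap, best_score = None, base_score
        # Try every single-token substitution and keep the one that lowers the
        # classifier's confidence in the true class the most.
        for i, tok in enumerate(current):
            for alt in synonyms(tok):
                if alt == tok:
                    continue
                candidate = current[:i] + [alt] + current[i + 1:]
                score = predict_proba(candidate)[true_label]
                if score < best_score:
                    best_swap, best_score = (i, alt), score
        if best_swap is None:  # no substitution reduces confidence; stop early
            break
        i, alt = best_swap
        current[i] = alt
        # Stop once the predicted label no longer matches the true class.
        probs = predict_proba(current)
        if max(range(len(probs)), key=probs.__getitem__) != true_label:
            break
    return current
```

In this sketch the notion of "synonymy" is delegated entirely to the `synonyms` callable, which is what would let the same loop apply to text, DNA sequences, or any other discrete alphabet.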
Related papers
- Domain-Class Correlation Decomposition for Generalizable Person
Re-Identification [34.813965300584776]
In person re-identification, the domain and class are correlated.
We show that domain adversarial learning loses certain class-relevant information due to this domain-class correlation.
Our model outperforms the state-of-the-art methods on the large-scale domain generalization Re-ID benchmark.
arXiv Detail & Related papers (2021-06-29T09:45:03Z)
- Generating Contrastive Explanations for Inductive Logic Programming
Based on a Near Miss Approach [0.7734726150561086]
We introduce an explanation generation algorithm for relational concepts learned with Inductive Logic Programming (GeNME).
A modified rule which covers the near miss but not the original instance is given as an explanation.
We also present a psychological experiment comparing human preferences of rule-based, example-based, and near miss explanations in the family and the arches domains.
arXiv Detail & Related papers (2021-06-15T11:42:05Z)
- A Bit More Bayesian: Domain-Invariant Learning with Uncertainty [111.22588110362705]
Domain generalization is challenging due to the domain shift and the uncertainty caused by the inaccessibility of target domain data.
In this paper, we address both challenges with a probabilistic framework based on variational Bayesian inference.
We derive domain-invariant representations and classifiers, which are jointly established in a two-layer Bayesian neural network.
arXiv Detail & Related papers (2021-05-09T21:33:27Z)
- Adversarial Examples in Constrained Domains [29.137629314003423]
We investigate whether constrained domains are less vulnerable than unconstrained domains to adversarial example generation algorithms.
Our approaches generate misclassification rates in constrained domains that are comparable to those of unconstrained domains.
Our investigation shows that the narrow attack surface exposed by constrained domains is still sufficiently large to craft successful adversarial examples.
arXiv Detail & Related papers (2020-11-02T18:19:44Z)
- A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
arXiv Detail & Related papers (2020-09-09T18:19:31Z)
- Learning explanations that are hard to vary [75.30552491694066]
We show that averaging across examples can favor memorization and "patchwork" solutions that sew together different strategies.
We then propose and experimentally validate a simple alternative algorithm based on a logical AND.
arXiv Detail & Related papers (2020-09-01T10:17:48Z)
- Representation via Representations: Domain Generalization via
Adversarially Learned Invariant Representations [14.751829773340537]
We examine adversarial censoring techniques for learning invariant representations from multiple "studies" (or domains).
In many contexts, such as medical forecasting, domain generalization from studies in populous areas provides fairness of a different flavor, not anticipated in previous work on algorithmic fairness.
arXiv Detail & Related papers (2020-06-20T02:35:03Z)
- Self-training Avoids Using Spurious Features Under Domain Shift [54.794607791641745]
In unsupervised domain adaptation, conditional entropy minimization and pseudo-labeling work even when the domain shifts are much larger than those analyzed by existing theory.
We identify and analyze one particular setting where the domain shift can be large, but certain spurious features correlate with the label in the source domain yet are independent of the label in the target domain.
arXiv Detail & Related papers (2020-06-17T17:51:42Z)
- Domain Knowledge Alleviates Adversarial Attacks in Multi-Label
Classifiers [34.526394646264734]
Adversarial attacks on machine learning-based classifiers, along with defense mechanisms, have been widely studied.
In this paper, we shift the attention to multi-label classification, where the availability of domain knowledge may offer a natural way to spot incoherent predictions.
We explore this intuition in a framework in which first-order logic knowledge is converted into constraints and injected into a semi-supervised learning problem.
arXiv Detail & Related papers (2020-06-06T10:24:54Z)
- Differential Treatment for Stuff and Things: A Simple Unsupervised
Domain Adaptation Method for Semantic Segmentation [105.96860932833759]
State-of-the-art approaches prove that performing semantic-level alignment is helpful in tackling the domain shift issue.
We propose to improve the semantic-level alignment with different strategies for stuff regions and for things.
We further show that our method helps ease this issue by minimizing the most similar stuff and instance features between the source and the target domains.
arXiv Detail & Related papers (2020-03-18T04:43:25Z)