Quantifying and Understanding Adversarial Examples in Discrete Input
Spaces
- URL: http://arxiv.org/abs/2112.06276v1
- Date: Sun, 12 Dec 2021 16:44:09 GMT
- Title: Quantifying and Understanding Adversarial Examples in Discrete Input
Spaces
- Authors: Volodymyr Kuleshov, Evgenii Nikishin, Shantanu Thakoor, Tingfung Lau,
Stefano Ermon
- Abstract summary: We formalize a notion of synonymous adversarial examples that applies in any discrete setting and describe a simple domain-agnostic algorithm to construct such examples.
Our work is a step towards a domain-agnostic treatment of discrete adversarial examples analogous to that of continuous inputs.
- Score: 70.18815080530801
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern classification algorithms are susceptible to adversarial
examples--perturbations to inputs that cause the algorithm to produce
undesirable behavior. In this work, we seek to understand and extend
adversarial examples across domains in which inputs are discrete, particularly
across new domains, such as computational biology. As a step towards this goal,
we formalize a notion of synonymous adversarial examples that applies in any
discrete setting and describe a simple domain-agnostic algorithm to construct
such examples. We apply this algorithm across multiple domains--including
sentiment analysis and DNA sequence classification--and find that it
consistently uncovers adversarial examples. We seek to understand their
prevalence theoretically and we attribute their existence to spurious token
correlations, a statistical phenomenon that is specific to discrete spaces. Our
work is a step towards a domain-agnostic treatment of discrete adversarial
examples analogous to that of continuous inputs.
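As a rough illustration of what such a domain-agnostic construction might look like (a sketch under stated assumptions, not the authors' exact algorithm), the snippet below implements a generic greedy synonym-substitution attack. Here `predict_proba` and `synonyms` are hypothetical user-supplied callables giving class probabilities and a per-token equivalence set (e.g., word synonyms for text, synonymous codons for DNA), and `max_swaps` bounds the number of substitutions so the perturbed input stays close to the original.

```python
# Illustrative sketch only: a generic greedy synonym-substitution attack on a
# discrete token sequence. `predict_proba` and `synonyms` are hypothetical
# placeholders supplied by the user for their own domain; this is not the
# paper's exact procedure.
from typing import Callable, List, Sequence


def greedy_synonymous_attack(
    tokens: Sequence[str],
    true_label: int,
    predict_proba: Callable[[List[str]], List[float]],  # class probabilities for a token list
    synonyms: Callable[[str], List[str]],                # domain-specific equivalence set for a token
    max_swaps: int = 5,
) -> List[str]:
    """Greedily swap tokens for 'synonyms' to reduce confidence in true_label."""
    current = list(tokens)
    for _ in range(max_swaps):
        base_score = predict_proba(current)[true_label]
        best_swap, best_score = None, base_score
        # Try every single-token substitution and keep the one that lowers the
        # classifier's confidence in the true class the most.
        for i, tok in enumerate(current):
            for alt in synonyms(tok):
                if alt == tok:
                    continue
                candidate = current[:i] + [alt] + current[i + 1:]
                score = predict_proba(candidate)[true_label]
                if score < best_score:
                    best_swap, best_score = (i, alt), score
        if best_swap is None:  # no substitution reduces confidence; stop early
            break
        i, alt = best_swap
        current[i] = alt
        # Stop once the predicted label no longer matches the true class.
        probs = predict_proba(current)
        if max(range(len(probs)), key=probs.__getitem__) != true_label:
            break
    return current
```

In this sketch the notion of "synonymy" is delegated entirely to the `synonyms` callable, which is what would let the same loop apply to text, DNA sequences, or any other discrete alphabet.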
Related papers
- Domain-Class Correlation Decomposition for Generalizable Person
Re-Identification [34.813965300584776]
In person re-identification, the domain and class are correlated.
We show that domain adversarial learning loses certain class-relevant information due to this domain-class correlation.
Our model outperforms the state-of-the-art methods on the large-scale domain generalization Re-ID benchmark.
arXiv Detail & Related papers (2021-06-29T09:45:03Z)
- Generating Contrastive Explanations for Inductive Logic Programming
Based on a Near Miss Approach [0.7734726150561086]
We introduce an explanation generation algorithm for relational concepts learned with Inductive Logic Programming (GeNME).
A modified rule which covers the near miss but not the original instance is given as an explanation.
We also present a psychological experiment comparing human preferences of rule-based, example-based, and near miss explanations in the family and the arches domains.
arXiv Detail & Related papers (2021-06-15T11:42:05Z)
- A Bit More Bayesian: Domain-Invariant Learning with Uncertainty [111.22588110362705]
Domain generalization is challenging due to the domain shift and the uncertainty caused by the inaccessibility of target domain data.
In this paper, we address both challenges with a probabilistic framework based on variational Bayesian inference.
We derive domain-invariant representations and classifiers, which are jointly established in a two-layer Bayesian neural network.
arXiv Detail & Related papers (2021-05-09T21:33:27Z)
- Adversarial Examples in Constrained Domains [29.137629314003423]
We investigate whether constrained domains are less vulnerable than unconstrained domains to adversarial example generation algorithms.
Our approaches generate misclassification rates in constrained domains that are comparable to those of unconstrained domains.
Our investigation shows that the narrow attack surface exposed by constrained domains is still sufficiently large to craft successful adversarial examples.
arXiv Detail & Related papers (2020-11-02T18:19:44Z)
- A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
arXiv Detail & Related papers (2020-09-09T18:19:31Z)
- Learning explanations that are hard to vary [75.30552491694066]
We show that averaging across examples can favor memorization and "patchwork" solutions that sew together different strategies.
We then propose and experimentally validate a simple alternative algorithm based on a logical AND.
arXiv Detail & Related papers (2020-09-01T10:17:48Z)
- Representation via Representations: Domain Generalization via
Adversarially Learned Invariant Representations [14.751829773340537]
We examine adversarial censoring techniques for learning invariant representations from multiple "studies" (or domains).
In many contexts, such as medical forecasting, domain generalization from studies in populous areas provides fairness of a different flavor, not anticipated in previous work on algorithmic fairness.
arXiv Detail & Related papers (2020-06-20T02:35:03Z)
- Self-training Avoids Using Spurious Features Under Domain Shift [54.794607791641745]
In unsupervised domain adaptation, conditional entropy minimization and pseudo-labeling work even when the domain shifts are much larger than those analyzed by existing theory.
We identify and analyze one particular setting where the domain shift can be large, but certain spurious features correlate with the label in the source domain yet are independent of the label in the target domain.
arXiv Detail & Related papers (2020-06-17T17:51:42Z)
- Domain Knowledge Alleviates Adversarial Attacks in Multi-Label
Classifiers [34.526394646264734]
Adversarial attacks on machine learning-based classifiers, along with defense mechanisms, have been widely studied.
In this paper, we shift the attention to multi-label classification, where the availability of domain knowledge may offer a natural way to spot incoherent predictions.
We explore this intuition in a framework in which first-order logic knowledge is converted into constraints and injected into a semi-supervised learning problem.
arXiv Detail & Related papers (2020-06-06T10:24:54Z)
- Differential Treatment for Stuff and Things: A Simple Unsupervised
Domain Adaptation Method for Semantic Segmentation [105.96860932833759]
State-of-the-art approaches prove that performing semantic-level alignment is helpful in tackling the domain shift issue.
We propose to improve the semantic-level alignment with different strategies for stuff regions and for things.
We further show that our method helps ease this issue by minimizing the most similar stuff and instance features between the source and the target domains.
arXiv Detail & Related papers (2020-03-18T04:43:25Z)