Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization
- URL: http://arxiv.org/abs/2002.11798v2
- Date: Sun, 5 Jul 2020 15:18:54 GMT
- Title: Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization
- Authors: Sicheng Zhu, Xiao Zhang, David Evans
- Abstract summary: Training machine learning models that are robust against adversarial inputs poses seemingly insurmountable challenges.
We develop a notion of representation vulnerability that captures the maximum change of mutual information between the input and output distributions under the worst-case input perturbation.
We propose an unsupervised learning method for obtaining intrinsically robust representations by maximizing the worst-case mutual information.
- Score: 15.087280646796527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training machine learning models that are robust against adversarial inputs
poses seemingly insurmountable challenges. To better understand adversarial
robustness, we consider the underlying problem of learning robust
representations. We develop a notion of representation vulnerability that
captures the maximum change of mutual information between the input and output
distributions, under the worst-case input perturbation. Then, we prove a
theorem that establishes a lower bound on the minimum adversarial risk that can
be achieved for any downstream classifier based on its representation
vulnerability. We propose an unsupervised learning method for obtaining
intrinsically robust representations by maximizing the worst-case mutual
information between the input and output distributions. Experiments on
downstream classification tasks support the robustness of the representations
found using unsupervised learning with our training principle.
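In symbols, the two central notions can be sketched as follows. This is a schematic rendering consistent with the abstract; treat the perturbation set \mathcal{B}_{\epsilon} as a placeholder for the paper's exact constraint:

```latex
% Representation vulnerability of a representation f: the worst-case drop
% in mutual information when the input X is replaced by an epsilon-bounded
% adversarial counterpart X' (the set B_eps is schematic).
RV(f) \;=\; I\big(X; f(X)\big) \;-\; \min_{X' \in \mathcal{B}_{\epsilon}(X)} I\big(X'; f(X')\big)

% Training principle: choose the representation that maximizes the
% worst-case mutual information between input and output.
f^{*} \;=\; \arg\max_{f \in \mathcal{F}} \; \min_{X' \in \mathcal{B}_{\epsilon}(X)} I\big(X'; f(X')\big)
```

The sketch below illustrates how such a max-min objective could be optimized in practice: an inner PGD-style loop searches for the perturbation that minimizes a neural (MINE-style) lower bound on mutual information, and an outer step updates the encoder and critic to raise that worst-case bound. All names (`Critic`, `mine_lower_bound`, `pgd_minimize_mi`) and the step sizes are illustrative assumptions, not the authors' code:

```python
# Schematic sketch of worst-case mutual information maximization, assuming
# a MINE-style lower bound on I(X'; f(X')) and PGD over an l_inf ball of
# radius eps. Not the authors' implementation.
import math
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Scores (input, representation) pairs for a MINE-style bound.
    Inputs are assumed to be flattened float vectors."""
    def __init__(self, x_dim, z_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=-1))

def mine_lower_bound(critic, x, z):
    # I(X; Z) >= E_P[T(x,z)] - log E_{P_X x P_Z}[exp T(x,z)],
    # with the product of marginals approximated by shuffling the batch.
    joint = critic(x, z).mean()
    marginal = critic(x, z[torch.randperm(z.size(0))])
    log_mean_exp = torch.logsumexp(marginal, dim=0) - math.log(marginal.size(0))
    return joint - log_mean_exp.squeeze()

def pgd_minimize_mi(encoder, critic, x, eps, steps=5):
    # Inner problem: find the perturbation that destroys the most information.
    alpha = 2.5 * eps / steps
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        mi = mine_lower_bound(critic, x_adv, encoder(x_adv))
        grad, = torch.autograd.grad(mi, x_adv)
        x_adv = x_adv.detach() - alpha * grad.sign()  # gradient *descent* on MI
        x_adv = x + (x_adv - x).clamp(-eps, eps)      # project onto the l_inf ball
    return x_adv.detach()

def train_step(encoder, critic, optimizer, x, eps):
    # Outer problem: raise the worst-case MI lower bound.
    # `optimizer` is assumed to hold both encoder and critic parameters.
    x_adv = pgd_minimize_mi(encoder, critic, x, eps)
    loss = -mine_lower_bound(critic, x_adv, encoder(x_adv))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return -loss.item()  # current worst-case MI estimate
```

As with standard PGD-based adversarial training, a few inner gradient steps per batch are typically enough for the inner minimization to be informative.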
Related papers
- Representation Learning with Conditional Information Flow Maximization [29.36409607847339]
This paper proposes an information-theoretic representation learning framework named conditional information flow maximization.
It encourages learned representations to have good feature uniformity and sufficient predictive ability.
Experiments show that the learned representations are more sufficient, robust, and transferable.
arXiv Detail & Related papers (2024-06-08T16:19:18Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many of the predictive signals in the data may instead come from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- Disentangled Text Representation Learning with Information-Theoretic Perspective for Adversarial Robustness [17.5771010094384]
Adversarial vulnerability remains a major obstacle to constructing reliable NLP systems.
Recent work argues that a model's adversarial vulnerability is caused by non-robust features learned during supervised training.
In this paper, we tackle the adversarial challenge from the view of disentangled representation learning.
arXiv Detail & Related papers (2022-10-26T18:14:39Z)
- Correlation Information Bottleneck: Towards Adapting Pretrained Multimodal Models for Robust Visual Question Answering [63.87200781247364]
Correlation Information Bottleneck (CIB) seeks a tradeoff between compression and redundancy in representations.
We derive a tight theoretical upper bound for the mutual information between multimodal inputs and representations.
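For context, the bottleneck objectives referenced in this and the previous entry descend from the classical information bottleneck, which makes the compression-prediction tradeoff explicit. The generic form below (with the standard tradeoff coefficient \beta) is background knowledge, not the CIB objective itself:

```latex
% Classical information bottleneck: compress the input X into a
% representation Z while preserving information about the target Y.
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
```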
arXiv Detail & Related papers (2022-09-14T22:04:10Z)
- Robust Transferable Feature Extractors: Learning to Defend Pre-Trained Networks Against White Box Adversaries [69.53730499849023]
We show that adversarial examples can be successfully transferred to another independently trained model to induce prediction errors.
We propose a deep learning-based pre-processing mechanism, which we refer to as a robust transferable feature extractor (RTFE).
arXiv Detail & Related papers (2022-09-14T21:09:34Z)
- Hybrid Generative-Contrastive Representation Learning [32.84066504783469]
We show that a transformer-based encoder-decoder architecture trained with both contrastive and generative losses can learn highly discriminative and robust representations without hurting the generative performance.
arXiv Detail & Related papers (2021-06-11T04:23:48Z)
- Disambiguation of weak supervision with exponential convergence rates [88.99819200562784]
In weakly supervised learning, data are annotated with incomplete yet discriminative information.
In this paper, we focus on partial labelling, an instance of weak supervision where, from a given input, we are given a set of potential targets.
We propose an empirical disambiguation algorithm to recover full supervision from weak supervision.
arXiv Detail & Related papers (2021-02-04T18:14:32Z)
- Stylized Adversarial Defense [105.88250594033053]
Adversarial training creates perturbation patterns and includes them in the training set to robustify the model.
We propose to exploit additional information from the feature space to craft stronger adversaries.
Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses.
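The recipe summarized above is the standard min-max formulation of adversarial training; this paper's contribution lies in how the inner maximization crafts stronger, feature-space-informed perturbations:

```latex
% Standard adversarial training objective: minimize the expected loss
% under the worst-case epsilon-bounded perturbation of each input.
\min_{\theta} \; \mathbb{E}_{(x, y)} \Big[ \max_{\|\delta\|_{\infty} \le \epsilon} \mathcal{L}\big(f_{\theta}(x + \delta),\, y\big) \Big]
```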
arXiv Detail & Related papers (2020-07-29T08:38:10Z)
- Derivation of Information-Theoretically Optimal Adversarial Attacks with Applications to Robust Machine Learning [11.206758778146288]
We consider the theoretical problem of designing an optimal adversarial attack on a decision system.
We present derivations of the optimal adversarial attacks for discrete and continuous signals of interest.
We show that adversarial attacks aimed at minimizing mutual information are much harder to achieve when multiple redundant copies of the input signal are available.
arXiv Detail & Related papers (2020-07-28T07:45:25Z) - Learning Diverse Representations for Fast Adaptation to Distribution
Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and accepts no responsibility for any consequences of its use.