Tighter Bounds on the Information Bottleneck with Application to Deep
Learning
- URL: http://arxiv.org/abs/2402.07639v1
- Date: Mon, 12 Feb 2024 13:24:32 GMT
- Title: Tighter Bounds on the Information Bottleneck with Application to Deep
Learning
- Authors: Nir Weingarten, Zohar Yakhini, Moshe Butman, Ran Gilad-Bachrach
- Abstract summary: Deep Neural Nets (DNNs) learn latent representations induced by their downstream task, objective function, and other parameters.
The Information Bottleneck (IB) provides a hypothetically optimal framework for data modeling, yet it is often intractable.
Recent efforts combined DNNs with the IB by applying VAE-inspired variational methods to approximate bounds on mutual information, resulting in improved robustness to adversarial attacks.
- Score: 6.206127662604578
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Nets (DNNs) learn latent representations induced by their
downstream task, objective function, and other parameters. The quality of the
learned representations impacts the DNN's generalization ability and the
coherence of the emerging latent space. The Information Bottleneck (IB)
provides a hypothetically optimal framework for data modeling, yet it is often
intractable. Recent efforts combined DNNs with the IB by applying VAE-inspired
variational methods to approximate bounds on mutual information, resulting in
improved robustness to adversarial attacks. This work introduces a new and
tighter variational bound for the IB, improving performance of previous
IB-inspired DNNs. These advancements strengthen the case for the IB and its
variational approximations as a data modeling framework, and provide a simple
method to significantly enhance the adversarial robustness of classifier DNNs.
Related papers
- BiDense: Binarization for Dense Prediction [62.70804353158387]
BiDense is a generalized binary neural network (BNN) designed for efficient and accurate dense prediction tasks.
BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB)
arXiv Detail & Related papers (2024-11-15T16:46:04Z) - Bayesian Entropy Neural Networks for Physics-Aware Prediction [14.705526856205454]
We introduce BENN, a framework designed to impose constraints on Bayesian Neural Network (BNN) predictions.
Benn is capable of constraining not only the predicted values but also their derivatives and variances, ensuring a more robust and reliable model output.
Results highlight significant improvements over traditional BNNs and showcase competitive performance relative to contemporary constrained deep learning methods.
arXiv Detail & Related papers (2024-07-01T07:00:44Z) - Supervised Gradual Machine Learning for Aspect Category Detection [0.9857683394266679]
Aspect Category Detection (ACD) aims to identify implicit and explicit aspects in a given review sentence.
We propose a novel approach to tackle the ACD task by combining Deep Neural Networks (DNNs) with Gradual Machine Learning (GML) in a supervised setting.
arXiv Detail & Related papers (2024-04-08T07:21:46Z) - FedDIP: Federated Learning with Extreme Dynamic Pruning and Incremental
Regularization [5.182014186927254]
Federated Learning (FL) has been successfully adopted for distributed training and inference of large-scale Deep Neural Networks (DNNs)
We contribute with a novel FL framework (coined FedDIP) which combines (i) dynamic model pruning with error feedback to eliminate redundant information exchange.
We provide convergence analysis of FedDIP and report on a comprehensive performance and comparative assessment against state-of-the-art methods.
arXiv Detail & Related papers (2023-09-13T08:51:19Z) - Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z) - Latent Boundary-guided Adversarial Training [61.43040235982727]
Adrial training is proved to be the most effective strategy that injects adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z) - On the benefits of robust models in modulation recognition [53.391095789289736]
Deep Neural Networks (DNNs) using convolutional layers are state-of-the-art in many tasks in communications.
In other domains, like image classification, DNNs have been shown to be vulnerable to adversarial perturbations.
We propose a novel framework to test the robustness of current state-of-the-art models.
arXiv Detail & Related papers (2021-03-27T19:58:06Z) - Gradient-Free Adversarial Attacks for Bayesian Neural Networks [9.797319790710713]
adversarial examples underscore the importance of understanding the robustness of machine learning models.
In this work, we employ gradient-free optimization methods in order to find adversarial examples for BNNs.
arXiv Detail & Related papers (2020-12-23T13:19:11Z) - Towards Robust Neural Networks via Orthogonal Diversity [30.77473391842894]
A series of methods represented by the adversarial training and its variants have proven as one of the most effective techniques in enhancing the Deep Neural Networks robustness.
This paper proposes a novel defense that aims at augmenting the model in order to learn features that are adaptive to diverse inputs, including adversarial examples.
In this way, the proposed DIO augments the model and enhances the robustness of DNN itself as the learned features can be corrected by these mutually-orthogonal paths.
arXiv Detail & Related papers (2020-10-23T06:40:56Z) - Explaining and Improving Model Behavior with k Nearest Neighbor
Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z) - Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.