Related papers: Tighter Bounds on the Information Bottleneck with Application to Deep Learning

URL: http://arxiv.org/abs/2402.07639v1
Date: Mon, 12 Feb 2024 13:24:32 GMT
Title: Tighter Bounds on the Information Bottleneck with Application to Deep Learning
Authors: Nir Weingarten, Zohar Yakhini, Moshe Butman, Ran Gilad-Bachrach
Score: 6.206127662604578
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep Neural Nets (DNNs) learn latent representations induced by their downstream task, objective function, and other parameters. The quality of the learned representations impacts the DNN's generalization ability and the coherence of the emerging latent space. The Information Bottleneck (IB) provides a hypothetically optimal framework for data modeling, yet it is often intractable. Recent efforts combined DNNs with the IB by applying VAE-inspired variational methods to approximate bounds on mutual information, resulting in improved robustness to adversarial attacks. This work introduces a new and tighter variational bound for the IB, improving performance of previous IB-inspired DNNs. These advancements strengthen the case for the IB and its variational approximations as a data modeling framework, and provide a simple method to significantly enhance the adversarial robustness of classifier DNNs.

Related papers

Optimizers Qualitatively Alter Solutions And We Should Leverage This [62.662640460717476]
Deep Neural Networks (DNNs) cannot guarantee convergence to a unique global minimum of the loss when trained with methods that use only local information, such as SGD. We argue that the community should aim to understand the biases of existing methods, and to build new DNNs with the explicit intent of inducing certain properties of the solution.
arXiv Detail & Related papers (2025-07-16T13:33:31Z)
Information-Bottleneck Driven Binary Neural Network for Change Detection [53.866667209237434]
Binarized Change Detection (BiCD) is the first binary neural network (BNN) designed specifically for change detection. We introduce an auxiliary objective based on the Information Bottleneck (IB) principle, guiding the encoder to retain essential input information. BiCD establishes a new benchmark for BNN-based change detection, achieving state-of-the-art performance in this domain.
arXiv Detail & Related papers (2025-07-04T11:56:16Z)
On Local Posterior Structure in Deep Ensembles [5.03927767211033]
Deep ensembles (DEs) are known to improve calibration. We find that when the ensembles grow large enough, DEs consistently outperform deep ensembles of BNNs (DE-BNNs) on in-distribution data. As a final contribution, we open-source the large pool of trained models to facilitate further research on this topic.
arXiv Detail & Related papers (2025-03-17T15:41:39Z)
RelGNN: Composite Message Passing for Relational Deep Learning [56.48834369525997]
We introduce RelGNN, a novel GNN framework specifically designed to capture the unique characteristics of relational databases. At the core of our approach is the introduction of atomic routes, which are sequences of nodes forming high-order tripartite structures. RelGNN consistently achieves state-of-the-art accuracy with up to 25% improvement.
arXiv Detail & Related papers (2025-02-10T18:58:40Z)
Structured IB: Improving Information Bottleneck with Structured Feature Learning [32.774660308233635]
We introduce Structured IB, a framework for investigating potential structured features. Our experiments demonstrate superior prediction accuracy and task-relevant information compared to the original IB Lagrangian method.
arXiv Detail & Related papers (2024-12-11T09:17:45Z)
BiDense: Binarization for Dense Prediction [62.70804353158387]
BiDense is a generalized binary neural network (BNN) designed for efficient and accurate dense prediction tasks. BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB).
arXiv Detail & Related papers (2024-11-15T16:46:04Z)
Bayesian Entropy Neural Networks for Physics-Aware Prediction [14.705526856205454]
We introduce BENN, a framework designed to impose constraints on Bayesian Neural Network (BNN) predictions. BENN is capable of constraining not only the predicted values but also their derivatives and variances, ensuring a more robust and reliable model output. Results highlight significant improvements over traditional BNNs and showcase competitive performance relative to contemporary constrained deep learning methods.
arXiv Detail & Related papers (2024-07-01T07:00:44Z)
Supervised Gradual Machine Learning for Aspect Category Detection [0.9857683394266679]
Aspect Category Detection (ACD) aims to identify implicit and explicit aspects in a given review sentence. We propose a novel approach to tackle the ACD task by combining Deep Neural Networks (DNNs) with Gradual Machine Learning (GML) in a supervised setting.
arXiv Detail & Related papers (2024-04-08T07:21:46Z)
FedDIP: Federated Learning with Extreme Dynamic Pruning and Incremental Regularization [5.182014186927254]
Federated Learning (FL) has been successfully adopted for distributed training and inference of large-scale Deep Neural Networks (DNNs). We contribute a novel FL framework (coined FedDIP) which combines (i) dynamic model pruning with error feedback to eliminate redundant information exchange. We provide a convergence analysis of FedDIP and report on a comprehensive performance and comparative assessment against state-of-the-art methods.
arXiv Detail & Related papers (2023-09-13T08:51:19Z)
Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors. Our work is the first attempt to optimize BNNs from the bilinear perspective. We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training, which injects adversarial examples into model training, has proved to be the most effective strategy. We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
On the benefits of robust models in modulation recognition [53.391095789289736]
Deep Neural Networks (DNNs) using convolutional layers are state-of-the-art in many tasks in communications. In other domains, like image classification, DNNs have been shown to be vulnerable to adversarial perturbations. We propose a novel framework to test the robustness of current state-of-the-art models.
arXiv Detail & Related papers (2021-03-27T19:58:06Z)
Gradient-Free Adversarial Attacks for Bayesian Neural Networks [9.797319790710713]
Adversarial examples underscore the importance of understanding the robustness of machine learning models. In this work, we employ gradient-free optimization methods in order to find adversarial examples for BNNs.
arXiv Detail & Related papers (2020-12-23T13:19:11Z)
Towards Robust Neural Networks via Orthogonal Diversity [30.77473391842894]
A series of methods, represented by adversarial training and its variants, have proven to be among the most effective techniques for enhancing the robustness of Deep Neural Networks. This paper proposes a novel defense that aims at augmenting the model in order to learn features that are adaptive to diverse inputs, including adversarial examples. In this way, the proposed DIO augments the model and enhances the robustness of the DNN itself, as the learned features can be corrected by these mutually-orthogonal paths.
arXiv Detail & Related papers (2020-10-23T06:40:56Z)
Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions. We show that kNN representations are effective at uncovering learned spurious associations. Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.