A Critical Review of Information Bottleneck Theory and its Applications to Deep Learning
- URL: http://arxiv.org/abs/2105.04405v2
- Date: Tue, 11 May 2021 11:50:14 GMT
- Title: A Critical Review of Information Bottleneck Theory and its Applications to Deep Learning
- Authors: Mohammad Ali Alomrani
- Abstract summary: Deep neural networks have seen unparalleled improvements that continue to impact every aspect of today's society.
The information bottleneck theory has emerged as a promising approach to better understand the learning dynamics of neural networks.
The goal of this survey is to provide a comprehensive review of IB theory, covering its information-theoretic roots and the recently proposed applications for understanding deep learning models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the past decade, deep neural networks have seen unparalleled improvements
that continue to impact every aspect of today's society. With the development
of high performance GPUs and the availability of vast amounts of data, learning
capabilities of ML systems have skyrocketed, going from classifying digits in a
picture to beating world champions in games with superhuman performance.
However, even as ML models continue to achieve new frontiers, their practical
success has been hindered by the lack of a deep theoretical understanding of
their inner workings. Fortunately, an information-theoretic framework known as
the information bottleneck (IB) theory has emerged as a promising approach to
better understand the learning dynamics of neural networks. In principle, IB
theory models learning as a trade-off between compressing the input data and
retaining the information relevant to the task. The goal of this survey is to
provide a comprehensive review of IB theory, covering its information-theoretic
roots and the recently proposed applications for understanding deep learning
models.
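For reference, the compression/prediction trade-off described above is usually written as the IB Lagrangian, minimized over stochastic encoders p(t|x) that map the input X to a representation T while preserving information about the target Y (standard Tishby-style notation; beta sets the balance between the two terms):

```latex
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)
```

Here I(X;T) measures how much of the input the representation keeps (compression), and I(T;Y) measures how much task-relevant information it retains.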
Related papers
- To Compress or Not to Compress- Self-Supervised Learning and Information
Theory: A Review [30.87092042943743]
Deep neural networks excel in supervised learning tasks but are constrained by the need for extensive labeled data.
Self-supervised learning emerges as a promising alternative, allowing models to learn without explicit labels.
Information theory, and notably the information bottleneck principle, has been pivotal in shaping deep neural networks.
arXiv Detail & Related papers (2023-04-19T00:33:59Z) - From Actions to Events: A Transfer Learning Approach Using Improved Deep
Belief Networks [1.0554048699217669]
This paper proposes a novel approach to map the knowledge from action recognition to event recognition using an energy-based model.
Such a model can process all frames simultaneously, carrying spatial and temporal information through the learning process.
arXiv Detail & Related papers (2022-11-30T14:47:10Z) - Semi-Supervised and Unsupervised Deep Visual Learning: A Survey [76.2650734930974]
Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data.
We review recent advances in deep learning algorithms for semi-supervised learning (SSL) and unsupervised learning (UL) for visual recognition from a unified perspective.
arXiv Detail & Related papers (2022-08-24T04:26:21Z) - Information Flow in Deep Neural Networks [0.6922389632860545]
There is no comprehensive theoretical understanding of how deep neural networks work or are structured.
Deep networks are often seen as black boxes with unclear interpretations and reliability.
This work aims to apply principles and techniques from information theory to deep learning models to increase our theoretical understanding and design better algorithms.
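A concrete tool that often appears in this line of work is an estimate of the mutual information between a layer's activations and the inputs or labels, commonly obtained by discretizing activations into bins. The sketch below is a generic version of that estimator, not the cited paper's procedure; the bin count, the use of class labels as X, and the toy data are illustrative choices.

```python
import numpy as np

def discretize(activations, n_bins=30):
    """Quantize each activation value into one of n_bins equal-width bins."""
    lo, hi = activations.min(), activations.max()
    edges = np.linspace(lo, hi, n_bins + 1)
    return np.digitize(activations, edges[1:-1])   # shape: (n_samples, n_units)

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def mutual_information(x_labels, t_binned):
    """Estimate I(X;T) = H(T) - H(T|X) from paired discrete samples."""
    # Collapse each binned activation vector into a single hashable symbol.
    t_labels = np.array([hash(row.tobytes()) for row in t_binned])
    h_t = entropy(t_labels)
    # H(T|X): average entropy of T within each group sharing the same X value.
    h_t_given_x = 0.0
    for x in np.unique(x_labels):
        mask = x_labels == x
        h_t_given_x += mask.mean() * entropy(t_labels[mask])
    return h_t - h_t_given_x

# Toy usage: labels as X, noisy label-dependent "hidden activations" as T.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)
hidden = rng.standard_normal((1000, 8)) + 0.5 * labels[:, None]
print("I(label; hidden) estimate (bits):", mutual_information(labels, discretize(hidden)))
```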
arXiv Detail & Related papers (2022-02-10T23:32:26Z) - Cognitively Inspired Learning of Incremental Drifting Concepts [31.3178953771424]
Inspired by learning mechanisms of the nervous system, we develop a computational model that enables a deep neural network to learn new concepts.
Our model can generate pseudo-data points for experience replay and accumulate new experiences to past learned experiences without causing cross-task interference.
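The replay mechanism can be pictured with a short sketch: while training on a new task, pseudo-samples standing in for earlier concepts are drawn from a generative model and mixed into each batch, so old knowledge is rehearsed alongside new data and interference is reduced. Everything below (the network, the placeholder sample_pseudo_data generator, the loss, and the mixing ratio) is a generic illustration, not the paper's actual model.

```python
import torch
import torch.nn as nn

# Generic task model and optimizer (placeholders, not the paper's architecture).
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def sample_pseudo_data(n):
    """Placeholder for a learned generative model that replays past concepts;
    here it simply returns random inputs and labels."""
    return torch.randn(n, 20), torch.randint(0, 10, (n,))

def train_step(new_x, new_y, replay_ratio=0.5):
    """One update on a mix of new-task data and generated replay data, so that
    past concepts are rehearsed while the new concept is being learned."""
    n_replay = int(replay_ratio * len(new_x))
    old_x, old_y = sample_pseudo_data(n_replay)
    x, y = torch.cat([new_x, old_x]), torch.cat([new_y, old_y])
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy new-task mini-batch.
print(train_step(torch.randn(32, 20), torch.randint(0, 10, (32,))))
```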
arXiv Detail & Related papers (2021-10-09T23:26:29Z) - Credit Assignment in Neural Networks through Deep Feedback Control [59.14935871979047]
Deep Feedback Control (DFC) is a new learning method that uses a feedback controller to drive a deep neural network to match a desired output target; the resulting control signal can then be used for credit assignment.
The resulting learning rule is fully local in space and time and approximates Gauss-Newton optimization for a wide range of connectivity patterns.
To further underline its biological plausibility, we relate DFC to a multi-compartment model of cortical pyramidal neurons with a local voltage-dependent synaptic plasticity rule, consistent with recent theories of dendritic processing.
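A deliberately simplified sketch of the control-based idea: a feedback controller injects a signal that nudges the network's activity until the output matches the target, and the same signal drives a purely local weight update (controlled activity minus feedforward drive, times presynaptic input). The code below is a toy illustration under those assumptions, not the authors' exact DFC algorithm, controller dynamics, or Gauss-Newton analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hid, n_out = 8, 16, 4
W1 = rng.standard_normal((n_hid, n_in)) * 0.3    # learned forward weights
W2 = rng.standard_normal((n_out, n_hid)) * 0.3
Q1 = rng.standard_normal((n_hid, n_out)) * 0.3   # fixed feedback weights into the hidden layer
Q2 = np.eye(n_out)                               # control enters the output layer directly
phi, lr, gain, steps = np.tanh, 0.05, 0.3, 30

def dfc_step(x, target):
    """Run the feedback controller, then apply the local plasticity rule."""
    global W1, W2
    u = np.zeros(n_out)                 # control signal
    h = phi(W1 @ x)
    for _ in range(steps):              # controller integrates the output error
        y = W2 @ h + Q2 @ u
        u = u + gain * (target - y)
        h = phi(W1 @ x + Q1 @ u)        # hidden layer is also controlled
    # Local updates: (controlled activity - feedforward drive) x presynaptic input.
    W1 += lr * np.outer(h - phi(W1 @ x), x)
    W2 += lr * np.outer(y - W2 @ h, h)
    return float(np.mean((target - W2 @ phi(W1 @ x)) ** 2))

# Toy usage: fit a single random input/target pair.
x, t = rng.standard_normal(n_in), rng.standard_normal(n_out)
for _ in range(50):
    err = dfc_step(x, t)
print("final squared error on the toy target:", err)
```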
arXiv Detail & Related papers (2021-06-15T05:30:17Z) - Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
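A standard example of such a hybrid is algorithm unrolling: a classical iterative solver built around a known measurement model is unrolled for a fixed number of iterations, and a few of its internal quantities (here a step size and a shrinkage threshold) are learned from data. The sketch below uses unrolled ISTA for sparse recovery as an illustration; it is one member of the family the paper surveys, not its specific method, and all names and data are toy stand-ins.

```python
import torch
import torch.nn as nn

class UnrolledISTA(nn.Module):
    """Unroll ISTA iterations for y = A x + noise with learnable step/threshold."""
    def __init__(self, A, n_iters=10):
        super().__init__()
        self.register_buffer("A", A)                     # known measurement model
        self.step = nn.Parameter(torch.tensor(0.1))      # learned step size
        self.thresh = nn.Parameter(torch.tensor(0.05))   # learned soft threshold
        self.n_iters = n_iters

    def forward(self, y):
        x = torch.zeros(y.shape[0], self.A.shape[1], device=y.device)
        for _ in range(self.n_iters):
            # Model-based part: gradient step on ||y - A x||^2 using the known A.
            grad = (x @ self.A.T - y) @ self.A
            x = x - self.step * grad
            # Data-driven part: learned soft-thresholding enforcing sparsity.
            x = torch.sign(x) * torch.relu(x.abs() - self.thresh)
        return x

# Toy usage with a random measurement matrix standing in for a physical model.
A = torch.randn(20, 50) / 20 ** 0.5
net = UnrolledISTA(A)
x_true = torch.zeros(8, 50)
x_true[:, :3] = 1.0                                      # sparse ground-truth signals
y = x_true @ A.T + 0.01 * torch.randn(8, 20)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = ((net(y) - x_true) ** 2).mean()
    loss.backward()
    opt.step()
print("reconstruction MSE:", loss.item())
```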
arXiv Detail & Related papers (2020-12-15T16:29:49Z) - Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works aimed at achieving Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z) - Deep Knowledge Tracing with Learning Curves [0.9088303226909278]
We propose a Convolution-Augmented Knowledge Tracing (CAKT) model in this paper.
The model employs three-dimensional convolutional neural networks to explicitly learn a student's recent experience in applying the same knowledge concept as the one in the next question.
CAKT achieves new state-of-the-art performance in predicting students' responses compared with existing models.
arXiv Detail & Related papers (2020-07-26T15:24:51Z) - Towards Interpretable Deep Learning Models for Knowledge Tracing [62.75876617721375]
We propose to adopt the post-hoc method to tackle the interpretability issue for deep learning based knowledge tracing (DLKT) models.
Specifically, we focus on applying the layer-wise relevance propagation (LRP) method to interpret an RNN-based DLKT model.
Experimental results show the feasibility of using the LRP method to interpret the DLKT model's predictions.
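For context, the core LRP step redistributes an output neuron's relevance to its inputs in proportion to their contributions to the pre-activation (the epsilon-rule). The sketch below shows that building block on a small feedforward network; applying it to an RNN-based DLKT model, as the paper does, chains such steps through layers and time. The network and data here are toy stand-ins.

```python
import numpy as np

def lrp_dense(a_in, W, b, relevance_out, eps=1e-6):
    """LRP epsilon-rule for one dense layer: redistribute each output neuron's
    relevance to the inputs in proportion to their contributions a_i * w_ij."""
    z = a_in @ W + b                    # pre-activations, shape (n_out,)
    z = z + eps * np.sign(z)            # stabilizer to avoid division by zero
    s = relevance_out / z               # shape (n_out,)
    return a_in * (W @ s)               # relevance per input, shape (n_in,)

# Toy two-layer network: relevance starts at the output and flows back to the input.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 8)), np.zeros(8)
W2, b2 = rng.standard_normal((8, 3)), np.zeros(3)
x = rng.standard_normal(4)
h = np.maximum(x @ W1 + b1, 0)          # ReLU hidden layer
out = h @ W2 + b2
R_out = np.zeros(3)
R_out[out.argmax()] = out[out.argmax()]  # explain the winning output neuron
R_hidden = lrp_dense(h, W2, b2, R_out)
R_input = lrp_dense(x, W1, b1, R_hidden)
print("input relevances:", R_input, "sum:", R_input.sum())
```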
arXiv Detail & Related papers (2020-05-13T04:03:21Z) - Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G
Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address open problems in this area, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.