A Theory of Usable Information Under Computational Constraints
- URL: http://arxiv.org/abs/2002.10689v1
- Date: Tue, 25 Feb 2020 06:09:30 GMT
- Title: A Theory of Usable Information Under Computational Constraints
- Authors: Yilun Xu, Shengjia Zhao, Jiaming Song, Russell Stewart, Stefano Ermon
- Abstract summary: We propose a new framework for reasoning about information in complex systems.
Our foundation is based on a variational extension of Shannon's information theory.
We show that by incorporating computational constraints, $\mathcal{V}$-information can be reliably estimated from data.
- Score: 103.5901638681034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new framework for reasoning about information in complex
systems. Our foundation is based on a variational extension of Shannon's
information theory that takes into account the modeling power and computational
constraints of the observer. The resulting \emph{predictive
$\mathcal{V}$-information} encompasses mutual information and other notions of
informativeness such as the coefficient of determination. Unlike Shannon's
mutual information and in violation of the data processing inequality,
$\mathcal{V}$-information can be created through computation. This is
consistent with deep neural networks extracting hierarchies of progressively
more informative features in representation learning. Additionally, we show
that by incorporating computational constraints, $\mathcal{V}$-information can
be reliably estimated from data even in high dimensions with PAC-style
guarantees. Empirically, we demonstrate predictive $\mathcal{V}$-information is
more effective than mutual information for structure learning and fair
representation learning.
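The predictive $\mathcal{V}$-information of the abstract can be written as $I_\mathcal{V}(X \to Y) = H_\mathcal{V}(Y \mid \varnothing) - H_\mathcal{V}(Y \mid X)$: the drop in predictive cross-entropy when the observer is allowed to condition on $X$, with the infimum taken only over the constrained family $\mathcal{V}$. A minimal sketch of the plug-in estimate follows; the choice of logistic regression as the family $\mathcal{V}$ and the synthetic data are illustrative assumptions, and the train/test splitting behind the paper's PAC-style guarantees is omitted for brevity:

```python
# Sketch: empirical predictive V-information I_V(X -> Y), assuming the
# predictive family V is logistic regression (an illustrative choice).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)

# H_V(Y | null): best predictor in V that ignores X is the marginal of Y.
p = y.mean()
h_y = log_loss(y, np.full(n, p))  # cross-entropy in nats

# H_V(Y | X): fit the family V on (X, y), score its predictive log-loss.
clf = LogisticRegression().fit(X, y)
h_y_given_x = log_loss(y, clf.predict_proba(X)[:, 1])

# I_V(X -> Y) = H_V(Y | null) - H_V(Y | X); non-negative when the
# family V contains the marginal predictor, as it does here.
v_information = h_y - h_y_given_x
print(f"I_V(X -> Y) is approximately {v_information:.3f} nats")
```

With an unrestricted family $\mathcal{V}$ this quantity recovers Shannon mutual information; restricting $\mathcal{V}$ is what makes the estimate tractable in high dimensions and lets computation (e.g. a learned feature map) create usable information.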
Related papers
- Surprisal Driven $k$-NN for Robust and Interpretable Nonparametric Learning [1.4293924404819704]
We shed new light on the traditional nearest neighbors algorithm from the perspective of information theory.
We propose a robust and interpretable framework for tasks such as classification, regression, density estimation, and anomaly detection using a single model.
Our work showcases the architecture's versatility by achieving state-of-the-art results in classification and anomaly detection.
arXiv Detail & Related papers (2023-11-17T00:35:38Z) - Explaining Neural Networks without Access to Training Data [8.250944452542502]
We consider generating explanations for neural networks in cases where the network's training data is not accessible.
$\mathcal{I}$-Nets have been proposed as a sample-free approach to post-hoc, global model interpretability.
We extend the $\mathcal{I}$-Net framework to the cases of standard and soft decision trees as surrogate models.
arXiv Detail & Related papers (2022-06-10T06:10:04Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - A Bayesian Framework for Information-Theoretic Probing [51.98576673620385]
We argue that probing should be seen as approximating a mutual information.
This view has led to the rather unintuitive conclusion that representations encode exactly the same information about a target task as the original sentences.
This paper proposes a new framework to measure what we term Bayesian mutual information.
arXiv Detail & Related papers (2021-09-08T18:08:36Z) - Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where our task is not purely opaque.
Our approach paves the way for a new class of data-efficient representation learning.
arXiv Detail & Related papers (2021-07-19T13:57:13Z) - Measuring Information Transfer in Neural Networks [46.37969746096677]
Quantifying the information content in a neural network model is essentially estimating the model's Kolmogorov complexity.
We propose a measure of the generalizable information in a neural network model based on prequential coding.
We show that $L_{IT}$ is consistently correlated with generalizable information and can be used as a measure of patterns or "knowledge" in a model or a dataset.
arXiv Detail & Related papers (2020-09-16T12:06:42Z) - The Information Bottleneck Problem and Its Applications in Machine Learning [53.57797720793437]
Inference capabilities of machine learning systems have skyrocketed in recent years, and they now play a pivotal role in various aspects of society.
The information bottleneck (IB) theory emerged as a bold information-theoretic paradigm for analyzing deep learning (DL) systems.
In this tutorial we survey the information-theoretic origins of this abstract principle, and its recent impact on DL.
arXiv Detail & Related papers (2020-04-30T16:48:51Z) - Modelling and Quantifying Membership Information Leakage in Machine Learning [14.095523601311374]
We show that complex models, such as deep neural networks, are more susceptible to membership inference attacks.
We show that the amount of membership information leakage is reduced by $\mathcal{O}(\log^{1/2}(\delta^{-1})\epsilon^{-1})$ when using Gaussian $(\epsilon,\delta)$-differentially-private additive noise.
arXiv Detail & Related papers (2020-01-29T00:42:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences of their use.