A Bayesian Framework for Information-Theoretic Probing
- URL: http://arxiv.org/abs/2109.03853v1
- Date: Wed, 8 Sep 2021 18:08:36 GMT
- Title: A Bayesian Framework for Information-Theoretic Probing
- Authors: Tiago Pimentel, Ryan Cotterell
- Abstract summary: Pimentel et al. (2020) argue that probing should be seen as approximating a mutual information.
This led to the rather unintuitive conclusion that representations encode exactly the same information about a target task as the original sentences.
This paper proposes a new framework to measure what we term Bayesian mutual information.
- Score: 51.98576673620385
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pimentel et al. (2020) recently analysed probing from an
information-theoretic perspective. They argue that probing should be seen as
approximating a mutual information. This led to the rather unintuitive
conclusion that representations encode exactly the same information about a
target task as the original sentences. The mutual information, however, assumes
the true probability distribution of a pair of random variables is known,
leading to unintuitive results in settings where it is not. This paper proposes
a new framework to measure what we term Bayesian mutual information, which
analyses information from the perspective of Bayesian agents -- allowing for
more intuitive findings in scenarios with finite data. For instance, under
Bayesian MI we have that data can add information, processing can help, and
information can hurt, which makes it more intuitive for machine learning
applications. Finally, we apply our framework to probing where we believe
Bayesian mutual information naturally operationalises ease of extraction by
explicitly limiting the available background knowledge to solve a task.
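A concrete way to see the finite-data issue the abstract points to: the plug-in estimate of I(X;Y) = sum_{x,y} p(x,y) log[ p(x,y) / (p(x) p(y)) ], computed from empirical counts, is strongly biased on small samples. The sketch below illustrates this motivation only; the alphabet sizes, sample size, and NumPy implementation are our own illustrative choices, not the paper's method. It reports a clearly positive MI estimate for two variables that are in fact independent.

```python
import numpy as np

def plug_in_mi(x, y, kx, ky):
    """Plug-in (maximum-likelihood) estimate of I(X;Y), in nats,
    from paired discrete samples x, y with alphabet sizes kx, ky."""
    n = len(x)
    joint = np.zeros((kx, ky))
    for xi, yi in zip(x, y):
        joint[xi, yi] += 1.0
    joint /= n
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(joint > 0, joint / (px * py), 1.0)
    return float(np.sum(np.where(joint > 0, joint * np.log(ratio), 0.0)))

rng = np.random.default_rng(0)
kx, ky, n = 8, 8, 50
# X and Y are drawn independently, so the true mutual information is 0 nats,
# yet the plug-in estimate on 50 samples comes out well above 0.
x = rng.integers(0, kx, size=n)
y = rng.integers(0, ky, size=n)
print(f"plug-in MI on {n} samples: {plug_in_mi(x, y, kx, ky):.3f} nats (true value: 0)")
```

Knowing the true joint distribution would remove this bias; the paper's Bayesian mutual information instead measures information from the perspective of an agent with finite data and explicitly limited background knowledge.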
Related papers
- InfoMatch: Entropy Neural Estimation for Semi-Supervised Image Classification [2.878018421751116]
We employ information entropy neural estimation to exploit the potential of unlabeled samples.
Inspired by contrastive learning, we estimate the entropy by maximizing a lower bound on the mutual information.
We show our method's superior performance in extensive experiments.
arXiv Detail & Related papers (2024-04-17T02:29:44Z)
- Improvement and generalization of ABCD method with Bayesian inference [36.136619420474766]
We focus on exploiting the available information and re-think the usual data-driven ABCD method in order to improve it.
We show how, in contrast to the ABCD method, one can take advantage of knowing some properties of the different backgrounds.
We show how, in this simplified model, the Bayesian framework achieves better sensitivity than the ABCD method in obtaining the signal fraction.
arXiv Detail & Related papers (2024-02-12T19:05:27Z)
- Gaussian Mixture Models for Affordance Learning using Bayesian Networks [50.18477618198277]
Affordances are fundamental descriptors of relationships between actions, objects and effects.
This paper approaches the problem of an embodied agent exploring the world and learning these affordances autonomously from its sensory experiences.
arXiv Detail & Related papers (2024-02-08T22:05:45Z)
- On the Properties and Estimation of Pointwise Mutual Information Profiles [49.877314063833296]
The pointwise mutual information profile, or simply profile, is the distribution of pointwise mutual information for a given pair of random variables.
We introduce a novel family of distributions, Bend and Mix Models, for which the profile can be accurately estimated using Monte Carlo methods (see the Monte Carlo sketch after this list).
arXiv Detail & Related papers (2023-10-16T10:02:24Z)
- Inferential Moments of Uncertain Multivariable Systems [0.0]
We treat Bayesian probability updating as a random process and uncover intrinsic quantitative features of joint probability distributions called inferential moments.
Inferential moments quantify shape information about how a prior distribution is expected to update in response to yet-to-be-obtained information.
We find a power series expansion of the mutual information in terms of inferential moments, which implies a connection between inferential theoretic logic and elements of information theory.
arXiv Detail & Related papers (2023-05-03T00:56:12Z)
- FUNCK: Information Funnels and Bottlenecks for Invariant Representation Learning [7.804994311050265]
We investigate a set of related information funnels and bottleneck problems that claim to learn invariant representations from the data.
We propose a new element to this family of information-theoretic objectives: The Conditional Privacy Funnel with Side Information.
Given the generally intractable objectives, we derive tractable approximations using amortized variational inference parameterized by neural networks.
arXiv Detail & Related papers (2022-11-02T19:37:55Z)
- Unifying Approaches in Data Subset Selection via Fisher Information and Information-Theoretic Quantities [38.59619544501593]
We revisit the Fisher information and use it to show how several otherwise disparate methods are connected as approximations of information-theoretic quantities.
In data subset selection, i.e. active learning and active sampling, several recent works use Fisher information, Hessians, similarity matrices based on the gradients, or simply the gradient lengths to compute the acquisition scores that guide sample selection.
arXiv Detail & Related papers (2022-08-01T00:36:57Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Bayesian Inference Forgetting [82.6681466124663]
The right to be forgotten has been legislated in many countries, but enforcing it in machine learning would incur prohibitive costs.
This paper proposes a Bayesian inference forgetting (BIF) framework to realize the right to be forgotten in Bayesian inference.
arXiv Detail & Related papers (2021-01-16T09:52:51Z)
- A Theory of Usable Information Under Computational Constraints [103.5901638681034]
We propose a new framework for reasoning about information in complex systems.
Our foundation is based on a variational extension of Shannon's information theory.
We show that by incorporating computational constraints, $\mathcal{V}$-information can be reliably estimated from data.
arXiv Detail & Related papers (2020-02-25T06:09:30Z)
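The entry above on pointwise mutual information profiles defines the profile as the distribution of PMI values for a given pair of random variables, estimable by Monte Carlo. A minimal sketch of that idea for a small, hand-picked 2x2 joint distribution (the table below is an arbitrary example, not taken from that paper) could look as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

# A known 2x2 joint distribution over (X, Y); the marginals follow from it.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
px = joint.sum(axis=1)
py = joint.sum(axis=0)

# Pointwise mutual information log p(x,y) / (p(x) p(y)) for every outcome pair.
pmi = np.log(joint / np.outer(px, py))

# Monte Carlo estimate of the profile: sample outcome pairs from the joint
# and record the PMI value realised at each sample.
probs = joint.ravel()
idx = rng.choice(probs.size, size=10_000, p=probs)
profile_samples = pmi.ravel()[idx]

# The mean of the profile is the mutual information I(X;Y).
print("Monte Carlo mean of the profile:", profile_samples.mean())
print("exact I(X;Y):", float((joint * pmi).sum()))
```

The mean of the sampled profile recovers the mutual information I(X;Y), which the last two lines check against the exact value.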