Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information
- URL: http://arxiv.org/abs/2004.03991v2
- Date: Wed, 15 Jul 2020 18:03:23 GMT
- Title: Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information
- Authors: Karl Stratos, Sam Wiseman
- Abstract summary: We propose learning discrete structured representations from unlabeled data by maximizing the mutual information between a structured latent variable and a target variable.
Our key technical contribution is an adversarial objective that can be used to tractably estimate mutual information assuming only the feasibility of cross entropy calculation.
We apply our model on document hashing and show that it outperforms current best baselines based on discrete and vector quantized variational autoencoders.
- Score: 39.87273353895564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose learning discrete structured representations from unlabeled data
by maximizing the mutual information between a structured latent variable and a
target variable. Calculating mutual information is intractable in this setting.
Our key technical contribution is an adversarial objective that can be used to
tractably estimate mutual information assuming only the feasibility of cross
entropy calculation. We develop a concrete realization of this general
formulation with Markov distributions over binary encodings. We report critical
and unexpected findings on practical aspects of the objective such as the
choice of variational priors. We apply our model on document hashing and show
that it outperforms current best baselines based on discrete and vector
quantized variational autoencoders. It also yields highly compressed
interpretable representations.
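To make the adversarial objective more concrete, the following is a hedged sketch of the kind of max-min bound the abstract describes, reconstructed from standard identities rather than taken from the paper (the notation, with Z the structured code, q a variational prior, and r a variational decoder, is illustrative):

```latex
% Sketch: I(X;Z) decomposes as marginal minus conditional entropy.
% The conditional entropy is upper-bounded by any decoder's cross entropy,
% and the marginal entropy equals the cross entropy of the *best* prior,
% which an adversary approximates -- every term is a cross entropy.
\begin{aligned}
I(X; Z) &= H(Z) - H(Z \mid X) \\
        &\ge \min_{q}\, \mathbb{E}_{p(z)}\!\left[-\log q(z)\right]
           \;-\; \mathbb{E}_{p(x, z)}\!\left[-\log r(z \mid x)\right]
\end{aligned}
```

Maximizing over the encoder and the decoder r while minimizing over the prior q gives an adversarial (max-min) game, consistent with the abstract's claim that only cross-entropy calculation needs to be feasible.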
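Because the concrete realization uses Markov distributions over binary encodings, cross entropies like the ones above can be computed exactly via the chain rule, in time linear in the code length. Below is a minimal PyTorch sketch of that idea; the function name, shapes, and parameterization are assumptions for illustration, not the authors' implementation:

```python
import torch

def markov_binary_log_prob(z, init_logits, trans_logits):
    """Exact log p(z) for binary codes under a first-order Markov chain.

    z            : (batch, m) tensor of {0, 1} bits
    init_logits  : (2,) logits over the first bit
    trans_logits : (m - 1, 2, 2) logits; trans_logits[t, a] scores bit t + 1
                   given that bit t equals a
    """
    z = z.long()
    m = z.shape[1]
    # Chain rule: log p(z) = log p(z_1) + sum_t log p(z_{t+1} | z_t).
    logp = torch.log_softmax(init_logits, dim=-1)[z[:, 0]]
    log_trans = torch.log_softmax(trans_logits, dim=-1)
    for t in range(m - 1):
        logp = logp + log_trans[t, z[:, t], z[:, t + 1]]
    return logp

# Monte Carlo estimate of a cross entropy E_p[-log q(z)] from encoder samples.
m = 16
z = torch.randint(0, 2, (32, m))          # stand-in for sampled binary codes
init_logits = torch.zeros(2)              # uniform first bit
trans_logits = torch.zeros(m - 1, 2, 2)   # uniform transitions
cross_entropy = -markov_binary_log_prob(z, init_logits, trans_logits).mean()
print(cross_entropy)                      # = m * log(2) here, about 11.09 nats
```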
Related papers
- Probabilistic Dataset Reconstruction from Interpretable Models [8.31111379034875]
We show that optimal interpretable models are often more compact and leak less information regarding their training data than greedily-built ones.
arXiv Detail & Related papers (2023-08-29T08:10:09Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- FUNCK: Information Funnels and Bottlenecks for Invariant Representation Learning [7.804994311050265]
We investigate a set of related information funnels and bottleneck problems that claim to learn invariant representations from the data.
We propose a new element to this family of information-theoretic objectives: The Conditional Privacy Funnel with Side Information.
Given the generally intractable objectives, we derive tractable approximations using amortized variational inference parameterized by neural networks.
arXiv Detail & Related papers (2022-11-02T19:37:55Z)
- Variational Distillation for Multi-View Learning [104.17551354374821]
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning.
With rigorous theoretical guarantees, our approach enables IB to capture the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z)
- Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should account for several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- InteL-VAEs: Adding Inductive Biases to Variational Auto-Encoders via Intermediary Latents [60.785317191131284]
We introduce a simple and effective method for learning VAEs with controllable biases by using an intermediary set of latent variables.
In particular, it allows us to impose desired properties like sparsity or clustering on learned representations.
We show that this, in turn, allows InteL-VAEs to learn both better generative models and representations.
arXiv Detail & Related papers (2021-06-25T16:34:05Z)
- From Canonical Correlation Analysis to Self-supervised Graph Neural Networks [99.44881722969046]
We introduce a conceptually simple yet effective model for self-supervised representation learning with graph data.
We optimize an innovative feature-level objective inspired by classical Canonical Correlation Analysis.
Our method performs competitively on seven public graph datasets.
arXiv Detail & Related papers (2021-06-23T15:55:47Z)
- Parsimonious Inference [0.0]
Parsimonious inference is an information-theoretic formulation of inference over arbitrary architectures.
Our approaches combine efficient encodings with prudent sampling strategies to construct predictive ensembles without cross-validation.
arXiv Detail & Related papers (2021-03-03T04:13:14Z)
- Variational Mutual Information Maximization Framework for VAE Latent Codes with Continuous and Discrete Priors [5.317548969642376]
The Variational Autoencoder (VAE) is a scalable method for learning directed latent variable models of complex data.
We propose a Variational Mutual Information Maximization Framework for VAE to address the issue of uninformative latent codes.
arXiv Detail & Related papers (2020-06-02T09:05:51Z)
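As context for this last entry, a standard variational lower bound on the mutual information between data x and latent code z is the Barber-Agakov bound; frameworks of this kind plausibly build on it, though the sketch below is illustrative and not necessarily that paper's exact objective:

```latex
% Barber-Agakov: any auxiliary decoder q(x|z) yields a lower bound on MI,
% tight when q matches the true posterior p(x|z).
I(x; z) \;\ge\; H(x) + \mathbb{E}_{p(x)\, p_\theta(z \mid x)}\!\left[\log q_\phi(x \mid z)\right]
```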
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and accepts no responsibility for any consequences of its use.