Blocked and Hierarchical Disentangled Representation From Information
Theory Perspective
- URL: http://arxiv.org/abs/2101.08408v1
- Date: Thu, 21 Jan 2021 02:33:55 GMT
- Title: Blocked and Hierarchical Disentangled Representation From Information
Theory Perspective
- Authors: Ziwen Liu, Mingqiang Li, Congying Han
- Abstract summary: We propose a blocked and hierarchical variational autoencoder (BHiVAE) to obtain a better-disentangled representation.
BHiVAE is built mainly on the information bottleneck theory and the information maximization principle.
It exhibits excellent disentanglement results in experiments and superior classification accuracy in representation learning.
- Score: 0.6875312133832078
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel theoretical model, the blocked and hierarchical
variational autoencoder (BHiVAE), to obtain a better-disentangled representation.
Information theory offers a strong explanatory account of neural networks, so we
approach the disentanglement problem from an information-theoretic perspective.
BHiVAE builds mainly on the information bottleneck theory and the information
maximization principle. Our main ideas are that (1) each attribute is represented
by a block of neurons rather than a single neuron node, so that the block can
carry enough information; and (2) the blocks are organized hierarchically, with
different attributes placed on different layers, so that the information within
each layer can be segmented and the final representation is disentangled. We
present both supervised and unsupervised variants of BHiVAE, which differ mainly
in how information is separated between blocks. Supervised BHiVAE uses label
information as the criterion for separating blocks. Unsupervised BHiVAE, having
no extra information, uses the Total Correlation (TC) measure to enforce
independence between blocks and a newly designed prior distribution over the
latent space to guide representation learning. BHiVAE exhibits excellent
disentanglement results in experiments and superior classification accuracy in
representation learning.
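The abstract comes with no reference implementation, so the following is a minimal sketch of the two core ideas under stated assumptions: latent units are grouped into blocks (one block per attribute), the blocks sit at different depths of an encoder hierarchy, and, in the unsupervised case, a Total-Correlation-style penalty discourages dependence between blocks. The architecture, the dimensions, the loss weights `beta` and `gamma`, and the simple decorrelation proxy used in place of a true TC estimator are all illustrative assumptions, not the authors' exact BHiVAE objective.

```python
# Sketch of a blocked, hierarchical VAE in PyTorch (assumed architecture, not
# the authors' released code). Two encoder stages each emit one latent "block";
# per-block KL terms play the role of the information-bottleneck rate, and a
# simple minibatch decorrelation penalty stands in for the Total Correlation
# term used in the unsupervised variant.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BlockedHierarchicalVAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=256, block_dims=(4, 4)):
        super().__init__()
        self.block_dims = block_dims
        # Stage 1 reads the input, stage 2 reads stage 1's features, so each
        # block lives at a different depth of the hierarchy.
        self.enc1 = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.head1 = nn.Linear(h_dim, 2 * block_dims[0])  # mu, logvar of block 1
        self.enc2 = nn.Sequential(nn.Linear(h_dim, h_dim), nn.ReLU())
        self.head2 = nn.Linear(h_dim, 2 * block_dims[1])  # mu, logvar of block 2
        self.dec = nn.Sequential(
            nn.Linear(sum(block_dims), h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim)
        )

    @staticmethod
    def reparameterize(mu, logvar):
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def forward(self, x):
        h1 = self.enc1(x)
        mu1, logvar1 = self.head1(h1).chunk(2, dim=-1)
        h2 = self.enc2(h1)
        mu2, logvar2 = self.head2(h2).chunk(2, dim=-1)
        z1 = self.reparameterize(mu1, logvar1)
        z2 = self.reparameterize(mu2, logvar2)
        x_hat = self.dec(torch.cat([z1, z2], dim=-1))
        return x_hat, [(mu1, logvar1), (mu2, logvar2)], [z1, z2]


def kl_standard_normal(mu, logvar):
    # Per-block rate term KL(q(z_b | x) || N(0, I)), averaged over the batch.
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()


def cross_block_decorrelation(z_blocks):
    # Crude stand-in for a Total Correlation penalty: it only suppresses linear
    # correlation between units of different blocks (true TC estimation, e.g.
    # with a density-ratio discriminator, is more involved).
    z = torch.cat(z_blocks, dim=-1)
    z = z - z.mean(dim=0, keepdim=True)
    cov = z.T @ z / (z.shape[0] - 1)
    d1 = z_blocks[0].shape[-1]
    return (cov[:d1, d1:] ** 2).sum()


def loss(model, x, beta=4.0, gamma=1.0):
    x_hat, stats, z_blocks = model(x)
    recon = F.mse_loss(x_hat, x, reduction="mean")
    rate = sum(kl_standard_normal(mu, lv) for mu, lv in stats)
    return recon + beta * rate + gamma * cross_block_decorrelation(z_blocks)
```

In the supervised variant described in the abstract, the decorrelation term would be replaced by a label-driven criterion that routes each attribute's label information into its own block; either way, the per-block KL terms act as the information-bottleneck rate that limits how much each block encodes.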
Related papers
- Semi-supervised Node Importance Estimation with Informative Distribution Modeling for Uncertainty Regularization [13.745026710984469]
We propose the first semi-supervised node importance estimation framework, i.e., EASING, to improve learning quality for unlabeled data in heterogeneous graphs.
Different from previous approaches, EASING explicitly captures uncertainty to reflect the confidence of model predictions.
Based on labeled and pseudo-labeled data, EASING develops effective semi-supervised heteroscedastic learning with varying node uncertainty regularization.
arXiv Detail & Related papers (2025-03-26T16:27:06Z) - KMF: Knowledge-Aware Multi-Faceted Representation Learning for Zero-Shot
Node Classification [75.95647590619929]
Zero-Shot Node Classification (ZNC) has been an emerging and crucial task in graph data analysis.
We propose a Knowledge-Aware Multi-Faceted framework (KMF) that enhances the richness of label semantics.
A novel geometric constraint is developed to alleviate the problem of prototype drift caused by node information aggregation.
arXiv Detail & Related papers (2023-08-15T02:38:08Z) - Trading Information between Latents in Hierarchical Variational
Autoencoders [8.122270502556374]
Variational Autoencoders (VAEs) were originally motivated as probabilistic generative models in which one performs approximate Bayesian inference.
The proposal of $\beta$-VAEs breaks this interpretation and generalizes VAEs to application domains beyond generative modeling.
We identify a general class of inference models for which one can split the rate into contributions from each layer, which can then be tuned independently (see the sketch after this related-papers list).
arXiv Detail & Related papers (2023-02-09T18:56:11Z) - Towards Explanation for Unsupervised Graph-Level Representation Learning [108.31036962735911]
Existing explanation methods focus on supervised settings, e.g., node classification and graph classification, while explanation for unsupervised graph-level representation learning remains unexplored.
In this paper, we advance the Information Bottleneck principle (IB) to tackle the proposed explanation problem for unsupervised graph representations, which leads to a novel principle, Unsupervised Subgraph Information Bottleneck (USIB).
We also theoretically analyze the connection between graph representations and explanatory subgraphs on the label space, which reveals that the robustness of representations benefits the fidelity of explanatory subgraphs.
arXiv Detail & Related papers (2022-05-20T02:50:15Z) - Deconfounded Training for Graph Neural Networks [98.06386851685645]
We present a new paradigm of deconfounded training (DTP) that better mitigates the confounding effect and latches on the critical information.
Specifically, we adopt the attention modules to disentangle the critical subgraph and trivial subgraph.
It allows GNNs to capture a more reliable subgraph whose relation with the label is robust across different distributions.
arXiv Detail & Related papers (2021-12-30T15:22:35Z) - VAE-CE: Visual Contrastive Explanation using Disentangled VAEs [3.5027291542274357]
We propose Variational Autoencoder-based Contrastive Explanation (VAE-CE).
We build the model using a disentangled VAE, extended with a new supervised method for disentangling individual dimensions.
An analysis on synthetic data and MNIST shows that the approaches to both disentanglement and explanation provide benefits over other methods.
arXiv Detail & Related papers (2021-08-20T13:15:24Z) - Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where our task is not purely opaque.
Our approach paves the way for a new class of data-efficient representation learning.
arXiv Detail & Related papers (2021-07-19T13:57:13Z) - Learning Diverse and Discriminative Representations via the Principle of
Maximal Coding Rate Reduction [32.21975128854042]
We propose the principle of Maximal Coding Rate Reduction ($\text{MCR}^2$), an information-theoretic measure that maximizes the coding rate difference between the whole dataset and the sum of each individual class.
We clarify its relationships with most existing frameworks such as cross-entropy, information bottleneck, information gain, contractive and contrastive learning, and provide theoretical guarantees for learning diverse and discriminative features.
arXiv Detail & Related papers (2020-06-15T17:23:55Z) - Explainable Deep Classification Models for Domain Generalization [94.43131722655617]
Explanations are defined as regions of visual evidence upon which a deep classification network makes a decision.
Our training strategy enforces a periodic saliency-based feedback to encourage the model to focus on the image regions that directly correspond to the ground-truth object.
arXiv Detail & Related papers (2020-03-13T22:22:15Z) - A Theory of Usable Information Under Computational Constraints [103.5901638681034]
We propose a new framework for reasoning about information in complex systems.
Our foundation is based on a variational extension of Shannon's information theory.
We show that by incorporating computational constraints, $\mathcal{V}$-information can be reliably estimated from data.
arXiv Detail & Related papers (2020-02-25T06:09:30Z) - Learning Robust Representations via Multi-View Information Bottleneck [41.65544605954621]
The original formulation requires labeled data to identify superfluous information.
We extend this ability to the multi-view unsupervised setting, where two views of the same underlying entity are provided but the label is unknown.
A theoretical analysis leads to the definition of a new multi-view model that produces state-of-the-art results on the Sketchy dataset and label-limited versions of the MIR-Flickr dataset.
arXiv Detail & Related papers (2020-02-17T16:01:52Z)
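The rate-splitting idea from the "Trading Information between Latents in Hierarchical Variational Autoencoders" entry above can be illustrated with a standard two-layer objective; the top-down factorization and the per-layer weights $\beta_1$, $\beta_2$ below are illustrative assumptions rather than the cited paper's exact formulation:

$$
\mathcal{L} = \mathbb{E}_{q}\!\left[-\log p(x \mid z_1)\right]
+ \beta_2\,\mathrm{KL}\!\left(q(z_2 \mid x)\,\|\,p(z_2)\right)
+ \beta_1\,\mathbb{E}_{q(z_2 \mid x)}\!\left[\mathrm{KL}\!\left(q(z_1 \mid x, z_2)\,\|\,p(z_1 \mid z_2)\right)\right]
$$

Here the second and third terms are the per-layer rates; setting $\beta_1 = \beta_2 = 1$ recovers the standard VAE objective (the negative evidence lower bound), while tuning the weights separately trades information between the two latent layers.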
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.