Deep Autoencoding Topic Model with Scalable Hybrid Bayesian Inference
- URL: http://arxiv.org/abs/2006.08804v1
- Date: Mon, 15 Jun 2020 22:22:56 GMT
- Title: Deep Autoencoding Topic Model with Scalable Hybrid Bayesian Inference
- Authors: Hao Zhang, Bo Chen, Yulai Cong, Dandan Guo, Hongwei Liu, Mingyuan Zhou
- Abstract summary: We develop a deep autoencoding topic model (DATM) that uses a hierarchy of gamma distributions to construct its multi-stochastic-layer generative network.
We propose a Weibull upward-downward variational encoder that deterministically propagates information upward via a deep neural network, followed by a Weibull-based stochastic downward generative model.
The efficacy and scalability of our models are demonstrated on both unsupervised and supervised learning tasks on big corpora.
- Score: 55.35176938713946
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To build a flexible and interpretable model for document analysis, we develop
a deep autoencoding topic model (DATM) that uses a hierarchy of gamma
distributions to construct its multi-stochastic-layer generative network. In
order to provide scalable posterior inference for the parameters of the
generative network, we develop topic-layer-adaptive stochastic gradient
Riemannian MCMC that jointly learns simplex-constrained global parameters
across all layers and topics, with topic and layer specific learning rates.
Given a posterior sample of the global parameters, in order to efficiently
infer the local latent representations of a document under DATM across all
stochastic layers, we propose a Weibull upward-downward variational encoder
that deterministically propagates information upward via a deep neural network,
followed by a Weibull distribution based stochastic downward generative model.
To jointly model documents and their associated labels, we further propose
supervised DATM that enhances the discriminative power of its latent
representations. The efficacy and scalability of our models are demonstrated on
both unsupervised and supervised learning tasks on big corpora.
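For concreteness, here is a hedged sketch of what a multi-stochastic-layer gamma hierarchy of this kind typically looks like for bag-of-words counts; the Poisson likelihood, the per-layer scales c_j^(l), and the top-layer prior r are assumptions borrowed from related gamma belief network topic models, not details quoted from this abstract.

```latex
% Sketch of an L-layer gamma-hierarchy topic model (notation assumed, not quoted).
% \Phi^{(l)}: simplex-constrained topic matrices (the global parameters);
% \theta_j^{(l)}: latent representation of document j at layer l.
\begin{align*}
  \theta_j^{(L)} &\sim \mathrm{Gamma}\!\left(r,\; 1/c_j^{(L+1)}\right), \\
  \theta_j^{(l)} &\sim \mathrm{Gamma}\!\left(\Phi^{(l+1)}\theta_j^{(l+1)},\; 1/c_j^{(l+1)}\right),
    \qquad l = L-1,\ldots,1, \\
  x_j &\sim \mathrm{Poisson}\!\left(\Phi^{(1)}\theta_j^{(1)}\right).
\end{align*}
```

The Weibull upward-downward variational encoder then approximates the posterior of each theta_j^(l) with a Weibull distribution; the Weibull is convenient here because it resembles a gamma in shape, is reparameterizable through its inverse CDF, and has a closed-form KL divergence to a gamma prior.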
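The stochastic downward pass relies on drawing Weibull samples in a reparameterizable way. Below is a minimal NumPy sketch of the inverse-CDF reparameterization; the function name, layer widths, and constant encoder outputs are purely illustrative stand-ins for what the actual upward network would produce.

```python
import numpy as np

def sample_weibull(k, lam, rng):
    """Draw Weibull(shape=k, scale=lam) samples via the inverse CDF:
    theta = lam * (-log(1 - u))**(1/k), u ~ Uniform(0, 1).
    In an autodiff framework this form lets gradients flow through k and lam.
    """
    u = rng.uniform(size=np.shape(k))
    return lam * (-np.log1p(-u)) ** (1.0 / k)

rng = np.random.default_rng(0)
layer_widths = [64, 32, 16]                        # hypothetical topic counts per layer
k_enc = [np.full(d, 2.0) for d in layer_widths]    # Weibull shapes (stand-in for encoder output)
lam_enc = [np.full(d, 1.0) for d in layer_widths]  # Weibull scales (stand-in for encoder output)

# Downward pass: sample the top layer first, then each layer below it.
theta = [None] * len(layer_widths)
for l in reversed(range(len(layer_widths))):
    theta[l] = sample_weibull(k_enc[l], lam_enc[l], rng)

print([t.shape for t in theta])   # [(64,), (32,), (16,)]
```

In the actual model the downward pass would also condition each layer's Weibull parameters on the sample from the layer above; this sketch keeps them fixed purely to show the sampling step.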
Related papers
- Scalable Weibull Graph Attention Autoencoder for Modeling Document Networks [50.42343781348247]
We develop a graph Poisson factor analysis (GPFA) which provides analytic conditional posteriors to improve the inference accuracy.
We also extend GPFA to a multi-stochastic-layer version named graph Poisson gamma belief network (GPGBN) to capture the hierarchical document relationships at multiple semantic levels.
Our models can extract high-quality hierarchical latent document representations and achieve promising performance on various graph analytic tasks.
arXiv Detail & Related papers (2024-10-13T02:22:14Z) - Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z) - Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models [106.94827590977337]
We propose a novel world model for Multi-Agent RL (MARL) that learns decentralized local dynamics for scalability.
We also introduce a Perceiver Transformer as an effective solution to enable centralized representation aggregation.
Results on the StarCraft Multi-Agent Challenge (SMAC) show that it outperforms strong model-free approaches and existing model-based methods in both sample efficiency and overall performance.
arXiv Detail & Related papers (2024-06-22T12:40:03Z) - Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding [90.77521413857448]
Deep generative models are anchored in three core capabilities -- generating new instances, reconstructing inputs, and learning compact representations.
We introduce Generalized Encoding-Decoding Diffusion Probabilistic Models (EDDPMs).
EDDPMs generalize the Gaussian noising-denoising in standard diffusion by introducing parameterized encoding-decoding.
Experiments on text, proteins, and images demonstrate the flexibility to handle diverse data and tasks.
arXiv Detail & Related papers (2024-02-29T10:08:57Z) - DAMNETS: A Deep Autoregressive Model for Generating Markovian Network Time Series [6.834250594353335]
Generative models for network time series (also known as dynamic graphs) have tremendous potential in fields such as epidemiology, biology and economics.
Here we introduce DAMNETS, a scalable deep generative model for network time series.
arXiv Detail & Related papers (2022-03-28T18:14:04Z) - Hierarchical Graph-Convolutional Variational AutoEncoding for Generative Modelling of Human Motion [1.2599533416395767]
Models of human motion commonly focus either on trajectory prediction or action classification but rarely both.
Here we propose a novel architecture based on hierarchical variational autoencoders and deep graph convolutional neural networks for generating a holistic model of action over multiple time-scales.
We show this Hierarchical Graph-Convolutional Variational Autoencoder (HG-VAE) to be capable of generating coherent actions, detecting out-of-distribution data, and imputing missing data by gradient ascent on the model's posterior.
arXiv Detail & Related papers (2021-11-24T16:21:07Z) - A Unified Approach to Variational Autoencoders and Stochastic Normalizing Flows via Markov Chains [0.45119235878273]
We provide a unified framework to handle normalizing flows and variational autoencoders via Markov chains.
Our framework establishes a useful mathematical tool to combine the various approaches.
arXiv Detail & Related papers (2021-11-24T14:04:32Z) - Understanding Overparameterization in Generative Adversarial Networks [56.57403335510056]
Training Generative Adversarial Networks (GANs) requires solving non-convex concave min-max optimization problems.
Prior theory has shown the importance of overparameterization for gradient descent (GD) to reach globally optimal solutions.
We show that in an overparameterized GAN with a one-layer neural network generator and a linear discriminator, gradient descent-ascent (GDA) converges to a global saddle point of the underlying non-convex concave min-max problem.
arXiv Detail & Related papers (2021-04-12T16:23:37Z)