A Discrete Variational Recurrent Topic Model without the
Reparametrization Trick
- URL: http://arxiv.org/abs/2010.12055v1
- Date: Thu, 22 Oct 2020 20:53:44 GMT
- Title: A Discrete Variational Recurrent Topic Model without the
Reparametrization Trick
- Authors: Mehdi Rezaee and Francis Ferraro
- Abstract summary: We show how to learn a neural topic model with discrete random variables.
We show improved perplexity and document understanding across multiple corpora.
- Score: 16.54912614895861
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show how to learn a neural topic model with discrete random
variables---one that explicitly models each word's assigned topic---using
neural variational inference that does not rely on stochastic backpropagation
to handle the discrete variables. The model we utilize combines the expressive
power of neural methods for representing sequences of text with the topic
model's ability to capture global, thematic coherence. Using neural variational
inference, we show improved perplexity and document understanding across
multiple corpora. We examine the effect of prior parameters both on the model
and variational parameters and demonstrate how our approach can compete and
surpass a popular topic model implementation on an automatic measure of topic
quality.
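The abstract states the model avoids stochastic backpropagation through its discrete variables but does not spell out the mechanism. One standard way to achieve this (shown here as an illustrative sketch, not necessarily the authors' exact construction) is to marginalize a word's categorical topic assignment analytically: because the variable takes only K values, the expected log-likelihood and the KL term of the ELBO can be computed in closed form, so no sampling or reparametrization (e.g. Gumbel-softmax) is needed. The function below is a hypothetical helper illustrating that idea for a single word:

```python
import numpy as np

def elbo_term_marginalized(q_probs, log_likelihoods, prior_probs):
    """ELBO contribution of one word with a discrete topic variable,
    computed by exact enumeration over the K topics.

    q_probs:         (K,) variational posterior over the word's topic
    log_likelihoods: (K,) log p(word | topic=k)
    prior_probs:     (K,) prior over topics
    """
    # Expected log-likelihood: sum over all K topics instead of sampling,
    # so gradients flow through q_probs without any reparametrization trick.
    expected_ll = np.sum(q_probs * log_likelihoods)
    # KL(q || prior) for a categorical variable, also in closed form.
    # The small epsilon guards against log(0) for zero-probability topics.
    kl = np.sum(q_probs * (np.log(q_probs + 1e-12) - np.log(prior_probs + 1e-12)))
    return expected_ll - kl
```

Since every term is a deterministic function of the variational parameters, automatic differentiation applies directly; this is the general route hinted at by "neural variational inference that does not rely on stochastic backpropagation."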
Related papers
- Latent Variable Sequence Identification for Cognitive Models with Neural Bayes Estimation [7.7227297059345466]
We present an approach that extends neural Bayes estimation to learn a direct mapping between experimental data and the targeted latent variable space.
Our work underscores that combining recurrent neural networks and simulation-based inference to identify latent variable sequences can enable researchers to access a wider class of cognitive models.
arXiv Detail & Related papers (2024-06-20T21:13:39Z)
- Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation [52.270712965271656]

We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively to transformers on small to medium sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z)
- Discovering interpretable elastoplasticity models via the neural polynomial method enabled symbolic regressions [0.0]
Conventional neural network elastoplasticity models are often perceived as lacking interpretability.
This paper introduces a two-step machine learning approach that returns mathematical models interpretable by human experts.
arXiv Detail & Related papers (2023-07-24T22:22:32Z)
- Diversity-Aware Coherence Loss for Improving Neural Topic Models [20.98172300869239]
We propose a novel diversity-aware coherence loss that encourages the model to learn corpus-level coherence scores.
Experimental results on multiple datasets show that our method significantly improves the performance of neural topic models.
arXiv Detail & Related papers (2023-05-25T16:01:56Z)
- Neural Dynamic Focused Topic Model [2.9005223064604078]
We leverage recent advances in neural variational inference and present an alternative neural approach to the dynamic Focused Topic Model.
We develop a neural model for topic evolution which exploits sequences of Bernoulli random variables in order to track the appearances of topics.
arXiv Detail & Related papers (2023-01-26T08:37:34Z)
- Learning Semantic Textual Similarity via Topic-informed Discrete Latent Variables [17.57873577962635]
We develop a topic-informed discrete latent variable model for semantic textual similarity.
Our model learns a shared latent space for sentence-pair representation via vector quantization.
We show that our model is able to surpass several strong neural baselines in semantic textual similarity tasks.
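The summary above mentions learning a shared latent space "via vector quantization" without further detail. As a rough illustration of the core VQ step used in VQ-VAE-style models (an assumption about the family of technique, not this paper's exact formulation), a continuous encoding is snapped to its nearest entry in a learned codebook, which yields a discrete code index:

```python
import numpy as np

def vector_quantize(z, codebook):
    """Nearest-neighbor quantization of one encoding vector.

    z:        (d,) continuous encoder output
    codebook: (K, d) learned code vectors

    Returns the selected code index and the quantized vector.
    Gradient handling (e.g. the straight-through estimator) is
    omitted here; this shows only the forward quantization step.
    """
    # Squared Euclidean distance from z to every codebook entry.
    dists = np.sum((codebook - z) ** 2, axis=1)
    k = int(np.argmin(dists))
    return k, codebook[k]
```

The discrete index `k` is what makes the latent space compact and interpretable, at the cost of a non-differentiable selection that such models typically handle with a straight-through gradient.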
arXiv Detail & Related papers (2022-11-07T15:09:58Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs tend to ignore their latent variables when paired with a strong auto-regressive decoder.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
- Variational Hyper RNN for Sequence Modeling [69.0659591456772]
We propose a novel probabilistic sequence model that excels at capturing high variability in time series data.
Our method uses temporal latent variables to capture information about the underlying data pattern.
The efficacy of the proposed method is demonstrated on a range of synthetic and real-world sequential data.
arXiv Detail & Related papers (2020-02-24T19:30:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.