Latent Space Energy-Based Model of Symbol-Vector Coupling for Text Generation and Classification
- URL: http://arxiv.org/abs/2108.11556v1
- Date: Thu, 26 Aug 2021 02:31:18 GMT
- Title: Latent Space Energy-Based Model of Symbol-Vector Coupling for Text Generation and Classification
- Authors: Bo Pang, Ying Nian Wu
- Abstract summary: We propose a latent space energy-based prior model for text generation and classification.
The model stands on a generator network that generates the text sequence based on a continuous latent vector.
Our experiments demonstrate that the proposed model learns a well-structured and meaningful latent space.
- Score: 42.5461221309397
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a latent space energy-based prior model for text generation and
classification. The model stands on a generator network that generates the text
sequence based on a continuous latent vector. The energy term of the prior
model couples a continuous latent vector and a symbolic one-hot vector, so that
the discrete category can be inferred from the observed example based on the
continuous latent vector. Such a latent space coupling naturally enables
incorporation of information bottleneck regularization to encourage the
continuous latent vector to extract information from the observed example that
is informative of the underlying category. In our learning method, the
symbol-vector coupling, the generator network and the inference network are
learned jointly. Our model can be learned in an unsupervised setting where no
category labels are provided. It can also be learned in a semi-supervised setting
where category labels are provided for a subset of training examples. Our
experiments demonstrate that the proposed model learns a well-structured and
meaningful latent space, which (1) guides the generator to generate text with
high quality, diversity, and interpretability, and (2) effectively classifies
text.
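To make the coupling concrete, the prior can be sketched as follows. This follows the formulation in the companion symbol-vector coupling paper listed under Related papers below; f_alpha denotes a small network mapping the continuous latent vector to category logits, and the exact notation is our reconstruction rather than a quotation:

```latex
p_\alpha(y, z) = \frac{1}{Z(\alpha)} \exp\!\big(\langle y, f_\alpha(z) \rangle\big)\, p_0(z),
\qquad
p_\alpha(y \mid z) = \mathrm{softmax}\big(f_\alpha(z)\big)_y
```

Here y is the symbolic one-hot vector, z is the continuous latent vector, p_0(z) is a reference (e.g., Gaussian) prior, and Z(\alpha) is the normalizing constant. Because p_\alpha(y | z) reduces to a softmax head on z, the discrete category can be read off from the inferred continuous vector; a code-level sketch of the jointly learned components follows the related-papers list below.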
Related papers
- Local Topology Measures of Contextual Language Model Latent Spaces With Applications to Dialogue Term Extraction [4.887047578768969]
We introduce complexity measures of the local topology of the latent space of a contextual language model.
Our work continues a line of research that explores the manifold hypothesis for word embeddings.
arXiv Detail & Related papers (2024-08-07T11:44:32Z)
- InstaGen: Enhancing Object Detection by Training on Synthetic Dataset [59.445498550159755]
We present a novel paradigm to enhance the ability of an object detector, e.g., by expanding categories or improving detection performance.
We integrate an instance-level grounding head into a pre-trained, generative diffusion model to augment it with the ability to localise instances in the generated images.
We conduct thorough experiments to show that this enhanced version of the diffusion model, termed InstaGen, can serve as a data synthesizer.
arXiv Detail & Related papers (2024-02-08T18:59:53Z) - Scalable Learning of Latent Language Structure With Logical Offline
Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z) - TAX: Tendency-and-Assignment Explainer for Semantic Segmentation with
Multi-Annotators [31.36818611460614]
Tendency-and-Assignment Explainer (TAX) is designed to offer interpretability at the annotator and assignment levels.
We show that our TAX can be applied to state-of-the-art network architectures with comparable performance.
arXiv Detail & Related papers (2023-02-19T12:40:22Z) - Learning Cluster Patterns for Abstractive Summarization [0.0]
We consider two clusters of salient and non-salient context vectors, so that the decoder can attend more to the salient context vectors for summary generation.
Our experimental results show that the proposed model outperforms the existing BART model by learning these distinct cluster patterns.
arXiv Detail & Related papers (2022-02-22T15:15:24Z) - Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z) - Semi-supervised Learning by Latent Space Energy-Based Model of
Symbol-Vector Coupling [55.866810975092115]
We propose a latent space energy-based prior model for semi-supervised learning.
We show that our method performs well on semi-supervised learning tasks.
arXiv Detail & Related papers (2020-10-19T09:55:14Z) - Unsupervised Controllable Generation with Self-Training [90.04287577605723]
Controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
arXiv Detail & Related papers (2020-07-17T21:50:35Z) - Background Knowledge Injection for Interpretable Sequence Classification [13.074542699823933]
We introduce a novel sequence learning algorithm that balances predictive power and interpretability.
We extend the classic subsequence feature space with groups of symbols generated by background knowledge injected via word or graph embeddings.
We also present a new measure to evaluate the interpretability of a set of symbolic features based on the symbol embeddings.
arXiv Detail & Related papers (2020-06-25T08:36:05Z)
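To ground the joint learning setup described in the abstract above, here is a minimal, illustrative PyTorch sketch of the three components learned jointly: the symbol-vector coupling prior, the generator network, and the inference network. All module names, dimensions, and architectural choices (GRU encoder/decoder, Gaussian posterior) are assumptions made for this sketch, not the authors' implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, NUM_CLASSES, VOCAB, HIDDEN = 32, 10, 10000, 256  # illustrative sizes

class CouplingPrior(nn.Module):
    """Symbol-vector coupling: energy term <y, f_alpha(z)> tying a one-hot
    label y to the continuous latent vector z."""
    def __init__(self):
        super().__init__()
        self.f_alpha = nn.Sequential(
            nn.Linear(LATENT_DIM, HIDDEN), nn.GELU(),
            nn.Linear(HIDDEN, NUM_CLASSES))

    def energy(self, y, z):
        # E(y, z) = -<y, f_alpha(z)>; the prior is exp(-E) times a Gaussian p_0(z)
        return -(y * self.f_alpha(z)).sum(dim=-1)

    def label_posterior(self, z):
        # p(y|z) is a softmax over the category logits f_alpha(z)
        return F.softmax(self.f_alpha(z), dim=-1)

class Generator(nn.Module):
    """Autoregressive generator p(x|z): token logits conditioned on z."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN + LATENT_DIM, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, VOCAB)

    def forward(self, tokens, z):
        h = self.emb(tokens)                              # (B, T, H)
        z_rep = z.unsqueeze(1).expand(-1, h.size(1), -1)  # broadcast z over time
        out, _ = self.rnn(torch.cat([h, z_rep], dim=-1))
        return self.out(out)                              # (B, T, VOCAB)

class InferenceNet(nn.Module):
    """Amortized inference q(z|x): Gaussian mean and log-variance over z."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, 2 * LATENT_DIM)

    def forward(self, tokens):
        _, h = self.rnn(self.emb(tokens))                 # h: (1, B, H)
        mu, logvar = self.head(h.squeeze(0)).chunk(2, dim=-1)
        return mu, logvar
```

Under this sketch, classifying a text amounts to encoding it with InferenceNet and taking the argmax of CouplingPrior.label_posterior(mu); in the semi-supervised setting, the labeled subset can supervise this softmax head directly.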
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.