PVGRU: Generating Diverse and Relevant Dialogue Responses via
Pseudo-Variational Mechanism
- URL: http://arxiv.org/abs/2212.09086v4
- Date: Tue, 16 May 2023 11:29:24 GMT
- Title: PVGRU: Generating Diverse and Relevant Dialogue Responses via
Pseudo-Variational Mechanism
- Authors: Yongkang Liu, Shi Feng, Daling Wang, Yifei Zhang, Hinrich Schütze
- Abstract summary: Existing generative models usually employ the last hidden state to summarize the sequences.
We propose a Pseudo-Variational Gated Recurrent Unit (PVGRU) component that requires no posterior knowledge.
PVGRU can perceive subtle semantic variability through summarizing variables, which are optimized by the devised distribution-consistency and reconstruction objectives.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate response generation for multi-turn dialogue in
generation-based chatbots. Existing generative models based on RNNs (Recurrent
Neural Networks) usually employ the last hidden state to summarize a sequence,
which leaves them unable to capture the subtle variability observed across
different dialogues and unable to distinguish between dialogues that are
similar in composition. In this paper, we propose a Pseudo-Variational Gated
Recurrent Unit (PVGRU) component that requires no posterior knowledge: it
introduces a recurrent summarizing variable into the GRU that aggregates the
accumulated distribution variations of subsequences. PVGRU can perceive subtle
semantic variability through summarizing variables optimized by the devised
distribution-consistency and reconstruction objectives. In addition, we build
a Pseudo-Variational Hierarchical Dialogue (PVHD) model based on PVGRU.
Experimental results demonstrate that PVGRU broadly improves the diversity and
relevance of responses on two benchmark datasets.
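As a concrete illustration, here is a minimal sketch of the pseudo-variational idea in PyTorch. It is not the authors' implementation: the Gaussian parameterization of the summarizing variable and all module names are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class PseudoVariationalGRUCell(nn.Module):
    """Sketch of a GRU cell carrying a recurrent summarizing variable.

    Hypothetical reading of the abstract: alongside the hidden state h,
    a summarizing variable s is sampled from a Gaussian whose parameters
    depend on (x, h, s), so it can accumulate the distribution variations
    of subsequences without a posterior network.
    """

    def __init__(self, input_size: int, hidden_size: int, latent_size: int):
        super().__init__()
        self.gru = nn.GRUCell(input_size + latent_size, hidden_size)
        # Gaussian parameters of the summarizing variable (assumed form).
        self.to_mu = nn.Linear(input_size + hidden_size + latent_size, latent_size)
        self.to_logvar = nn.Linear(input_size + hidden_size + latent_size, latent_size)

    def forward(self, x, h, s):
        ctx = torch.cat([x, h, s], dim=-1)
        mu, logvar = self.to_mu(ctx), self.to_logvar(ctx)
        # Reparameterized sample keeps the cell fully differentiable.
        s_next = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        h_next = self.gru(torch.cat([x, s_next], dim=-1), h)
        return h_next, s_next, (mu, logvar)
```

The devised distribution-consistency and reconstruction objectives would then act on the returned (mu, logvar) and on a decoding of s; their exact form is given in the paper.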
Related papers
- A Non-negative VAE: the Generalized Gamma Belief Network
The gamma belief network (GBN) has demonstrated its potential for uncovering multi-layer interpretable latent representations in text data.
We introduce the generalized gamma belief network (Generalized GBN) in this paper, which extends the original linear generative model to a more expressive non-linear generative model.
We also propose an upward-downward Weibull inference network to approximate the posterior distribution of the latent variables.
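As a hedged sketch of the kind of sampling step such an inference network rests on: a Weibull variable admits a simple reparameterization, so gradients can flow through its shape and scale parameters. The function below is illustrative, not the paper's code.

```python
import torch

def sample_weibull(k: torch.Tensor, lam: torch.Tensor) -> torch.Tensor:
    """Reparameterized Weibull sample: x = lam * (-log(1 - u)) ** (1 / k).

    Gradients flow through shape k and scale lam, which is what makes a
    Weibull inference network trainable end to end.
    """
    u = torch.rand_like(k).clamp(1e-6, 1 - 1e-6)  # avoid log(0)
    return lam * (-torch.log1p(-u)) ** (1.0 / k)
```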
arXiv Detail & Related papers (2024-08-06T18:18:37Z)
- PGODE: Towards High-quality System Dynamics Modeling
This paper studies the problem of modeling multi-agent dynamical systems, where agents interact with one another and influence each other's behavior.
Recent research predominantly uses geometric graphs to depict these mutual interactions, which are then captured by graph neural networks (GNNs).
We propose a new approach named Prototypical Graph ODE (PGODE) to address the problem.
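A minimal, assumption-laden sketch of the general graph-ODE recipe this describes: a GNN parameterizes the derivative of the agent states, and an integrator rolls the dynamics forward. The prototype decomposition that gives PGODE its name is omitted, and all names are illustrative.

```python
import torch
import torch.nn as nn

class GraphODEFunc(nn.Module):
    """dz/dt = f(z, A): message passing defines the derivative field."""

    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Linear(dim, dim)
        self.upd = nn.Linear(2 * dim, dim)

    def forward(self, z: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # z: (num_agents, dim); adj: (num_agents, num_agents) interaction graph.
        neighbors = adj @ self.msg(z)  # aggregate neighbor messages
        return torch.tanh(self.upd(torch.cat([z, neighbors], dim=-1)))

def integrate(func, z0, adj, steps=10, dt=0.1):
    """Plain Euler integration; a proper ODE solver would replace this."""
    z = z0
    for _ in range(steps):
        z = z + dt * func(z, adj)
    return z
```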
arXiv Detail & Related papers (2023-11-11T12:04:47Z)
- Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference
We present an end-to-end learning framework, termed Sequential Posterior Inference (SPI), capable of selecting knowledge and generating dialogues.
Unlike other methods, SPI requires neither an inference network nor the assumption of a simple geometry for the posterior distribution.
arXiv Detail & Related papers (2023-06-01T21:23:13Z)
- Towards Diverse, Relevant and Coherent Open-Domain Dialogue Generation via Hybrid Latent Variables
We combine the merits of both continuous and discrete latent variables and propose a Hybrid Latent Variable (HLV) method.
HLV constrains the global semantics of responses through discrete latent variables and enriches responses with continuous latent variables.
In addition, we propose the Conditional Hybrid Variational Transformer (CHVT), which constructs and utilizes HLV with Transformers for dialogue generation.
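A minimal sketch of how the two kinds of latent variables can coexist, assuming a Gumbel-Softmax relaxation for the discrete part and a reparameterized Gaussian for the continuous part; the concatenation-based fusion is an illustrative choice, not necessarily the paper's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridLatent(nn.Module):
    def __init__(self, hidden: int, num_classes: int, cont_dim: int):
        super().__init__()
        self.logits = nn.Linear(hidden, num_classes)  # discrete: global semantics
        self.mu = nn.Linear(hidden, cont_dim)         # continuous: fine variation
        self.logvar = nn.Linear(hidden, cont_dim)

    def forward(self, h, tau: float = 1.0):
        z_disc = F.gumbel_softmax(self.logits(h), tau=tau, hard=False)
        mu, logvar = self.mu(h), self.logvar(h)
        z_cont = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return torch.cat([z_disc, z_cont], dim=-1)    # hybrid latent variable
```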
arXiv Detail & Related papers (2022-12-02T12:48:01Z)
- A Correspondence Variational Autoencoder for Unsupervised Acoustic Word Embeddings
We propose a new unsupervised model for mapping a variable-duration speech segment to a fixed-dimensional representation.
The resulting acoustic word embeddings can form the basis of search, discovery, and indexing systems for low- and zero-resource languages.
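The core mapping can be pictured as a recurrent encoder whose final state serves as the embedding. The sketch below is illustrative only; the correspondence training objective that distinguishes the paper's model is not shown.

```python
import torch
import torch.nn as nn

class AcousticWordEncoder(nn.Module):
    """Maps a (frames, feat_dim) speech segment to one embedding vector."""

    def __init__(self, feat_dim: int, embed_dim: int):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, embed_dim, batch_first=True)

    def forward(self, segment: torch.Tensor) -> torch.Tensor:
        # segment: (batch, frames, feat_dim); in practice variable durations
        # would be handled with padding and packed sequences.
        _, h_last = self.rnn(segment)
        return h_last[-1]  # (batch, embed_dim): fixed size for any duration
```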
arXiv Detail & Related papers (2020-12-03T19:24:42Z)
- Diversifying Task-oriented Dialogue Response Generation with Prototype Guided Paraphrasing
Existing methods for Dialogue Response Generation (DRG) in Task-oriented Dialogue Systems (TDSs) can be grouped into two categories: template-based and corpus-based.
We propose a prototype-based paraphrasing neural network, called P2-Net, which aims to enhance the quality of responses in terms of both precision and diversity.
arXiv Detail & Related papers (2020-08-07T22:25:36Z)
- Generalized Adversarially Learned Inference
We develop methods for inferring latent variables in GANs by adversarially training an image generator together with an encoder so that two joint distributions of image and latent-vector pairs are matched.
We incorporate multiple layers of feedback on reconstructions, self-supervision, and other forms of supervision based on prior or learned knowledge about the desired solutions.
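The heart of this family of methods is a discriminator over joint (image, latent) pairs. A minimal sketch with placeholder layer sizes:

```python
import torch
import torch.nn as nn

class JointDiscriminator(nn.Module):
    """Distinguishes (x, E(x)) pairs from (G(z), z) pairs."""

    def __init__(self, x_dim: int, z_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=-1))

# Matching the two joint distributions pushes the encoder E and the
# generator G toward being approximate inverses, which yields inference.
```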
arXiv Detail & Related papers (2020-06-15T02:18:13Z)
- Variational Transformers for Diverse Response Generation
Variational Transformer (VT) is a variational self-attentive feed-forward sequence model.
VT combines the parallelizability and global receptive field computation of the Transformer with the variational nature of the CVAE.
We explore two types of VT: 1) modeling the discourse-level diversity with a global latent variable; and 2) augmenting the Transformer decoder with a sequence of fine-grained latent variables.
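A minimal sketch of the first variant, assuming the global latent is injected as an extra decoder token; the module names, mean-pooling, and injection strategy are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class GlobalLatentConditioner(nn.Module):
    """Samples one discourse-level latent and prepends it to decoder inputs."""

    def __init__(self, d_model: int, z_dim: int):
        super().__init__()
        self.mu = nn.Linear(d_model, z_dim)
        self.logvar = nn.Linear(d_model, z_dim)
        self.proj = nn.Linear(z_dim, d_model)

    def forward(self, enc_out: torch.Tensor, dec_in: torch.Tensor):
        ctx = enc_out.mean(dim=1)                 # pool encoder states
        mu, logvar = self.mu(ctx), self.logvar(ctx)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        z_tok = self.proj(z).unsqueeze(1)         # one extra "token"
        return torch.cat([z_tok, dec_in], dim=1), (mu, logvar)
```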
arXiv Detail & Related papers (2020-03-28T07:48:02Z)
- Variational Inference for Deep Probabilistic Canonical Correlation Analysis
We propose a deep probabilistic multi-view model that is composed of a linear multi-view layer and deep generative networks as observation models.
An efficient variational inference procedure is developed that approximates the posterior distributions of the latent probabilistic multi-view layer.
A generalization to models with an arbitrary number of views is also proposed.
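A minimal sketch of that layout, with illustrative dimensions: a shared latent passes through a per-view linear multi-view layer and then a deep observation network, for any number of views.

```python
import torch
import torch.nn as nn

class DeepProbabilisticCCA(nn.Module):
    """Shared latent -> linear per-view layer -> deep observation model."""

    def __init__(self, z_dim, view_dims):
        super().__init__()
        self.linear = nn.ModuleList(nn.Linear(z_dim, 64) for _ in view_dims)
        self.obs = nn.ModuleList(
            nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, d))
            for d in view_dims
        )

    def forward(self, z):
        # One reconstruction per view; the list length sets the view count.
        return [obs(lin(z)) for lin, obs in zip(self.linear, self.obs)]
```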
arXiv Detail & Related papers (2020-03-09T17:51:15Z)