Stateful Memory-Augmented Transformers for Efficient Dialogue Modeling
- URL: http://arxiv.org/abs/2209.07634v2
- Date: Tue, 23 May 2023 05:59:06 GMT
- Title: Stateful Memory-Augmented Transformers for Efficient Dialogue Modeling
- Authors: Qingyang Wu and Zhou Yu
- Abstract summary: We propose a novel memory-augmented transformer that is compatible with existing pre-trained encoder-decoder models.
By incorporating a separate memory module alongside the pre-trained transformer, the model can effectively interchange information between the memory states and the current input context.
- Score: 69.31802246621963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer encoder-decoder models have achieved great performance in
dialogue generation tasks; however, their inability to process long dialogue
history often leads to truncation of the context. To address this problem, we
propose a novel memory-augmented transformer that is compatible with existing
pre-trained encoder-decoder models and enables efficient preservation of the
dialogue history information. By incorporating a separate memory module
alongside the pre-trained transformer, the model can effectively interchange
information between the memory states and the current input context. We
evaluate our model on three dialogue datasets and two language modeling
datasets. Experimental results show that our method has achieved superior
efficiency and performance compared to other pre-trained Transformer baselines.
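The following is a minimal sketch of the kind of memory-context exchange the abstract describes, assuming a learned bank of memory states that is read from and written to through cross-attention. The module names, sizes, and update rule are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (PyTorch): a fixed-size bank of memory states and the current
# input context exchange information through cross-attention. Illustrative only.
import torch
import torch.nn as nn


class MemoryAugmentedBlock(nn.Module):
    def __init__(self, d_model=512, n_heads=8, n_memory=64):
        super().__init__()
        # Learned initial memory states, copied for each sequence in the batch.
        self.memory_init = nn.Parameter(torch.randn(n_memory, d_model) * 0.02)
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Context reads from memory; memory is then updated from the context.
        self.read_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.write_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm_mem = nn.LayerNorm(d_model)

    def forward(self, context, memory=None):
        # context: (batch, seq_len, d_model); memory: (batch, n_memory, d_model)
        if memory is None:
            memory = self.memory_init.unsqueeze(0).expand(context.size(0), -1, -1)
        # 1) Standard self-attention over the current input context.
        ctx, _ = self.self_attn(context, context, context)
        context = self.norm1(context + ctx)
        # 2) Context queries the memory states (read).
        read, _ = self.read_attn(context, memory, memory)
        context = self.norm2(context + read)
        # 3) Memory queries the updated context (write), carrying history forward.
        write, _ = self.write_attn(memory, context, context)
        memory = self.norm_mem(memory + write)
        return context, memory


# Usage: process dialogue turns sequentially, passing the memory state along
# so earlier turns need not be re-encoded.
block = MemoryAugmentedBlock()
mem = None
for turn in [torch.randn(1, 12, 512), torch.randn(1, 9, 512)]:
    out, mem = block(turn, mem)
```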
Related papers
- Improving Transformer-based Conversational ASR by Inter-Sentential Attention Mechanism [20.782319059183173]
We propose to explicitly model inter-sentential information in a Transformer-based end-to-end architecture for conversational speech recognition.
We show the effectiveness of the proposed method on several open-source dialogue corpora; it consistently improves performance over utterance-level Transformer-based ASR models.
arXiv Detail & Related papers (2022-07-02T17:17:47Z)
- DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation [80.45816053153722]
DialogVED introduces continuous latent variables into the enhanced encoder-decoder pre-training framework to increase the relevance and diversity of responses.
We conduct experiments on PersonaChat, DailyDialog, and DSTC7-AVSD benchmarks for response generation.
arXiv Detail & Related papers (2022-04-27T16:18:15Z)
- A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation [107.82729587882397]
It is expensive to scale up current persona-based dialogue datasets.
Each data sample in this task is more complex to learn with than conventional dialogue data.
We propose a data manipulation method, which is model-agnostic and can be packed with any persona-based dialogue generation model.
arXiv Detail & Related papers (2022-04-21T03:49:54Z)
- An Exploratory Study on Long Dialogue Summarization: What Works and What's Next [33.1899354772074]
We study long dialogue summarization by investigating three strategies to deal with the lengthy input problem and locate relevant information.
Our experimental results on three long dialogue datasets (QMSum, MediaSum, SummScreen) show that the retrieve-then-summarize pipeline models yield the best performance.
arXiv Detail & Related papers (2021-09-10T01:38:26Z)
- Retrieval-Augmented Transformer-XL for Close-Domain Dialog Generation [16.90730526207747]
We present a transformer-based model for multi-turn dialog response generation.
Our solution is based on a hybrid approach which augments a transformer-based generative model with a novel retrieval mechanism.
arXiv Detail & Related papers (2021-05-19T16:34:33Z)
- Parameter Efficient Multimodal Transformers for Video Representation Learning [108.8517364784009]
This work focuses on reducing the parameters of multimodal Transformers in the context of audio-visual video representation learning.
We show that our approach reduces parameters by up to 80%, allowing us to train our model end-to-end from scratch.
To demonstrate our approach, we pretrain our model on 30-second clips from Kinetics-700 and transfer it to audio-visual classification tasks.
arXiv Detail & Related papers (2020-12-08T00:16:13Z)
- Modifying Memories in Transformer Models [71.48657481835767]
We propose a new task of explicitly modifying specific factual knowledge in Transformer models.
This task is useful in many scenarios, such as updating stale knowledge, protecting privacy, and eliminating unintended biases stored in the models.
arXiv Detail & Related papers (2020-12-01T09:39:13Z)
- Open-Domain Dialogue Generation Based on Pre-trained Language Models [23.828348485513043]
Pre-trained language models have been successfully used in response generation for open-domain dialogue.
Four main frameworks have been proposed, including: (1) Transformer-ED, using the Transformer encoder and decoder separately for source and target sentences; (2) Transformer-Dec, using the Transformer decoder for both source and target sentences; and (3) Transformer-MLM, using the Transformer decoder with bi-directional attention on the source side and left-to-right attention on the target side under a masked language model objective (a mask for this pattern is sketched below).
We compare these frameworks on 3 datasets, and our comparison reveals that the best framework uses bidirectional attention on the source side and does not separate encoder and decoder.
arXiv Detail & Related papers (2020-10-24T04:52:28Z)
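As a rough illustration of the attention pattern attributed to Transformer-MLM in the entry above (bidirectional over the source, left-to-right over the target), here is a generic prefix-LM mask; it is a sketch under that assumption, not code from the paper.

```python
# Illustrative prefix-LM attention mask: source (dialogue context) tokens attend
# bidirectionally, target (response) tokens attend to the full source and only
# to earlier target tokens.
import torch


def prefix_lm_mask(src_len: int, tgt_len: int) -> torch.Tensor:
    """Boolean mask of shape (L, L); True marks allowed attention.
    The source prefix comes first, followed by the target suffix."""
    total = src_len + tgt_len
    allowed = torch.zeros(total, total, dtype=torch.bool)
    # Every position (source or target) may attend to all source tokens;
    # source positions never attend to the target.
    allowed[:, :src_len] = True
    # Target positions additionally attend causally to earlier target tokens.
    allowed[src_len:, src_len:] = torch.tril(
        torch.ones(tgt_len, tgt_len, dtype=torch.bool)
    )
    return allowed


print(prefix_lm_mask(3, 2).int())
```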
- Ranking Enhanced Dialogue Generation [77.8321855074999]
How to effectively utilize the dialogue history is a crucial problem in multi-turn dialogue generation.
Previous works usually employ various neural network architectures to model the history.
This paper proposes a Ranking Enhanced Dialogue generation framework.
arXiv Detail & Related papers (2020-08-13T01:49:56Z)
- Hierarchical Transformer Network for Utterance-level Emotion Recognition [0.0]
We address some challenges in utterance-level emotion recognition (ULER).
Unlike the traditional text classification problem, this task is supported by a limited number of datasets.
We use a pre-trained language model, bidirectional encoder representations from transformers (BERT), as the lower-level transformer.
In addition, we add speaker embeddings to the model for the first time, which enables our model to capture the interaction between speakers, as sketched below.
arXiv Detail & Related papers (2020-02-18T13:44:49Z)
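A minimal sketch of the speaker-embedding idea mentioned in the last entry, assuming utterance-level vectors (e.g., BERT [CLS] states) fed to an upper-level transformer; the class name, dimensions, and two-speaker setup are illustrative assumptions rather than the paper's implementation.

```python
# Illustrative sketch: add speaker embeddings to utterance representations so a
# higher-level transformer can model speaker interactions. Not the paper's code.
import torch
import torch.nn as nn


class SpeakerAwareEncoder(nn.Module):
    def __init__(self, d_model=768, n_speakers=2, n_heads=8, n_layers=2):
        super().__init__()
        self.speaker_emb = nn.Embedding(n_speakers, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.upper = nn.TransformerEncoder(layer, n_layers)

    def forward(self, utterance_vecs, speaker_ids):
        # utterance_vecs: (batch, n_utterances, d_model), e.g. BERT [CLS] vectors
        # speaker_ids:    (batch, n_utterances) integer speaker labels
        x = utterance_vecs + self.speaker_emb(speaker_ids)
        return self.upper(x)  # contextualized utterance states for classification


enc = SpeakerAwareEncoder()
out = enc(torch.randn(1, 5, 768), torch.tensor([[0, 1, 0, 1, 0]]))
```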
This list is automatically generated from the titles and abstracts of the papers on this site.