Cue-word Driven Neural Response Generation with a Shrinking Vocabulary
- URL: http://arxiv.org/abs/2010.04927v1
- Date: Sat, 10 Oct 2020 07:13:32 GMT
- Title: Cue-word Driven Neural Response Generation with a Shrinking Vocabulary
- Authors: Qiansheng Wang, Yuxin Liu, Chengguo Lv, Zhen Wang and Guohong Fu
- Abstract summary: We propose a novel but natural approach that produces multiple cue-words during decoding and then uses them to drive decoding while shrinking the decoding vocabulary.
Experimental results show that our approach significantly outperforms several strong baseline models with much lower decoding complexity.
- Score: 8.021536281277044
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open-domain response generation is the task of generating sensible and
informative responses to the source sentence. However, neural models tend to
generate safe and meaningless responses. While cue-word introducing approaches
encourage responses with concrete semantics and have shown tremendous
potential, they still fail to explore diverse responses during decoding. In
this paper, we propose a novel but natural approach that produces multiple
cue-words during decoding and then uses the produced cue-words to drive
decoding while shrinking the decoding vocabulary. Thus the neural generation
model can explore the full space of responses and discover informative ones
efficiently. Experimental results show that our approach significantly
outperforms several strong baseline models with much lower decoding complexity.
In particular, our approach converges to concrete semantics more efficiently
during decoding.
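
To make the decoding scheme concrete, here is a minimal, self-contained Python sketch of the general idea rather than the paper's actual model: a stand-in cue-word predictor proposes cue words from the source sentence, the decoding vocabulary is shrunk to tokens similar to those cue words, and a toy greedy decoder then searches only that reduced space. The toy embeddings, similarity threshold, and scoring function below are hypothetical placeholders, not the neural components described in the paper.

```python
# Illustrative sketch only: cue-word prediction followed by decoding over a
# shrunken vocabulary. All embeddings and scores are hand-made toy values.
import math
from typing import Dict, List

# Hypothetical 2-d "embeddings" for a tiny vocabulary.
EMBEDDINGS: Dict[str, List[float]] = {
    "<eos>": [0.0, 0.0],
    "i": [0.1, 0.1],
    "like": [0.2, 0.3],
    "hiking": [0.9, 0.8],
    "mountain": [0.95, 0.7],
    "trail": [0.85, 0.75],
    "food": [0.1, 0.9],
    "pizza": [0.15, 0.95],
}


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1e-8
    nb = math.sqrt(sum(x * x for x in b)) or 1e-8
    return dot / (na * nb)


def predict_cue_words(source: str, k: int = 2) -> List[str]:
    """Stand-in cue-word predictor: return the k vocabulary words closest to
    the mean embedding of the known source words (excluding the source words
    themselves)."""
    source_words = set(source.split())
    known = [EMBEDDINGS[w] for w in source_words if w in EMBEDDINGS]
    if not known:
        return []
    centroid = [sum(dim) / len(known) for dim in zip(*known)]
    ranked = sorted(
        (w for w in EMBEDDINGS if w not in source_words and w != "<eos>"),
        key=lambda w: -cosine(EMBEDDINGS[w], centroid),
    )
    return ranked[:k]


def shrink_vocabulary(cue_words: List[str], threshold: float = 0.9) -> List[str]:
    """Keep only tokens similar to at least one cue word, plus <eos>."""
    keep = {"<eos>"}
    for cue in cue_words:
        for word, emb in EMBEDDINGS.items():
            if cosine(emb, EMBEDDINGS[cue]) >= threshold:
                keep.add(word)
    return sorted(keep)


def decode(source: str, max_len: int = 3) -> List[str]:
    """Greedy decoding restricted to the shrunken vocabulary.  The scoring
    below (similarity to the nearest cue word) is a trivial stand-in for a
    trained generation model."""
    cue_words = predict_cue_words(source)
    allowed = shrink_vocabulary(cue_words) if cue_words else list(EMBEDDINGS)
    response: List[str] = []
    for _ in range(max_len):
        candidates = [w for w in allowed if w not in response]
        if not candidates:
            break
        best = max(
            candidates,
            key=lambda w: max(cosine(EMBEDDINGS[w], EMBEDDINGS[c]) for c in cue_words)
            if cue_words
            else 0.0,
        )
        if best == "<eos>":
            break
        response.append(best)
    return response


if __name__ == "__main__":
    # For this toy source the predictor picks "trail"/"mountain" as cue words,
    # and decoding only ever considers the handful of tokens related to them.
    print(decode("i like hiking"))
```

The point of the sketch is only the control flow: cue words are produced first, and each decoding step then operates over a much smaller candidate set, which is where the lower decoding complexity claimed in the abstract comes from.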
Related papers
- Interpretable Language Modeling via Induction-head Ngram Models [74.26720927767398]
We propose Induction-head ngram models (Induction-Gram) to bolster modern ngram models with a hand-engineered "induction head".
This induction head uses a custom neural similarity metric to efficiently search the model's input context for potential next-word completions.
Experiments show that this simple method significantly improves next-word prediction over baseline interpretable models.
arXiv Detail & Related papers (2024-10-31T12:33:26Z) - Neural paraphrasing by automatically crawled and aligned sentence pairs [11.95795974003684]
The main obstacle toward neural-network-based paraphrasing is the lack of large datasets with aligned pairs of sentences and paraphrases.
We present a method for the automatic generation of large aligned corpora, based on the assumption that news and blog websites talk about the same events using different narrative styles.
We propose a similarity search procedure with linguistic constraints that, given a reference sentence, is able to locate the most similar candidate paraphrases out from millions of indexed sentences.
arXiv Detail & Related papers (2024-02-16T10:40:38Z) - Self-consistent context aware conformer transducer for speech recognition [0.06008132390640294]
We introduce a novel neural network module that adeptly handles recursive data flow in neural network architectures.
Our method notably improves the accuracy of recognizing rare words without adversely affecting the word error rate for common vocabulary.
Our findings reveal that the combination of both approaches can improve the accuracy of detecting rare words by as much as 4.5 times.
arXiv Detail & Related papers (2024-02-09T18:12:11Z) - Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization [76.57699934689468]
We propose a fine-grained Token-level retrieval-augmented mechanism (Tram) on the decoder side to enhance the performance of neural models.
To overcome the challenge of token-level retrieval in capturing contextual code semantics, we also propose integrating code semantics into individual summary tokens.
arXiv Detail & Related papers (2023-05-18T16:02:04Z) - Surrogate Gradient Spiking Neural Networks as Encoders for Large Vocabulary Continuous Speech Recognition [91.39701446828144]
We show that spiking neural networks can be trained like standard recurrent neural networks using the surrogate gradient method.
They have shown promising results on speech command recognition tasks.
In contrast to their recurrent non-spiking counterparts, they show robustness to exploding gradient problems without the need to use gates.
arXiv Detail & Related papers (2022-12-01T12:36:26Z) - Twist Decoding: Diverse Generators Guide Each Other [116.20780037268801]
We introduce Twist decoding, a simple and general inference algorithm that generates text while benefiting from diverse models.
Our method does not assume the vocabulary, tokenization or even generation order is shared.
arXiv Detail & Related papers (2022-05-19T01:27:53Z) - Improving Response Quality with Backward Reasoning in Open-domain Dialogue Systems [53.160025961101354]
We propose to train the generation model in a bidirectional manner by adding a backward reasoning step to the vanilla encoder-decoder training.
The proposed backward reasoning step pushes the model to produce more informative and coherent content.
Our method can improve response quality without introducing side information.
arXiv Detail & Related papers (2021-04-30T20:38:27Z) - Deep Recurrent Encoder: A scalable end-to-end network to model brain signals [122.1055193683784]
We propose an end-to-end deep learning architecture trained to predict the brain responses of multiple subjects at once.
We successfully test this approach on a large cohort of magnetoencephalography (MEG) recordings acquired during a one-hour reading task.
arXiv Detail & Related papers (2021-03-03T11:39:17Z) - Predict and Use Latent Patterns for Short-Text Conversation [5.757975605648179]
We propose to use more detailed semantic forms, including latent responses and part-of-speech sequences, as the controllable semantics to guide the generation.
Our results show that the richer semantics are not only able to provide informative and diverse responses, but also increase the overall performance of response quality.
arXiv Detail & Related papers (2020-10-27T01:31:42Z)