Related papers: Improving a sequence-to-sequence nlp model using a reinforcement learning policy algorithm

Improving a sequence-to-sequence nlp model using a reinforcement learning policy algorithm

URL: http://arxiv.org/abs/2212.14117v1
Date: Wed, 28 Dec 2022 22:46:57 GMT
Title: Improving a sequence-to-sequence nlp model using a reinforcement learning policy algorithm
Authors: Jabri Ismail, Aboulbichr Ahmed and El ouaazizi Aziza
Abstract summary: Current neural network models of dialogue generation show great promise for generating answers for chatty agents. But they are short-sighted in that they predict utterances one at a time while disregarding their impact on future outcomes. This work commemorates a preliminary step toward developing a neural conversational model based on the long-term success of dialogues.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Nowadays, the current neural network models of dialogue generation(chatbots) show great promise for generating answers for chatty agents. But they are short-sighted in that they predict utterances one at a time while disregarding their impact on future outcomes. Modelling a dialogue's future direction is critical for generating coherent, interesting dialogues, a need that has led traditional NLP dialogue models that rely on reinforcement learning. In this article, we explain how to combine these objectives by using deep reinforcement learning to predict future rewards in chatbot dialogue. The model simulates conversations between two virtual agents, with policy gradient methods used to reward sequences that exhibit three useful conversational characteristics: the flow of informality, coherence, and simplicity of response (related to forward-looking function). We assess our model based on its diversity, length, and complexity with regard to humans. In dialogue simulation, evaluations demonstrated that the proposed model generates more interactive responses and encourages a more sustained successful conversation. This work commemorates a preliminary step toward developing a neural conversational model based on the long-term success of dialogues.

Related papers

Aligning Spoken Dialogue Models from User Interactions [55.192134724622235]
We propose a novel preference alignment framework to improve spoken dialogue models on realtime conversations from user interactions.<n>We create a dataset of more than 150,000 preference pairs from raw multi-turn speech conversations annotated with AI feedback.<n>Our findings shed light on the importance of a well-calibrated balance among various dynamics, crucial for natural real-time speech dialogue systems.
arXiv Detail & Related papers (2025-06-26T16:45:20Z)
SAGE: Steering and Refining Dialog Generation with State-Action Augmentation [9.95917154889491]
We present a novel approach called SAGE that uses latent variables to control long-horizon behavior in dialogue generation. At the core of our method is the State-Action Chain (SAC), which augments standard language model fine-tuning. We show that models trained with this approach demonstrate improved performance in emotional intelligence metrics.
arXiv Detail & Related papers (2025-03-04T22:45:24Z)
WavChat: A Survey of Spoken Dialogue Models [66.82775211793547]
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o, have captured significant attention in the speech domain. These advanced spoken dialogue models not only comprehend audio, music, and other speech-related features, but also capture stylistic and timbral characteristics in speech. Despite the progress in spoken dialogue systems, there is a lack of comprehensive surveys that systematically organize and analyze these systems.
arXiv Detail & Related papers (2024-11-15T04:16:45Z)
Yeah, Un, Oh: Continuous and Real-time Backchannel Prediction with Fine-tuning of Voice Activity Projection [24.71649541757314]
Short backchannel utterances such as "yeah" and "oh" play a crucial role in facilitating smooth and engaging dialogue. This paper proposes a novel method for real-time, continuous backchannel prediction using a fine-tuned Voice Activity Projection model.
arXiv Detail & Related papers (2024-10-21T11:57:56Z)
Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs. We employ domain-adaptive training strategies to help the model adapt to the dialogue domains. Experimental results show that our method substantially boosts the strong PrLM baselines in four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling [80.51094098799736]
We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder. BiDeN explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks. Experimental results on datasets of different downstream tasks demonstrate the universality and effectiveness of our BiDeN.
arXiv Detail & Related papers (2022-04-18T03:51:46Z)
Towards Robust Online Dialogue Response Generation [62.99904593650087]
We argue that this can be caused by a discrepancy between training and real-world testing. We propose a hierarchical sampling-based method consisting of both utterance-level sampling and semi-utterance-level sampling.
arXiv Detail & Related papers (2022-03-07T06:51:41Z)
Response Generation with Context-Aware Prompt Learning [19.340498579331555]
We present a novel approach for pre-trained dialogue modeling that casts the dialogue generation problem as a prompt-learning task. Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts. Our approach significantly outperforms the fine-tuning baseline and the generic prompt-learning methods.
arXiv Detail & Related papers (2021-11-04T05:40:13Z)
CloneBot: Personalized Dialogue-Response Predictions [0.0]
The project task was to create a model that, given a speaker ID, chat history, and an utterance query, can predict the response utterance in a conversation. The model is personalized for each speaker. This task can be a useful tool for building speech bots that talk in a human-like manner in a live conversation.
arXiv Detail & Related papers (2021-03-31T01:15:37Z)
Advances in Multi-turn Dialogue Comprehension: A Survey [51.215629336320305]
We review the previous methods from the perspective of dialogue modeling. We discuss three typical patterns of dialogue modeling that are widely-used in dialogue comprehension tasks.
arXiv Detail & Related papers (2021-03-04T15:50:17Z)
Refine and Imitate: Reducing Repetition and Inconsistency in Persuasion Dialogues via Reinforcement Learning and Human Demonstration [45.14559188965439]
We propose to apply reinforcement learning to refine an MLE-based language model without user simulators. We distill sentence-level information about repetition, inconsistency and task relevance through rewards. Experiments show that our model outperforms previous state-of-the-art dialogue models on both automatic metrics and human evaluation results.
arXiv Detail & Related papers (2020-12-31T00:02:51Z)
DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances [18.199473005335093]
This paper presents DialogBERT, a novel conversational response generation model that enhances previous PLM-based dialogue models. To efficiently capture the discourse-level coherence among utterances, we propose two training objectives, including masked utterance regression. Experiments on three multi-turn conversation datasets show that our approach remarkably outperforms the baselines.
arXiv Detail & Related papers (2020-12-03T09:06:23Z)
Ranking Enhanced Dialogue Generation [77.8321855074999]
How to effectively utilize the dialogue history is a crucial problem in multi-turn dialogue generation. Previous works usually employ various neural network architectures to model the history. This paper proposes a Ranking Enhanced Dialogue generation framework.
arXiv Detail & Related papers (2020-08-13T01:49:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.