Investigation of Sentiment Controllable Chatbot
- URL: http://arxiv.org/abs/2007.07196v1
- Date: Sat, 11 Jul 2020 16:04:30 GMT
- Title: Investigation of Sentiment Controllable Chatbot
- Authors: Hung-yi Lee, Cheng-Hao Ho, Chien-Fu Lin, Chiung-Chih Chang, Chih-Wei
Lee, Yau-Shian Wang, Tsung-Yuan Hsu and Kuan-Yu Chen
- Abstract summary: In this paper, we investigate four models to scale or adjust the sentiment of the response.
The models are a persona-based model, reinforcement learning, a plug and play model, and CycleGAN.
We develop machine-evaluated metrics to estimate whether the responses are reasonable given the input.
- Score: 50.34061353512263
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional seq2seq chatbot models attempt only to find sentences with the
highest probabilities conditioned on the input sequences, without considering
the sentiment of the output sentences. In this paper, we investigate four
models to scale or adjust the sentiment of the chatbot response: a
persona-based model, reinforcement learning, a plug and play model, and
CycleGAN, all based on the seq2seq model. We also develop machine-evaluated
metrics to estimate whether the responses are reasonable given the input. These
metrics, together with human evaluation, are used to analyze the performance of
the four models in terms of different aspects; reinforcement learning and
CycleGAN are shown to be very attractive.
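As a rough illustration of the persona-based variant above, the sketch below conditions a toy seq2seq decoder on a sentiment embedding; module names and sizes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the persona-based idea:
# a seq2seq decoder conditioned on a sentiment "persona" embedding.
import torch
import torch.nn as nn

class SentimentSeq2Seq(nn.Module):
    def __init__(self, vocab_size=1000, hidden=256, n_sentiments=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.sent_embed = nn.Embedding(n_sentiments, hidden)  # sentiment label
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden * 2, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src, tgt, sentiment):
        _, h = self.encoder(self.embed(src))            # encode the input turn
        s = self.sent_embed(sentiment)                  # (batch, hidden)
        s = s.unsqueeze(1).expand(-1, tgt.size(1), -1)  # broadcast over steps
        dec_in = torch.cat([self.embed(tgt), s], dim=-1)
        y, _ = self.decoder(dec_in, h)
        return self.out(y)                              # logits per token

model = SentimentSeq2Seq()
src = torch.randint(0, 1000, (4, 10))   # toy batch of input token ids
tgt = torch.randint(0, 1000, (4, 12))
logits = model(src, tgt, torch.tensor([1, 0, 1, 1]))  # 1 = positive sentiment
print(logits.shape)  # torch.Size([4, 12, 1000])
```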
Related papers
- A Conditional Generative Chatbot using Transformer Model [30.613612803419294]
In this paper, a novel architecture is proposed using conditional Wasserstein Generative Adversarial Networks and a transformer model for answer generation.
To the best of our knowledge, this is the first time that a generative chatbot is proposed using the embedded transformer in both generator and discriminator models.
The results of the proposed model on the Cornell Movie-Dialog corpus and the Chit-Chat datasets confirm the superiority of the proposed model compared to state-of-the-art alternatives.
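A hedged sketch of the conditional Wasserstein objective such a model optimizes is given below; for simplicity it scores continuous pooled embeddings rather than discrete text, which is an assumption made here, not the paper's exact setup.

```python
# Toy conditional WGAN losses: the critic scores (query, response) pairs,
# and the generator is trained to raise the critic's score on its samples.
import torch
import torch.nn as nn

hidden = 64
critic = nn.Sequential(nn.Linear(hidden * 2, 128), nn.ReLU(), nn.Linear(128, 1))

def wgan_losses(query_vec, real_resp_vec, fake_resp_vec):
    """query_vec conditions the critic; responses are pooled embeddings."""
    real_score = critic(torch.cat([query_vec, real_resp_vec], dim=-1))
    fake_score = critic(torch.cat([query_vec, fake_resp_vec], dim=-1))
    critic_loss = fake_score.mean() - real_score.mean()  # Wasserstein estimate
    gen_loss = -fake_score.mean()                        # generator fools critic
    return critic_loss, gen_loss

q = torch.randn(8, hidden)
c_loss, g_loss = wgan_losses(q, torch.randn(8, hidden), torch.randn(8, hidden))
print(c_loss.item(), g_loss.item())
```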
arXiv Detail & Related papers (2023-06-03T10:35:04Z) - Enhancing Self-Consistency and Performance of Pre-Trained Language
Models through Natural Language Inference [72.61732440246954]
Large pre-trained language models often lack logical consistency across test inputs.
We propose a framework, ConCoRD, for boosting the consistency and accuracy of pre-trained NLP models.
We show that ConCoRD consistently boosts accuracy and consistency of off-the-shelf closed-book QA and VQA models.
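The core re-ranking idea can be sketched as follows; `nli_relation` is a hypothetical stand-in for any off-the-shelf NLI model, and the paper's full method resolves consistency more carefully than this greedy toy.

```python
# Greedy sketch: reward candidate answers whose NLI relations to the
# other candidates are consistent, then pick the best combined score.
def nli_relation(premise, hypothesis):
    """Placeholder: return +1 (entails), -1 (contradicts), or 0 (neutral)."""
    return 0  # plug a real NLI model in here

def rerank(candidates, weight=0.5):
    """candidates: list of (answer_text, model_confidence) pairs."""
    scored = []
    for text, conf in candidates:
        # Sum pairwise relations to every other candidate answer.
        agree = sum(nli_relation(text, other) for other, _ in candidates
                    if other != text)
        scored.append((conf + weight * agree, text))
    return max(scored)[1]

print(rerank([("Paris is in France.", 0.9), ("Paris is in Italy.", 0.4)]))
```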
arXiv Detail & Related papers (2022-11-21T21:58:30Z) - Quark: Controllable Text Generation with Reinforced Unlearning [68.07749519374089]
Large-scale language models often learn behaviors that are misaligned with user expectations.
We introduce Quantized Reward Konditioning (Quark), an algorithm for optimizing a reward function that quantifies an (un)wanted property.
For unlearning toxicity, negative sentiment, and repetition, our experiments show that Quark outperforms both strong baselines and state-of-the-art reinforcement learning methods.
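A minimal sketch of the quantized-reward-conditioning step, as the abstract describes it: bucket sampled generations by reward quantile and prepend a control token before fine-tuning, so the best token can be requested at inference time. The token format and reward values here are illustrative assumptions.

```python
# Quantize rewards into buckets and tag each sample with its bucket token.
import numpy as np

def quantize_rewards(samples, rewards, n_quantiles=5):
    """Attach a control token <r_k> to each sample by reward quantile."""
    edges = np.quantile(rewards, np.linspace(0, 1, n_quantiles + 1)[1:-1])
    buckets = np.digitize(rewards, edges)      # 0 = worst ... n-1 = best
    return [(f"<r_{b}> {s}", b) for s, b in zip(samples, buckets)]

samples = ["resp a", "resp b", "resp c", "resp d", "resp e"]
rewards = [0.1, 0.9, 0.4, 0.7, 0.2]            # e.g. non-toxicity scores
for tagged, bucket in quantize_rewards(samples, rewards):
    print(bucket, tagged)
# At inference, prepend the highest token ("<r_4>") to steer generation.
```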
arXiv Detail & Related papers (2022-05-26T21:11:51Z) - Transformer Based Bengali Chatbot Using General Knowledge Dataset [0.0]
In this research, we applied the transformer model to a Bengali general-knowledge chatbot, trained on a Bengali general-knowledge Question Answer (QA) dataset.
It scores 85.0 BLEU on the applied QA data. For comparison, we also trained a seq2seq model with attention on the same dataset, which scores 23.5 BLEU.
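BLEU figures on this scale come from standard scorers; below is a generic sketch using sacrebleu (not the authors' evaluation script), with toy sentences standing in for the real data.

```python
# Corpus-level BLEU between model hypotheses and reference answers.
import sacrebleu

hypotheses = ["dhaka is the capital of bangladesh"]
references = [["dhaka is the capital city of bangladesh"]]  # one ref stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")  # scores like 85.0 vs 23.5 use this scale
```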
arXiv Detail & Related papers (2021-11-06T18:33:20Z) - Turning Tables: Generating Examples from Semi-structured Tables for
Endowing Language Models with Reasoning Skills [32.55545292360155]
We propose to leverage semi-structured tables, and automatically generate at scale question-paragraph pairs.
We add a pre-training step over this synthetic data, which includes examples that require 16 different reasoning skills.
We show that our model, PReasM, substantially outperforms T5, a popular pre-trained encoder-decoder model.
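A toy version of the generation step might look like the following; the comparison template is an assumption made here, and the paper covers many reasoning skills and templates.

```python
# Turn table rows into templated question-paragraph pairs exercising
# one reasoning skill (numeric comparison).
table = [("Everest", 8849), ("K2", 8611), ("Kangchenjunga", 8586)]

def comparison_examples(rows):
    pairs = []
    for (a, ha), (b, hb) in zip(rows, rows[1:]):
        paragraph = f"{a} is {ha} m high. {b} is {hb} m high."
        question = f"Which is higher, {a} or {b}?"
        answer = a if ha > hb else b
        pairs.append((question, paragraph, answer))
    return pairs

for q, p, ans in comparison_examples(table):
    print(q, "|", p, "->", ans)
# Such synthetic pairs feed an extra pre-training step before QA fine-tuning.
```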
arXiv Detail & Related papers (2021-07-15T11:37:14Z) - STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model achieves comparable performance while using far fewer trainable parameters, with high speed in training and inference.
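A minimal, hedged sketch of sparse attention in this flavor (not the STAR implementation): a fixed adjacency mask restricts which skeleton joints may attend to each other, cutting compute versus full attention. The mask and sizes are assumptions for demonstration.

```python
# Masked attention over skeleton joints: only adjacent joints interact.
import torch
import torch.nn.functional as F

n_joints, d = 5, 16
q = k = v = torch.randn(1, n_joints, d)

# Allow attention only along a chain of physically adjacent joints.
mask = torch.eye(n_joints, dtype=torch.bool)
for i in range(n_joints - 1):
    mask[i, i + 1] = mask[i + 1, i] = True

scores = q @ k.transpose(-2, -1) / d ** 0.5
scores = scores.masked_fill(~mask, float("-inf"))  # drop non-adjacent pairs
out = F.softmax(scores, dim=-1) @ v
print(out.shape)  # torch.Size([1, 5, 16])
```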
arXiv Detail & Related papers (2021-07-15T02:53:11Z) - Is Automated Topic Model Evaluation Broken?: The Incoherence of
Coherence [62.826466543958624]
We look at the standardization gap and the validation gap in topic model evaluation.
Recent models relying on neural components surpass classical topic models according to these metrics.
We use automatic coherence along with the two most widely accepted human judgment tasks, namely, topic rating and word intrusion.
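Automatic coherence of the kind under scrutiny can be computed with gensim; a minimal sketch on a toy corpus follows (the corpus and topics are illustrative, not the paper's data).

```python
# NPMI coherence of candidate topics against a reference corpus.
from gensim.corpora import Dictionary
from gensim.models.coherencemodel import CoherenceModel

texts = [["cat", "dog", "pet"], ["dog", "leash", "walk"],
         ["stock", "market", "trade"]]
topics = [["cat", "dog", "pet"], ["stock", "market", "trade"]]

dictionary = Dictionary(texts)
cm = CoherenceModel(topics=topics, texts=texts, dictionary=dictionary,
                    coherence="c_npmi")  # a common automatic metric
print(cm.get_coherence())
```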
arXiv Detail & Related papers (2021-07-05T17:58:52Z) - What do we expect from Multiple-choice QA Systems? [70.86513724662302]
We consider a top-performing model on several Multiple Choice Question Answering (MCQA) datasets.
We evaluate it against a set of expectations one might have from such a model, using a series of zero-information perturbations of the model's inputs.
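A sketch of such zero-information probes is below; `predict` is a hypothetical stand-in for any MCQA model.

```python
# Two probes: does the answer survive option shuffling (it should),
# and does the model still answer with the question removed (it should not)?
import random

def predict(question, options):
    """Placeholder MCQA model: returns the index of the chosen option."""
    return 0  # plug a real model in here

def probe(question, options):
    base = predict(question, options)
    shuffled = options[:]                  # same information, new order
    random.shuffle(shuffled)
    order_ok = shuffled.index(options[base]) == predict(question, shuffled)
    no_q = predict("", options)            # question removed entirely
    return {"order_invariant": order_ok,
            "same_answer_without_question": no_q == base}

print(probe("What is 2+2?", ["4", "3", "5"]))
```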
arXiv Detail & Related papers (2020-11-20T21:27:10Z) - Chatbot Interaction with Artificial Intelligence: Human Data
Augmentation with T5 and Language Transformer Ensemble for Text
Classification [2.492300648514128]
We present the Chatbot Interaction with Artificial Intelligence (CI-AI) framework as an approach to the training of deep learning chatbots for task classification.
The intelligent system augments human-sourced data via artificial paraphrasing in order to generate a large set of training data.
We find that all models are improved when training data is augmented by the T5 model.
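A hedged sketch of T5-based paraphrase augmentation in this spirit; the checkpoint name is an assumption, and any T5 model fine-tuned for paraphrasing would serve.

```python
# Generate paraphrases of human-sourced commands to enlarge training data.
from transformers import T5ForConditionalGeneration, T5Tokenizer

name = "Vamsi/T5_Paraphrase_Paws"  # assumed public paraphrase checkpoint
tok = T5Tokenizer.from_pretrained(name)
model = T5ForConditionalGeneration.from_pretrained(name)

def augment(command, n=3):
    """Return n sampled paraphrases of a user command."""
    ids = tok("paraphrase: " + command, return_tensors="pt").input_ids
    outs = model.generate(ids, num_return_sequences=n, do_sample=True,
                          top_p=0.95, max_length=32)
    return [tok.decode(o, skip_special_tokens=True) for o in outs]

print(augment("turn on the living room lights"))
```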
arXiv Detail & Related papers (2020-10-12T19:37:18Z)