A Conditional Generative Chatbot using Transformer Model
- URL: http://arxiv.org/abs/2306.02074v2
- Date: Fri, 8 Sep 2023 17:15:37 GMT
- Title: A Conditional Generative Chatbot using Transformer Model
- Authors: Nura Esfandiari, Kourosh Kiani, Razieh Rastgoo
- Abstract summary: In this paper, a novel architecture is proposed using conditional Wasserstein Generative Adversarial Networks and a transformer model for answer generation.
To the best of our knowledge, this is the first time that a generative Chatbot is proposed using the embedded transformer in both the generator and discriminator models.
The results of the proposed model on the Cornell Movie-Dialog corpus and the Chit-Chat datasets confirm the superiority of the proposed model compared to state-of-the-art alternatives.
- Score: 30.613612803419294
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A Chatbot serves as a communication tool between a human user and a machine
to achieve an appropriate answer based on the human input. In more recent
approaches, a combination of Natural Language Processing and sequential models
is used to build a generative Chatbot. The main challenge of these models is
their sequential nature, which leads to less accurate results. To tackle this
challenge, in this paper, a novel architecture is proposed using conditional
Wasserstein Generative Adversarial Networks and a transformer model for answer
generation in Chatbots. While the generator of the proposed model consists of a
full transformer model to generate an answer, the discriminator includes only
the encoder part of a transformer model followed by a classifier. To the best
of our knowledge, this is the first time that a generative Chatbot is proposed
using the embedded transformer in both generator and discriminator models.
Thanks to the parallel computation of the transformer model, the results on the
Cornell Movie-Dialog corpus and the Chit-Chat datasets confirm the superiority
of the proposed model over state-of-the-art alternatives across different
evaluation metrics.
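The described architecture can be illustrated with a short sketch. The PyTorch code below is a minimal, hypothetical rendering of the setup in the abstract, not the authors' released implementation; the vocabulary size, model width, layer counts, and mean-pooling choice are assumptions, and positional encodings and masking are omitted for brevity. A full encoder-decoder transformer plays the role of the generator, while the discriminator stacks a transformer encoder with a linear head acting as the Wasserstein critic.

```python
import torch
import torch.nn as nn

VOCAB, D_MODEL = 32000, 512  # hypothetical vocabulary size and model width


class Generator(nn.Module):
    """Full encoder-decoder transformer: question tokens in, answer logits out."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=8,
            num_encoder_layers=4, num_decoder_layers=4,
            batch_first=True,
        )
        self.out = nn.Linear(D_MODEL, VOCAB)

    def forward(self, question_ids, answer_ids):
        source = self.embed(question_ids)   # condition: the user question
        target = self.embed(answer_ids)     # shifted answer tokens
        hidden = self.transformer(source, target)
        return self.out(hidden)             # token logits for the generated answer


class Discriminator(nn.Module):
    """Transformer encoder + classifier: scores a (question, answer) pair."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        layer = nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.score = nn.Linear(D_MODEL, 1)  # Wasserstein critic output (unbounded)

    def forward(self, pair_ids):
        encoded = self.encoder(self.embed(pair_ids))
        return self.score(encoded.mean(dim=1))  # pool over tokens, then score
```

In a WGAN-style training loop, the critic's scalar output would be pushed up for real question-answer pairs and down for generated ones, with the generator trained to do the opposite; the conditioning comes from feeding the question to both networks.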
Related papers
- Distinguishing Chatbot from Human [1.1249583407496218]
We develop a new dataset consisting of more than 750,000 human-written paragraphs.
Based on this dataset, we apply Machine Learning (ML) techniques to determine the origin of text.
Our proposed solutions offer high classification accuracy and serve as useful tools for textual analysis.
arXiv Detail & Related papers (2024-08-03T13:18:04Z) - Representing Rule-based Chatbots with Transformers [35.30128900987116]
We build on prior work by constructing a Transformer that implements the ELIZA program.
ELIZA illustrates some of the distinctive challenges of the conversational setting.
We train Transformers on a dataset of synthetically generated ELIZA conversations and investigate the mechanisms the models learn.
arXiv Detail & Related papers (2024-07-15T17:45:53Z) - Computational Argumentation-based Chatbots: a Survey [0.4024850952459757]
The present survey sifts through the literature to review papers concerning this kind of argumentation-based bot.
It draws conclusions about the drawbacks and benefits of this approach.
It also envisages possible future developments and integration with Transformer-based architectures and state-of-the-art Large Language Models.
arXiv Detail & Related papers (2024-01-07T11:20:42Z) - Probabilistic Transformer: A Probabilistic Dependency Model for
Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively with transformers on small to medium-sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z) - Transformer Based Bengali Chatbot Using General Knowledge Dataset [0.0]
In this research, we applied the transformer model to a Bengali general knowledge chatbot based on the Bengali general knowledge Question Answer (QA) dataset.
It scores 85.0 BLEU on the applied QA data. For comparison, we also trained a seq2seq model with attention on our dataset, which scores 23.5 BLEU.
arXiv Detail & Related papers (2021-11-06T18:33:20Z) - Sentence Bottleneck Autoencoders from Transformer Language Models [53.350633961266375]
We build a sentence-level autoencoder from a pretrained, frozen transformer language model.
We adapt the masked language modeling objective as a generative, denoising one, while only training a sentence bottleneck and a single-layer modified transformer decoder.
We demonstrate that the sentence representations discovered by our model achieve better quality than previous methods that extract representations from pretrained transformers on text similarity tasks, style transfer, and single-sentence classification tasks in the GLUE benchmark, while using fewer parameters than large pretrained models.
arXiv Detail & Related papers (2021-08-31T19:39:55Z) - Auto-tagging of Short Conversational Sentences using Natural Language
Processing Methods [0.0]
We manually tagged approximately 14 thousand visitor inputs into ten basic categories.
We considered three different state-of-the-art models and reported their auto-tagging capabilities.
Implementation of the models used in these experiments can be cloned from our GitHub repository and tested for similar auto-tagging problems without much effort.
arXiv Detail & Related papers (2021-06-09T10:14:05Z) - Parameter Efficient Multimodal Transformers for Video Representation
Learning [108.8517364784009]
This work focuses on reducing the parameters of multimodal Transformers in the context of audio-visual video representation learning.
We show that our approach reduces the number of parameters by up to 80%, allowing us to train our model end-to-end from scratch.
To demonstrate our approach, we pretrain our model on 30-second clips from Kinetics-700 and transfer it to audio-visual classification tasks.
arXiv Detail & Related papers (2020-12-08T00:16:13Z) - Long Range Arena: A Benchmark for Efficient Transformers [115.1654897514089]
The Long-Range Arena benchmark is a suite of tasks consisting of sequences ranging from 1K to 16K tokens.
We systematically evaluate ten well-established long-range Transformer models on our newly proposed benchmark suite.
arXiv Detail & Related papers (2020-11-08T15:53:56Z) - Investigation of Sentiment Controllable Chatbot [50.34061353512263]
In this paper, we investigate four models to scale or adjust the sentiment of the response.
The models are a persona-based model, reinforcement learning, a plug and play model, and CycleGAN.
We develop machine-evaluated metrics to estimate whether the responses are reasonable given the input.
arXiv Detail & Related papers (2020-07-11T16:04:30Z) - Variational Transformers for Diverse Response Generation [71.53159402053392]
Variational Transformer (VT) is a variational self-attentive feed-forward sequence model.
VT combines the parallelizability and global receptive field computation of the Transformer with the variational nature of the CVAE.
We explore two types of VT: 1) modeling the discourse-level diversity with a global latent variable; and 2) augmenting the Transformer decoder with a sequence of fine-grained latent variables.
arXiv Detail & Related papers (2020-03-28T07:48:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.