A Financial Service Chatbot based on Deep Bidirectional Transformers
- URL: http://arxiv.org/abs/2003.04987v1
- Date: Mon, 17 Feb 2020 18:48:55 GMT
- Title: A Financial Service Chatbot based on Deep Bidirectional Transformers
- Authors: Shi Yu, Yuxin Chen, Hussain Zaidi
- Abstract summary: We use Deep Bidirectional Transformer models (BERT) to handle client questions in financial investment customer service.
The bot recognizes 381 intents, decides when to say "I don't know", and escalates irrelevant or uncertain questions to human operators.
Another novel contribution is the usage of BERT as a language model in automatic spelling correction.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a chatbot using Deep Bidirectional Transformer models (BERT) to
handle client questions in financial investment customer service. The bot
recognizes 381 intents, decides when to say "I don't know", and escalates
irrelevant or uncertain questions to human operators. Our main novel
contribution is a discussion of uncertainty measures for BERT, in which three
different approaches are systematically compared on real problems. We investigated two
uncertainty metrics, information entropy and variance of dropout sampling in
BERT, followed by mixed-integer programming to optimize decision thresholds.
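The two uncertainty metrics just mentioned can be sketched as follows. This is a minimal illustration, assuming a classifier that exposes softmax probabilities and supports stochastic forward passes with dropout left active at inference time; `predict_fn`, `entropy_threshold`, and `n_samples` are illustrative names, not the paper's API:

```python
import math

def entropy(probs):
    """Information entropy (in nats) of a predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def dropout_variance(predict_fn, x, n_samples=20):
    """Variance of the top-class probability across stochastic forward
    passes (Monte Carlo dropout). predict_fn must keep dropout active
    at inference time for the samples to differ."""
    samples = [predict_fn(x) for _ in range(n_samples)]
    # Top class = highest summed probability across samples.
    top = max(range(len(samples[0])), key=lambda i: sum(s[i] for s in samples))
    vals = [s[top] for s in samples]
    mean = sum(vals) / n_samples
    return sum((v - mean) ** 2 for v in vals) / n_samples

def decide(probs, entropy_threshold):
    """Escalate to a human operator when entropy exceeds a tuned threshold."""
    return "escalate" if entropy(probs) > entropy_threshold else "answer"
```

The thresholds themselves are what the paper optimizes with mixed-integer programming; here `entropy_threshold` is simply a given constant.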
Another novel contribution is the usage of BERT as a language model in
automatic spelling correction. Inputs with accidental spelling errors can
significantly decrease intent classification performance. The proposed approach
combines probabilities from the masked language model with word edit distances to
find the best corrections for misspelled words. The chatbot and the entire
conversational AI system are developed using open-source tools, and deployed
within our company's intranet. The proposed approach can be useful for
industries seeking similar in-house solutions in their specific business
domains. We share all our code and a sample chatbot built on a public dataset
on GitHub.
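As a concrete illustration of the spelling-correction idea described above, the sketch below scores candidate corrections by combining a masked-language-model probability with Levenshtein edit distance. `lm_prob` is a stand-in for a real BERT masked-token scorer, and `alpha` and the candidate list are illustrative assumptions, not the authors' implementation:

```python
import math

def edit_distance(a, b):
    """Levenshtein distance via single-row dynamic programming."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            # dp[j]: old row (deletion), dp[j-1]: new row (insertion),
            # prev: old diagonal (substitution or match).
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[len(b)]

def best_correction(misspelled, candidates, lm_prob, alpha=1.0):
    """Pick the candidate maximizing log LM probability minus a penalty
    proportional to edit distance from the observed word."""
    return max(candidates,
               key=lambda w: math.log(lm_prob(w)) - alpha * edit_distance(misspelled, w))
```

With a uniform `lm_prob`, the edit-distance penalty dominates and the closest candidate wins; with a real masked-LM scorer, contextually likely words can outweigh a slightly larger edit distance.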
Related papers
- Automatically generating decision-support chatbots based on DMN models [0.0]
We propose an approach for the automatic generation of fully functional, ready-to-use decision-support chatbots based on a DMN decision model.
With the aim of reducing chatbot development time and allowing non-technical users to develop chatbots specific to their domain, all necessary phases were implemented in the Demabot tool.
arXiv Detail & Related papers (2024-05-15T18:13:09Z)
- Measuring and Controlling Instruction (In)Stability in Language Model Dialogs [72.38330196290119]
System-prompting is a tool for customizing language-model chatbots, enabling them to follow a specific instruction.
We propose a benchmark to test this assumption, evaluating instruction stability via self-chats.
We reveal significant instruction drift within eight rounds of conversation.
We propose a lightweight method called split-softmax, which compares favorably against two strong baselines.
arXiv Detail & Related papers (2024-02-13T20:10:29Z)
- Deep Learning Based Amharic Chatbot for FAQs in Universities [0.0]
This paper proposes a model that answers frequently asked questions (FAQs) in the Amharic language.
The proposed program employs tokenization, stop word removal, and stemming to analyze and categorize Amharic input sentences.
The model was integrated with Facebook Messenger and deployed on a Heroku server for 24-hour accessibility.
arXiv Detail & Related papers (2024-01-26T18:37:21Z)
- Retrieval and Generative Approaches for a Pregnancy Chatbot in Nepali with Stemmed and Non-Stemmed Data: A Comparative Study [0.0]
The performance on Nepali-language datasets has been analyzed for each approach.
BERT-based pre-trained models perform well on non-stemmed data whereas scratch transformer models have better performance on stemmed data.
arXiv Detail & Related papers (2023-11-12T17:16:46Z)
- One System to Rule them All: a Universal Intent Recognition System for Customer Service Chatbots [0.0]
We propose the development of a universal intent recognition system.
It is trained to recognize a selected group of 11 intents common in 28 different chatbots.
The proposed system is able to discriminate between those universal intents with a balanced accuracy of up to 80.4%.
arXiv Detail & Related papers (2021-12-15T16:45:55Z)
- Error Detection in Large-Scale Natural Language Understanding Systems Using Transformer Models [0.0]
Large-scale conversational assistants like Alexa, Siri, Cortana and Google Assistant process every utterance using multiple models for domain, intent and named entity recognition.
We address this challenge to detect domain classification errors using offline Transformer models.
We combine utterance encodings from a RoBERTa model with the N-best hypotheses produced by the production system, then fine-tune end-to-end in a multitask setting using a small dataset of human-annotated utterances with domain classification errors.
arXiv Detail & Related papers (2021-09-04T00:10:48Z)
- Bayesian Transformer Language Models for Speech Recognition [59.235405107295655]
State-of-the-art neural language models (LMs) represented by Transformers are highly complex.
This paper proposes a full Bayesian learning framework for Transformer LM estimation.
arXiv Detail & Related papers (2021-02-09T10:55:27Z)
- Conditioned Text Generation with Transfer for Closed-Domain Dialogue Systems [65.48663492703557]
We show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder.
We introduce a new protocol called query transfer that allows leveraging a large unlabelled dataset.
arXiv Detail & Related papers (2020-11-03T14:06:10Z)
- Detection of Novel Social Bots by Ensembles of Specialized Classifiers [60.63582690037839]
Malicious actors create inauthentic social media accounts controlled in part by algorithms, known as social bots, to disseminate misinformation and agitate online discussion.
We show that different types of bots are characterized by different behavioral features.
We propose a new supervised learning method that trains classifiers specialized for each class of bots and combines their decisions through the maximum rule.
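The maximum rule described above can be sketched in a few lines; this is a minimal, hypothetical illustration in which each specialized classifier reports a confidence for its own bot class and the ensemble picks the class whose specialist is most confident (`specialist_scores` is an illustrative name):

```python
def max_rule(specialist_scores):
    """Combine specialized classifiers via the maximum rule.
    specialist_scores: dict mapping bot class -> that specialist's
    confidence in [0, 1]. Returns the class with the highest score."""
    return max(specialist_scores, key=specialist_scores.get)
```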
arXiv Detail & Related papers (2020-06-11T22:59:59Z)
- On the Robustness of Language Encoders against Grammatical Errors [66.05648604987479]
We collect real grammatical errors from non-native speakers and conduct adversarial attacks to simulate these errors on clean text data.
Results confirm that the performance of all tested models is affected but the degree of impact varies.
arXiv Detail & Related papers (2020-05-12T11:01:44Z)
- The Right Tool for the Job: Matching Model and Instance Complexities [62.95183777679024]
As NLP models become larger, executing a trained model requires significant computational resources, incurring monetary and environmental costs.
We propose a modification to contextual representation fine-tuning which, during inference, allows for an early (and fast) "exit"
We test our proposed modification on five different datasets in two tasks: three text classification datasets and two natural language inference benchmarks.
arXiv Detail & Related papers (2020-04-16T04:28:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.