Test Case Generation for Dialogflow Task-Based Chatbots
- URL: http://arxiv.org/abs/2503.05561v1
- Date: Fri, 07 Mar 2025 16:39:27 GMT
- Title: Test Case Generation for Dialogflow Task-Based Chatbots
- Authors: Rocco Gianni Rapisarda, Davide Ginelli, Diego Clerissi, Leonardo Mariani,
- Abstract summary: Chatbot Test Generator (CTG) is an automated testing technique designed for task-based chatbots. We conducted an experiment comparing CTG with the state-of-the-art BOTIUM and CHARM tools. CTG outperformed the competitors in terms of robustness and effectiveness.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Chatbots are software typically embedded in Web and Mobile applications designed to assist the user in a plethora of activities, from chit-chatting to task completion. They enable diverse forms of interactions, like text and voice commands. As any software, even chatbots are susceptible to bugs, and their pervasiveness in our lives, as well as the underlying technological advancements, call for tailored quality assurance techniques. However, test case generation techniques for conversational chatbots are still limited. In this paper, we present Chatbot Test Generator (CTG), an automated testing technique designed for task-based chatbots. We conducted an experiment comparing CTG with state-of-the-art BOTIUM and CHARM tools with seven chatbots, observing that the test cases generated by CTG outperformed the competitors, in terms of robustness and effectiveness.
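The abstract does not show CTG's output format, but a test case for a task-based chatbot is essentially a scripted conversation paired with expected outcomes. The sketch below replays such a script against a stubbed bot; the intent names, utterances, and the stub itself are all hypothetical, standing in for a deployed Dialogflow agent.

```python
# Hypothetical sketch: replaying a generated test case against a task-based
# chatbot. CTG's real output format and the Dialogflow API are not shown in
# the abstract, so every name here is illustrative.

def stub_chatbot(utterance):
    """Toy intent matcher standing in for a deployed Dialogflow agent."""
    intents = {
        "book a table": "reservation.create",
        "for two people": "reservation.set_guests",
        "cancel it": "reservation.cancel",
    }
    return intents.get(utterance.lower(), "fallback")

# A generated test case: a conversation paired with the expected intents.
test_case = [
    ("Book a table", "reservation.create"),
    ("For two people", "reservation.set_guests"),
    ("Cancel it", "reservation.cancel"),
]

def run_test_case(bot, steps):
    """Send each utterance and collect any mismatched intents."""
    failures = []
    for utterance, expected_intent in steps:
        actual = bot(utterance)
        if actual != expected_intent:
            failures.append((utterance, expected_intent, actual))
    return failures

failures = run_test_case(stub_chatbot, test_case)
```

A robustness-oriented generator would additionally perturb the utterances (typos, paraphrases) and check that the bot still resolves the intended intent rather than falling back.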
Related papers
- Seq2Seq Model-Based Chatbot with LSTM and Attention Mechanism for Enhanced User Interaction [1.937324318931008]
This work proposes a Sequence-to-Sequence (Seq2Seq) model with an encoder-decoder architecture that incorporates attention mechanisms and Long Short-Term Memory (LSTM) cells.
The proposed Seq2Seq model-based robot is trained, validated, and tested on a dataset specifically for the tourism sector in Draa-Tafilalet, Morocco.
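The attention mechanism named above can be illustrated without any framework: the decoder scores each encoder hidden state against its query, normalizes the scores with a softmax, and takes the weighted sum as a context vector. This is a minimal pure-Python sketch of dot-product attention; the vectors and dimensions are illustrative only.

```python
# Minimal sketch of dot-product attention as used in Seq2Seq decoders.
# Pure Python, no ML framework; all values are illustrative.
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, encoder_states):
    """Return (context, weights): softmax(q . h_i) weighted sum of states."""
    scores = [sum(q * h for q, h in zip(query, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[d] for w, state in zip(weights, encoder_states))
               for d in range(dim)]
    return context, weights

# The query is most similar to the first encoder state, so that state
# should dominate the context vector.
states = [[1.0, 0.0], [0.0, 1.0]]
ctx, w = attention([1.0, 0.0], states)
```

In a full Seq2Seq model the query would be the decoder's current LSTM hidden state, recomputed at every output step.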
arXiv Detail & Related papers (2024-12-27T23:50:54Z)
- Measuring and Controlling Instruction (In)Stability in Language Model Dialogs [72.38330196290119]
System-prompting is a tool for customizing language-model chatbots, enabling them to follow a specific instruction.
We propose a benchmark to test the assumption, evaluating instruction stability via self-chats.
We reveal a significant instruction drift within eight rounds of conversations.
We propose a lightweight method called split-softmax, which compares favorably against two strong baselines.
arXiv Detail & Related papers (2024-02-13T20:10:29Z)
- MutaBot: A Mutation Testing Approach for Chatbots [3.811067614153878]
MutaBot addresses mutations at multiple levels, including conversational flows, intents, and contexts.
We assess the tool with three Dialogflow chatbots and test cases generated with Botium, revealing weaknesses in the test suites.
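Mutation testing injects small faults into the chatbot and checks whether the test suite detects them. MutaBot's actual operators and the Dialogflow intent schema are not detailed in the summary, so the sketch below is an illustrative operator of the same spirit: generating one mutant per deleted training phrase.

```python
# Hedged sketch of a mutation operator in the spirit of MutaBot: deleting one
# training phrase at a time from a Dialogflow-style intent definition.
# The intent structure and names are illustrative, not the tool's real schema.
import copy

intent = {
    "name": "order.pizza",
    "training_phrases": ["I want a pizza", "order a pizza", "pizza please"],
}

def drop_phrase_mutants(intent_def):
    """Yield one mutant per training phrase, each with that phrase removed."""
    for i in range(len(intent_def["training_phrases"])):
        mutant = copy.deepcopy(intent_def)
        del mutant["training_phrases"][i]
        yield mutant

mutants = list(drop_phrase_mutants(intent))
# A strong test suite should "kill" (fail on) at least some of these mutants;
# mutants that survive point to utterances the tests never exercise.
```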
arXiv Detail & Related papers (2024-01-18T20:38:27Z)
- Evaluating Chatbots to Promote Users' Trust -- Practices and Open Problems [11.427175278545517]
This paper reviews current practices for testing chatbots.
It identifies gaps as open problems in pursuit of user trust.
It outlines a path forward to mitigate issues of trust related to service or product performance, user satisfaction and long-term unintended consequences for society.
arXiv Detail & Related papers (2023-09-09T22:40:30Z)
- Can a Chatbot Support Exploratory Software Testing? Preliminary Results [0.9249657468385781]
Exploratory testing is the de facto approach in agile teams.
This paper presents BotExpTest, designed to support testers while performing exploratory tests of software applications.
We implemented BotExpTest on top of the instant messaging social platform Discord.
Preliminary analyses indicate that BotExpTest may be as effective as similar approaches and help testers to uncover different bugs.
arXiv Detail & Related papers (2023-07-11T21:11:21Z)
- InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language [82.92236977726655]
The name InternGPT stands for interaction, nonverbal, and chatbots.
We present an interactive visual framework named InternGPT, or iGPT for short.
arXiv Detail & Related papers (2023-05-09T17:58:34Z)
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver? [113.22611481694825]
Large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot.
Recently, the debut of ChatGPT has drawn a great deal of attention from the natural language processing (NLP) community.
It is not yet known whether ChatGPT can serve as a generalist model that can perform many NLP tasks zero-shot.
arXiv Detail & Related papers (2023-02-08T09:44:51Z)
- Training Conversational Agents with Generative Conversational Networks [74.9941330874663]
We use Generative Conversational Networks to automatically generate data and train social conversational agents.
We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
arXiv Detail & Related papers (2021-10-15T21:46:39Z)
- CheerBots: Chatbots toward Empathy and Emotion using Reinforcement Learning [60.348822346249854]
This study presents a framework whereby several empathetic chatbots are based on understanding users' implied feelings and replying empathetically for multiple dialogue turns.
We call these chatbots CheerBots. CheerBots can be retrieval-based or generative-based and were finetuned by deep reinforcement learning.
To respond empathetically, we develop a simulating agent, a Conceptual Human Model, that aids CheerBots during training by accounting for future changes in the user's emotional state in order to arouse sympathy.
arXiv Detail & Related papers (2021-10-08T07:44:47Z)
- Put Chatbot into Its Interlocutor's Shoes: New Framework to Learn Chatbot Responding with Intention [55.77218465471519]
This paper proposes an innovative framework to train chatbots to possess human-like intentions.
Our framework included a guiding robot and an interlocutor model that plays the role of humans.
We examined our framework using three experimental setups and evaluated the guiding robot with four different metrics to demonstrate its flexibility and performance advantages.
arXiv Detail & Related papers (2021-03-30T15:24:37Z)
- If I Hear You Correctly: Building and Evaluating Interview Chatbots with Active Listening Skills [4.395837214164745]
It is challenging to build effective interview chatbots that can handle user free-text responses to open-ended questions.
We are investigating the feasibility and effectiveness of using publicly available, practical AI technologies to build effective interview chatbots.
arXiv Detail & Related papers (2020-02-05T16:52:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.