Quality Assurance of Generative Dialog Models in an Evolving
Conversational Agent Used for Swedish Language Practice
- URL: http://arxiv.org/abs/2203.15414v1
- Date: Tue, 29 Mar 2022 10:25:13 GMT
- Title: Quality Assurance of Generative Dialog Models in an Evolving
Conversational Agent Used for Swedish Language Practice
- Authors: Markus Borg and Johan Bengtsson and Harald Österling and Alexander
Hagelborn and Isabella Gagner and Piotr Tomaszewski
- Abstract summary: One proposed solution involves AI-enabled conversational agents for person-centered interactive language practice.
We present results from ongoing action research targeting quality assurance of proprietary generative dialog models trained for virtual job interviews.
- Score: 59.705062519344
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the migration megatrend, efficient and effective second-language
acquisition is vital. One proposed solution involves AI-enabled conversational
agents for person-centered interactive language practice. We present results
from ongoing action research targeting quality assurance of proprietary
generative dialog models trained for virtual job interviews. The action team
elicited a set of 38 requirements; for the 15 of particular interest to the
evolving solution, we designed corresponding automated test cases. Our results
show that six of the test case designs can detect meaningful differences
between candidate models. While quality assurance of natural language
processing applications is complex, we provide initial steps toward an
automated framework for machine learning model selection in the context of an
evolving conversational agent. Future work will focus on model selection in an
MLOps setting.
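The abstract describes selecting among candidate generative dialog models by running automated test cases against each candidate and comparing the results. A minimal sketch of that idea is below; it is not the authors' framework, and the model names, test cases, and scoring shown here are hypothetical illustrations only.

```python
# Minimal sketch: automated test-case-based selection among candidate
# dialog models. All names and checks below are hypothetical examples,
# not the requirements or tests elicited in the paper.

def response_length_ok(response: str) -> bool:
    """Test case: a reply should be neither empty nor rambling."""
    n_words = len(response.split())
    return 1 <= n_words <= 40

def stays_on_topic(response: str, keywords: set) -> bool:
    """Test case: an interview reply should mention job-related terms."""
    return any(k in response.lower() for k in keywords)

def evaluate(model, prompts, keywords) -> float:
    """Run every test case against every prompt; return the pass rate."""
    results = []
    for prompt in prompts:
        reply = model(prompt)
        results.append(response_length_ok(reply) and stays_on_topic(reply, keywords))
    return sum(results) / len(results)

def select_model(candidates, prompts, keywords) -> str:
    """Pick the candidate model with the highest test pass rate."""
    return max(candidates, key=lambda name: evaluate(candidates[name], prompts, keywords))

# Hypothetical stand-ins for two candidate generative dialog models:
candidates = {
    "model_a": lambda p: "I have five years of experience in care work.",
    "model_b": lambda p: "",  # degenerate model that returns empty replies
}
prompts = ["Tell me about your work experience."]
best = select_model(candidates, prompts, {"experience", "work", "job"})
print(best)  # model_a passes both test cases; model_b fails the length check
```

In an MLOps setting, `evaluate` would be the gate in a model-promotion pipeline: a newly trained candidate replaces the deployed model only if its pass rate is at least as high.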
Related papers
- Towards Autonomous Agents: Adaptive-planning, Reasoning, and Acting in Language Models [3.8936716676293917]
We propose a novel in-context learning algorithm for building autonomous decision-making language agents.
Our selected language agent demonstrates the ability to solve tasks in a text-based game environment.
arXiv Detail & Related papers (2024-08-12T19:18:05Z)
- An Adapter-Based Unified Model for Multiple Spoken Language Processing Tasks [3.015760169663536]
We investigate the potential of adapter-based fine-tuning in developing a unified model capable of handling multiple spoken language processing tasks.
We show that adapter-based fine-tuning enables a single encoder-decoder model to perform multiple speech processing tasks with an average improvement of 18.4%.
arXiv Detail & Related papers (2024-06-20T21:39:04Z)
- Facilitating Multi-Role and Multi-Behavior Collaboration of Large Language Models for Online Job Seeking and Recruiting [51.54907796704785]
Existing methods rely on modeling the latent semantics of resumes and job descriptions and learning a matching function between them.
Inspired by the powerful role-playing capabilities of Large Language Models (LLMs), we propose to introduce a mock interview process between LLM-played interviewers and candidates.
We propose MockLLM, a novel applicable framework that divides the person-job matching process into two modules: mock interview generation and two-sided evaluation in handshake protocol.
arXiv Detail & Related papers (2024-05-28T12:23:16Z)
- Learning Phonotactics from Linguistic Informants [54.086544221761486]
Our model iteratively selects or synthesizes a data-point according to one of a range of information-theoretic policies.
We find that the information-theoretic policies that our model uses to select items to query the informant achieve sample efficiency comparable to, or greater than, fully supervised approaches.
arXiv Detail & Related papers (2024-05-08T00:18:56Z)
- Adapting Task-Oriented Dialogue Models for Email Conversations [4.45709593827781]
In this paper, we provide an effective transfer learning framework (EMToD) that allows the latest development in dialogue models to be adapted for long-form conversations.
We show that the proposed EMToD framework improves intent detection performance over pre-trained language models by 45% and over pre-trained dialogue models by 30% for task-oriented email conversations.
arXiv Detail & Related papers (2022-08-19T16:41:34Z)
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
- Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection [11.465266718370536]
We study the task of selecting the optimal response given a user and system utterance history in retrieval-based dialog systems.
We propose utterance manipulation strategies (UMS) to address this problem.
UMS consist of several strategies (i.e., insertion, deletion, and search) that help the response selection model maintain dialog coherence.
arXiv Detail & Related papers (2020-09-10T07:39:05Z)
- Modeling Long Context for Task-Oriented Dialogue State Generation [51.044300192906995]
We propose a multi-task learning model with a simple yet effective utterance tagging technique and a bidirectional language model.
Our approaches attempt to solve the problem that the performance of the baseline significantly drops when the input dialogue context sequence is long.
In our experiments, our proposed model achieves a 7.03% relative improvement over the baseline, establishing a new state-of-the-art joint goal accuracy of 52.04% on the MultiWOZ 2.0 dataset.
arXiv Detail & Related papers (2020-04-29T11:02:25Z)
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences of its use.