Development and Validation of Engagement and Rapport Scales for Evaluating User Experience in Multimodal Dialogue Systems
- URL: http://arxiv.org/abs/2505.17075v1
- Date: Tue, 20 May 2025 05:19:28 GMT
- Title: Development and Validation of Engagement and Rapport Scales for Evaluating User Experience in Multimodal Dialogue Systems
- Authors: Fuma Kurata, Mao Saeki, Masaki Eguchi, Shungo Suzuki, Hiroaki Takatsu, Yoichi Matsuyama
- Abstract summary: The scales were designed based on theories of engagement in educational psychology, social psychology, and second language acquisition. Seventy-four Japanese learners of English completed roleplay and discussion tasks with trained human tutors and a dialogue agent.
- Score: 1.4953643992734462
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study aimed to develop and validate two scales of engagement and rapport for evaluating the quality of user experience with multimodal dialogue systems in the context of foreign language learning. The scales were designed based on theories of engagement in educational psychology, social psychology, and second language acquisition. Seventy-four Japanese learners of English completed roleplay and discussion tasks with trained human tutors and a dialogue agent. After completing each dialogic task, they responded to the scales of engagement and rapport. The validity and reliability of the scales were investigated through two analyses. We first computed Cronbach's alpha coefficients and conducted a series of confirmatory factor analyses to test the structural validity of the scales and the reliability of the designed items. We then compared engagement and rapport scores between the dialogues with human tutors and those with the dialogue agent. The results revealed that the scales succeeded in capturing differences in dialogue experience quality between the human interlocutors and the dialogue agent from multiple perspectives.
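As a concrete illustration of the reliability analysis and human-versus-agent comparison described in the abstract, the sketch below computes Cronbach's alpha for a multi-item scale and compares per-participant scale scores between the two conditions with a paired t-test. It is a minimal sketch under stated assumptions, not the authors' analysis code: the simulated Likert responses, the six-item scale size, and the choice of a paired t-test for the condition comparison are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) response matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # per-item sample variance
    total_var = items.sum(axis=1).var(ddof=1)  # variance of summed scale score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 74 learners rate a six-item engagement scale on a
# Likert scale, once after a human-tutor dialogue and once after an agent
# dialogue. Real questionnaire responses would be correlated across items,
# so these uniform draws will yield a low alpha; they only show the mechanics.
rng = np.random.default_rng(0)
human_items = rng.integers(3, 7, size=(74, 6)).astype(float)
agent_items = rng.integers(2, 6, size=(74, 6)).astype(float)

print(f"alpha, human condition: {cronbach_alpha(human_items):.3f}")
print(f"alpha, agent condition: {cronbach_alpha(agent_items):.3f}")

# Within-subject comparison of mean scale scores across the two interlocutors.
t, p = stats.ttest_rel(human_items.mean(axis=1), agent_items.mean(axis=1))
print(f"paired t-test: t = {t:.2f}, p = {p:.4f}")
```

The alpha computation follows the standard formula, and values above roughly .7-.8 are conventionally read as acceptable internal consistency; the paper's actual item sets, scale lengths, and comparison tests are not specified in this digest.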
Related papers
- Interaction Matters: An Evaluation Framework for Interactive Dialogue Assessment on English Second Language Conversations [22.56326809612278]
We present an evaluation framework for interactive dialogue assessment in the context of English as a Second Language speakers.
Our framework collects dialogue-level interactivity labels and micro-level span features.
We study how the micro-level features influence the (higher-level) interactivity quality of ESL dialogues by constructing various machine learning-based models.
arXiv Detail & Related papers (2024-07-09T00:56:59Z)
- Rapport-Driven Virtual Agent: Rapport Building Dialogue Strategy for Improving User Experience at First Meeting [3.059886686838972]
This study aims to establish human-agent rapport through small talk by using a rapport-building strategy.
We implemented this strategy in virtual agents by prompting a large language model (LLM) with these dialogue strategies.
arXiv Detail & Related papers (2024-06-14T08:47:15Z)
- Learning to Memorize Entailment and Discourse Relations for Persona-Consistent Dialogues [8.652711997920463]
Existing works have improved the performance of dialogue systems by intentionally learning interlocutor personas with sophisticated network structures.
This study proposes a method of learning to memorize entailment and discourse relations for persona-consistent dialogue tasks.
arXiv Detail & Related papers (2023-01-12T08:37:00Z)
- A Benchmark for Understanding and Generating Dialogue between Characters in Stories [75.29466820496913]
We present the first study to explore whether machines can understand and generate dialogue in stories.
We propose two new tasks including Masked Dialogue Generation and Dialogue Speaker Recognition.
We show the difficulty of the proposed tasks by testing existing models with automatic and manual evaluation on the DialStory dataset.
arXiv Detail & Related papers (2022-09-18T10:19:04Z)
- What Went Wrong? Explaining Overall Dialogue Quality through Utterance-Level Impacts [15.018259942339448]
This paper presents a novel approach to automated analysis of conversation logs that learns the relationship between user-system interactions and overall dialogue quality.
Our approach learns the impact of each interaction from the overall user rating without utterance-level annotation.
Experiments show that the automated analysis from our model agrees with expert judgments, making this the first work to show that such weakly supervised learning of utterance-level quality prediction is achievable (a minimal sketch of this setup follows the related-papers list below).
arXiv Detail & Related papers (2021-10-31T19:12:29Z)
- Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, a Structural Pre-traIned DialoguE Reader, designed to capture dialogue-exclusive features.
To simulate the dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z)
- Is this Dialogue Coherent? Learning from Dialogue Acts and Entities [82.44143808977209]
We create the Switchboard Coherence (SWBD-Coh) corpus, a dataset of human-human spoken dialogues annotated with turn coherence ratings.
Our statistical analysis of the corpus indicates how the perception of turn coherence is affected by patterns of entity distribution.
We find that models combining both dialogue act (DA) and entity information yield the best performance for both response selection and turn coherence rating.
arXiv Detail & Related papers (2020-06-17T21:02:40Z)
- Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually, reasoning over dialogue turns with the help of back-end data.
Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
arXiv Detail & Related papers (2020-05-27T02:05:33Z)
- Is Your Goal-Oriented Dialog Model Performing Really Well? Empirical Analysis of System-wise Evaluation [114.48767388174218]
This paper presents an empirical analysis of different types of dialog systems composed of different modules in different settings.
Our results show that a pipeline dialog system trained with fine-grained supervision signals at different component levels often outperforms systems that use joint or end-to-end models trained on coarse-grained labels.
arXiv Detail & Related papers (2020-05-15T05:20:06Z)
- Dialogue-Based Relation Extraction [53.2896545819799]
We present the first human-annotated dialogue-based relation extraction (RE) dataset, DialogRE.
We argue that speaker-related information plays a critical role in the proposed task, based on an analysis of similarities and differences between dialogue-based and traditional RE tasks.
Experimental results demonstrate that a speaker-aware extension on the best-performing model leads to gains in both the standard and conversational evaluation settings.
arXiv Detail & Related papers (2020-04-17T03:51:57Z)
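As referenced in the "What Went Wrong?" entry above, here is a minimal weak-supervision sketch: a linear model scores every utterance-level interaction from its features, the per-utterance impacts are summed into a predicted dialogue rating, and training uses only that dialogue-level signal. This is a generic illustration of the technique, not the paper's actual model; the feature dimensionality, synthetic data, and linear parameterization are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 8  # assumed utterance feature dimensionality (illustrative)

# Synthetic corpus: each dialogue is an (n_utterances, DIM) feature matrix.
# The only supervision is one overall rating per dialogue -- no
# utterance-level labels, mirroring the weakly-supervised setting.
dialogues = [rng.normal(size=(rng.integers(5, 15), DIM)) for _ in range(200)]
hidden_w = rng.normal(size=DIM)  # "true" impact weights, unknown to the model
ratings = np.array([(x @ hidden_w).sum() for x in dialogues])

# Model: impact of utterance i = x_i . w; dialogue rating = sum of impacts.
# Fit w by gradient descent on the dialogue-level squared error only.
w = np.zeros(DIM)
lr = 0.01
for _ in range(500):
    grad = np.zeros(DIM)
    for x, y in zip(dialogues, ratings):
        pred = (x @ w).sum()                    # predicted overall rating
        grad += 2.0 * (pred - y) * x.sum(axis=0)
    w -= lr * grad / len(dialogues)

# Although only dialogue-level ratings were observed, x @ w now yields a
# per-utterance impact estimate for any dialogue.
example = dialogues[0]
print("per-utterance impacts:", np.round(example @ w, 2))
print("predicted vs. observed rating:",
      round(float((example @ w).sum()), 2), "vs.", round(float(ratings[0]), 2))
```

The sum decomposition is what makes the weak supervision work: because the dialogue score is a sum of per-utterance terms, fitting the dialogue-level target implicitly calibrates the utterance-level contributions, in the spirit of multiple-instance learning.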