Seeing and hearing what has not been said; A multimodal client behavior
classifier in Motivational Interviewing with interpretable fusion
- URL: http://arxiv.org/abs/2309.14398v2
- Date: Wed, 27 Sep 2023 08:30:20 GMT
- Title: Seeing and hearing what has not been said; A multimodal client behavior
classifier in Motivational Interviewing with interpretable fusion
- Authors: Lucie Galland, Catherine Pelachaud and Florian Pecune
- Abstract summary: Motivational Interviewing (MI) is an approach to therapy that emphasizes collaboration and encourages behavioral change.
To evaluate the quality of an MI conversation, client utterances can be classified using the MISC code as either change talk, sustain talk, or follow/neutral talk.
The proportion of change talk in an MI conversation is positively correlated with therapy outcomes, making accurate classification of client utterances essential.
- Score: 0.8192907805418583
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Motivational Interviewing (MI) is an approach to therapy that emphasizes
collaboration and encourages behavioral change. To evaluate the quality of an
MI conversation, client utterances can be classified using the MISC code as
either change talk, sustain talk, or follow/neutral talk. The proportion of
change talk in an MI conversation is positively correlated with therapy
outcomes, making accurate classification of client utterances essential. In
this paper, we present a classifier that accurately distinguishes between the
three MISC classes (change talk, sustain talk, and follow/neutral talk)
leveraging multimodal features such as text, prosody, facial expressivity, and
body expressivity. To train our model, we perform annotations on the publicly
available AnnoMI dataset to collect multimodal information, including text,
audio, facial expressivity, and body expressivity. Furthermore, we identify the
most important modalities in the decision-making process, providing valuable
insights into the interplay of different modalities during an MI conversation.
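To make the fusion idea concrete, below is a minimal late-fusion sketch over precomputed per-utterance features from the four modalities, with softmax weights over modalities that can be read out for interpretability. The feature dimensions, module names, and attention-style weighting are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's code): late fusion over precomputed per-utterance
# features, with inspectable per-modality weights.
import torch
import torch.nn as nn

class InterpretableFusionClassifier(nn.Module):
    def __init__(self, dims, hidden=128, n_classes=3):   # change / sustain / follow-neutral
        super().__init__()
        self.modalities = list(dims)
        # One small encoder per modality, projecting into a shared space.
        self.encoders = nn.ModuleDict(
            {m: nn.Sequential(nn.Linear(d, hidden), nn.ReLU()) for m, d in dims.items()}
        )
        self.scorer = nn.Linear(hidden, 1)        # scores each modality's embedding
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, feats):                     # feats: {modality: (batch, dim) tensor}
        z = torch.stack([self.encoders[m](feats[m]) for m in self.modalities], dim=1)
        weights = torch.softmax(self.scorer(z).squeeze(-1), dim=1)   # (batch, n_modalities)
        fused = (weights.unsqueeze(-1) * z).sum(dim=1)
        return self.classifier(fused), weights    # weights expose per-modality importance

# Assumed per-utterance feature sizes: a text embedding plus prosody, facial and
# body expressivity descriptors (placeholder dimensions).
dims = {"text": 768, "prosody": 88, "face": 17, "body": 6}
feats = {m: torch.randn(4, d) for m, d in dims.items()}
logits, weights = InterpretableFusionClassifier(dims)(feats)
print(logits.shape, weights[0])   # (4, 3) class logits; which modality drove the decision
```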
Related papers
- Can LLMs Understand the Implication of Emphasized Sentences in Dialogue? [64.72966061510375]
Emphasis is a crucial component of human communication that conveys the speaker's intention and implications beyond the literal text of a dialogue.
This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis.
We evaluate various Large Language Models (LLMs), both open-source and commercial, to measure their performance in understanding emphasis.
arXiv Detail & Related papers (2024-06-16T20:41:44Z)
- M3TCM: Multi-modal Multi-task Context Model for Utterance Classification in Motivational Interviews [1.8100046713740954]
We present M3TCM, a Multi-modal, Multi-task Context Model for utterance classification.
Our approach for the first time employs multi-task learning to effectively model both joint and individual components of therapist and client behaviour.
With our novel approach, we outperform the state of the art for utterance classification on the recently introduced AnnoMI dataset, with relative improvements of 20% for client and 15% for therapist utterance classification.
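As a rough sketch of the multi-task idea (a shared encoder feeding separate client and therapist heads), the snippet below uses assumed layer sizes and label counts; it is not the M3TCM architecture.

```python
# Rough sketch (not M3TCM): a shared encoder models the joint component, two
# task-specific heads model the individual client and therapist behaviours.
import torch
import torch.nn as nn

class MultiTaskUtteranceClassifier(nn.Module):
    def __init__(self, in_dim=768, hidden=256, n_client=3, n_therapist=4):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())   # joint component
        self.client_head = nn.Linear(hidden, n_client)        # change / sustain / neutral
        self.therapist_head = nn.Linear(hidden, n_therapist)  # assumed therapist code count

    def forward(self, x):
        h = self.shared(x)
        return self.client_head(h), self.therapist_head(h)

x = torch.randn(8, 768)   # e.g. pooled utterance(+context) embeddings
client_logits, therapist_logits = MultiTaskUtteranceClassifier()(x)
# Training would sum a client loss and a therapist loss, each computed only on
# utterances spoken by the corresponding role.
print(client_logits.shape, therapist_logits.shape)
```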
arXiv Detail & Related papers (2024-04-04T09:17:22Z)
- Emotional Listener Portrait: Realistic Listener Motion Simulation in Conversation [50.35367785674921]
Listener head generation centers on generating non-verbal behaviors of a listener in reference to the information delivered by a speaker.
A significant challenge when generating such responses is the non-deterministic nature of fine-grained facial expressions during a conversation.
We propose the Emotional Listener Portrait (ELP), which treats each fine-grained facial motion as a composition of several discrete motion-codewords.
Our ELP model can not only automatically generate natural and diverse responses toward a given speaker via sampling from the learned distribution but also generate controllable responses with a predetermined attitude.
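A toy sketch of the codeword idea: a facial motion is composed from discrete codewords sampled from predicted categorical distributions. The codebook size, motion dimensionality, and sampling scheme here are placeholders, not ELP's.

```python
# Toy sketch (not the ELP model): compose a fine-grained facial motion from
# discrete codewords sampled from per-slot categorical distributions.
import torch

codebook = torch.randn(64, 15)     # 64 learned motion codewords, 15-D facial offsets (assumed sizes)
slot_logits = torch.randn(4, 64)   # placeholder for per-slot scores predicted from the speaker

probs = torch.softmax(slot_logits, dim=-1)
idx = torch.multinomial(probs, num_samples=1).squeeze(-1)   # one codeword index per slot
motion = codebook[idx].sum(dim=0)  # compose the sampled codewords into one motion frame
print(idx.tolist(), motion.shape)  # resampling idx yields diverse yet structured responses
```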
arXiv Detail & Related papers (2023-09-29T18:18:32Z)
- MPCHAT: Towards Multimodal Persona-Grounded Conversation [54.800425322314105]
We extend persona-based dialogue to the multimodal domain and make two main contributions.
First, we present the first multimodal persona-based dialogue dataset named MPCHAT.
Second, we empirically show that incorporating multimodal persona, as measured by three proposed multimodal persona-grounded dialogue tasks, leads to statistically significant performance improvements.
arXiv Detail & Related papers (2023-05-27T06:46:42Z)
- Coreference-aware Double-channel Attention Network for Multi-party Dialogue Reading Comprehension [7.353227696624305]
We tackle Multi-party Dialogue Reading Comprehension (MDRC), an extractive reading comprehension task grounded in dialogues among multiple interlocutors.
We propose a coreference-aware attention modeling method to strengthen the reasoning ability.
arXiv Detail & Related papers (2023-05-15T05:01:29Z)
- Context-Dependent Embedding Utterance Representations for Emotion Recognition in Conversations [1.8126187844654875]
We approach Emotion Recognition in Conversations leveraging the conversational context.
We propose context-dependent embedding representations of each utterance.
The effectiveness of our approach is validated on the open-domain DailyDialog dataset and on the task-oriented EmoWOZ dataset.
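One minimal way to obtain such context-dependent representations is to encode each utterance together with its preceding turns; the window size and the generic BERT encoder below are assumptions, not the paper's exact model.

```python
# Minimal sketch (assumed encoder): represent each utterance jointly with its
# k previous turns so the embedding reflects the conversational context.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")

dialogue = ["How was your week?", "Honestly, pretty rough.", "What made it rough?"]
k = 2   # number of preceding utterances kept as context (an assumption)

def contextual_embedding(i):
    window = dialogue[max(0, i - k):i] + [dialogue[i]]
    batch = tok(" [SEP] ".join(window), return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = enc(**batch).last_hidden_state
    return hidden[:, 0]   # [CLS] vector as the context-dependent utterance representation

print(contextual_embedding(2).shape)   # torch.Size([1, 768])
```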
arXiv Detail & Related papers (2023-04-17T12:37:57Z)
- Deep Learning of Segment-level Feature Representation for Speech Emotion Recognition in Conversations [9.432208348863336]
We propose a conversational speech emotion recognition method that captures attentive contextual dependencies and speaker-sensitive interactions.
First, we use a pretrained VGGish model to extract segment-based audio representation in individual utterances.
Second, an attentive bidirectional gated recurrent unit (GRU) network models context-sensitive information and jointly explores intra- and inter-speaker dependencies.
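A sketch of the second stage only, assuming the 128-D VGGish segment embeddings are already extracted; the attentive bidirectional GRU below uses assumed hidden sizes and is not the paper's implementation.

```python
# Sketch of the sequence stage: precomputed VGGish segment embeddings
# (128-D per ~0.96 s of audio, assumed) are pooled by an attentive bi-GRU.
import torch
import torch.nn as nn

class AttentiveBiGRU(nn.Module):
    def __init__(self, in_dim=128, hidden=64, n_emotions=4):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)
        self.out = nn.Linear(2 * hidden, n_emotions)

    def forward(self, segments):                 # (batch, n_segments, 128)
        h, _ = self.gru(segments)                # (batch, n_segments, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)   # attention weights over segments
        return self.out((w * h).sum(dim=1))

vggish_feats = torch.randn(2, 10, 128)           # placeholder for real VGGish output
print(AttentiveBiGRU()(vggish_feats).shape)      # torch.Size([2, 4])
```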
arXiv Detail & Related papers (2023-02-05T16:15:46Z)
- Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion [89.01668641930206]
We present a framework for modeling interactional communication in dyadic conversations.
We autoregressively output multiple possibilities of corresponding listener motion.
Our method organically captures the multimodal and non-deterministic nature of nonverbal dyadic interactions.
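A toy illustration of the autoregressive, non-deterministic sampling idea; the per-step distribution here is a hand-written placeholder for the learned network, and the motion dimensionality is arbitrary.

```python
# Toy illustration (not the paper's model): autoregressively sample several
# plausible listener-motion continuations conditioned on speaker features.
import torch

def predict_step(speaker_feat, past_motion):
    # Placeholder for a learned network: mean/std of the next listener-motion frame.
    mean = 0.1 * speaker_feat[:3] + 0.9 * past_motion
    return mean, torch.full_like(mean, 0.05)

speaker_feat = torch.randn(16)             # e.g. speaker audio+face features for this window
samples = []
for _ in range(3):                         # draw three distinct plausible listeners
    motion, frames = torch.zeros(3), []
    for _t in range(30):                   # 30 frames of, say, head pose
        mean, std = predict_step(speaker_feat, motion)
        motion = torch.normal(mean, std)   # sampling keeps the output non-deterministic
        frames.append(motion)
    samples.append(torch.stack(frames))
print(len(samples), samples[0].shape)      # 3 trajectories of shape (30, 3)
```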
arXiv Detail & Related papers (2022-04-18T17:58:04Z)
- Dialogue History Matters! Personalized Response Selection in Multi-turn Retrieval-based Chatbots [62.295373408415365]
We propose a personalized hybrid matching network (PHMN) for context-response matching.
Among other contributions, our model extracts personalized wording behaviors from user-specific dialogue history as extra matching information.
We evaluate our model on two large datasets with user identification, i.e., the personalized Ubuntu dialogue corpus (P-Ubuntu) and the personalized Weibo dataset (P-Weibo).
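A toy stand-in for the personalized-wording idea: score candidate responses by how much their wording overlaps with the user's own dialogue history, as one extra matching signal. The overlap statistic is illustrative and not the PHMN architecture.

```python
# Toy stand-in (not PHMN): use the user's dialogue history as a personalized
# wording prior when scoring candidate responses.
from collections import Counter

history = ["tbh i reckon that's fine", "reckon we should ship it tbh"]
user_vocab = Counter(word for turn in history for word in turn.split())

def personalization_score(candidate):
    words = candidate.split()
    return sum(user_vocab[w] for w in words) / max(len(words), 1)

for cand in ["i reckon that works tbh", "indeed, that is acceptable"]:
    print(cand, "->", round(personalization_score(cand), 2))
```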
arXiv Detail & Related papers (2021-03-17T09:42:11Z)
- A General Model of Conversational Dynamics and an Example Application in Serious Illness Communication [0.0]
We describe COnversational DYnamics Model (CODYM) analysis, a novel approach for studying patterns of information flow in conversations.
CODYMs are Markov Models that capture sequential dependencies in the lengths of speaker turns.
As an important first application, we demonstrate the model on transcribed conversations between palliative care clinicians and seriously ill patients.
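The core mechanism is easy to reproduce in miniature: discretize turn lengths into a few states and estimate a first-order Markov transition matrix over consecutive turns. The binning scheme below is an assumption, not the paper's.

```python
# Miniature version of the CODYM idea (binning is an assumption): turn lengths
# become discrete states, and a Markov transition matrix summarizes the dynamics.
import numpy as np

turn_lengths = [4, 23, 7, 41, 3, 18, 55, 2, 9, 30]   # words per speaker turn (toy data)

def state(n):                                         # short / medium / long
    return 0 if n < 10 else 1 if n < 30 else 2

states = [state(n) for n in turn_lengths]
counts = np.zeros((3, 3))
for a, b in zip(states, states[1:]):
    counts[a, b] += 1
transition = counts / counts.sum(axis=1, keepdims=True)  # row-normalized probabilities
print(np.round(transition, 2))
```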
arXiv Detail & Related papers (2020-10-11T04:33:03Z)
- Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue Representation Learning [50.5572111079898]
Multi-role dialogue understanding comprises a wide range of diverse tasks such as question answering, act classification, dialogue summarization, etc.
While dialogue corpora are abundantly available, labeled data for specific learning tasks can be scarce and expensive.
In this work, we investigate dialogue context representation learning with various types of unsupervised pretraining tasks.
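As a small illustration of masking-based pretraining on multi-role dialogue, the snippet builds one instance by masking a whole utterance for the model to reconstruct; the paper's actual mix of pretraining tasks differs.

```python
# Small illustration of one unsupervised pretraining instance: mask an entire
# utterance in a multi-role dialogue and keep it as the reconstruction target.
import random

dialogue = [
    ("agent", "How can I help you today?"),
    ("user", "My order never arrived."),
    ("agent", "Sorry about that, let me check the tracking."),
]

def make_masked_example(turns, seed=0):
    rng = random.Random(seed)
    i = rng.randrange(len(turns))
    masked = [(role, "[MASK]" if j == i else text) for j, (role, text) in enumerate(turns)]
    return masked, turns[i][1]   # the model must reconstruct the masked utterance

masked, target = make_masked_example(dialogue)
print(masked)
print("target:", target)
```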
arXiv Detail & Related papers (2020-02-27T04:36:52Z)