Related papers: TwIPS: A Large Language Model Powered Texting Application to Simplify Conversational Nuances for Autistic Users

URL: http://arxiv.org/abs/2407.17760v1
Date: Thu, 25 Jul 2024 04:15:54 GMT
Title: TwIPS: A Large Language Model Powered Texting Application to Simplify Conversational Nuances for Autistic Users
Authors: Rukhshan Haroon, Fahad Dogar
Abstract summary: Autistic individuals often experience difficulties in conveying and interpreting emotional tone and non-literal nuances. We present TwIPS, a prototype texting application powered by a large language model (LLM). We leverage an AI-based simulation and a conversational script to evaluate TwIPS with 8 autistic participants in an in-lab setting.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Autistic individuals often experience difficulties in conveying and interpreting emotional tone and non-literal nuances. Many also mask their communication style to avoid being misconstrued by others, spending considerable time and mental effort in the process. To address these challenges in text-based communication, we present TwIPS, a prototype texting application powered by a large language model (LLM), which can assist users with: a) deciphering tone and meaning of incoming messages, b) ensuring the emotional tone of their message is in line with their intent, and c) coming up with alternate phrasing for messages that could be misconstrued and received negatively by others. We leverage an AI-based simulation and a conversational script to evaluate TwIPS with 8 autistic participants in an in-lab setting. Our findings show TwIPS enables a convenient way for participants to seek clarifications, provides a better alternative to tone indicators, and facilitates constructive reflection on writing technique and style. We also examine how autistic users utilize language for self-expression and interpretation in instant messaging, and gather feedback for enhancing our prototype. We conclude with a discussion around balancing user-autonomy with AI-mediation, establishing appropriate trust levels in AI systems, and customization needs of autistic users in the context of AI-assisted communication.

Related papers

Gesture-Aware Zero-Shot Speech Recognition for Patients with Language Disorders [10.664605070306417]
We propose a gesture-aware Automatic Speech Recognition (ASR) system with zero-shot learning for individuals with speech impairments. Experiment results and analyses show that including gesture information significantly enhances semantic understanding.
arXiv Detail & Related papers (2025-02-18T14:15:55Z)
Leveraging Chain of Thought towards Empathetic Spoken Dialogue without Corresponding Question-Answering Data [33.85748258158527]
Empathetic dialogue is crucial for natural human-computer interaction. Large language models (LLMs) have revolutionized dialogue generation by harnessing their powerful capabilities. We propose a novel approach that circumvents the need for question-answering data.
arXiv Detail & Related papers (2025-01-19T04:10:53Z)
Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations [58.65755268815283]
Many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion. We use this fact to rewrite and augment existing suboptimal data, and train via offline reinforcement learning (RL) an agent that outperforms both prompting and learning from unaltered human demonstrations. Our results in a user study with real humans show that our approach greatly outperforms existing state-of-the-art dialogue agents.
arXiv Detail & Related papers (2024-11-07T21:37:51Z)
Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems [55.99999020778169]
We study a function that can predict the forthcoming words and estimate the time remaining until the end of an utterance. We develop a cross-attention-based algorithm that incorporates both acoustic and linguistic information. Results demonstrate the proposed model's ability to predict upcoming words and estimate future EOU events up to 300ms prior to the actual EOU.
arXiv Detail & Related papers (2024-09-30T06:29:58Z)
Toward a Dialogue System Using a Large Language Model to Recognize User Emotions with a Camera [0.0]
Methods for AI agents to recognize emotions from the user's facial expressions have not been studied. We examined whether or not LLM-based AI agents can interact with users according to their emotional states by capturing the user in dialogue with a camera. Results confirmed that AI agents can adapt the conversation to the user's emotional state for emotional states that receive relatively high scores, such as Happy and Angry.
arXiv Detail & Related papers (2024-08-15T07:03:00Z)
WordDecipher: Enhancing Digital Workspace Communication with Explainable AI for Non-native English Speakers [11.242099987201573]
Non-native English speakers (NNES) face challenges in digital workspace communication. Current AI-assisted writing tools are equipped with fluency enhancement and rewriting suggestions. We propose WordDecipher, an explainable AI-assisted writing tool to enhance digital workspace communication.
arXiv Detail & Related papers (2024-04-10T13:40:29Z)
Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue [71.15186328127409]
Paralinguistics-enhanced Generative Pretrained Transformer (ParalinGPT) Model takes the conversational context of text, speech embeddings, and paralinguistic attributes as input prompts within a serialized multitasking framework. We utilize the Switchboard-1 corpus, including its sentiment labels as the paralinguistic attribute, as our spoken dialogue dataset.
arXiv Detail & Related papers (2023-12-23T18:14:56Z)
Utilizing Speech Emotion Recognition and Recommender Systems for Negative Emotion Handling in Therapy Chatbots [0.0]
This paper proposes an approach to enhance therapy chatbots with auditory perception, enabling them to understand users' feelings and provide human-like empathy. The proposed method incorporates speech emotion recognition (SER) techniques using CNN models and the ShEMO dataset. To provide a more immersive and empathetic user experience, a text-to-speech model called GlowTTS is integrated.
arXiv Detail & Related papers (2023-11-18T16:35:55Z)
A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech [94.64927912924087]
We train TTS systems using real-world speech from YouTube and podcasts. Recent Text-to-Speech architecture is designed for multiple code generation and monotonic alignment. We show that Recent Text-to-Speech architecture outperforms existing TTS systems in several objective and subjective measures.
arXiv Detail & Related papers (2023-02-08T17:34:32Z)
Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities. We evaluate the effectiveness of our proposed multimodal approach on the interactive emotional dyadic motion capture dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
PicTalky: Augmentative and Alternative Communication Software for Language Developmental Disabilities [2.2944351895226953]
Augmentative and alternative communication (AAC) is a practical means of communication for people with language disabilities. We propose PicTalky, which is an AI-based AAC system that helps children with language developmental disabilities to improve their communication skills and language comprehension abilities.
arXiv Detail & Related papers (2021-09-27T10:46:14Z)
Hierarchical Summarization for Longform Spoken Dialog [1.995792341399967]
Despite the pervasiveness of spoken dialog, automated speech understanding and quality information extraction remain markedly poor. Compared to understanding text, auditory communication poses many additional challenges such as speaker disfluencies, informal prose styles, and lack of structure. We propose a two-stage ASR and text summarization pipeline, along with a set of semantic segmentation and merging algorithms, to resolve these speech modeling challenges.
arXiv Detail & Related papers (2021-08-21T23:31:31Z)
An Attribute-Aligned Strategy for Learning Speech Representation [57.891727280493015]
We propose an attribute-aligned learning strategy to derive speech representation that can flexibly address these issues through an attribute-selection mechanism. Specifically, we propose a layered-representation variational autoencoder (LR-VAE), which factorizes speech representation into attribute-sensitive nodes. Our proposed method achieves competitive performance on identity-free SER and better performance on emotionless SV.
arXiv Detail & Related papers (2021-06-05T06:19:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.