Using Large Language Models to Accelerate Communication for Users with
Severe Motor Impairments
- URL: http://arxiv.org/abs/2312.01532v1
- Date: Sun, 3 Dec 2023 23:12:49 GMT
- Title: Using Large Language Models to Accelerate Communication for Users with
Severe Motor Impairments
- Authors: Shanqing Cai, Subhashini Venugopalan, Katie Seaver, Xiang Xiao, Katrin
Tomanek, Sri Jalasutram, Meredith Ringel Morris, Shaun Kane, Ajit Narayanan,
Robert L. MacDonald, Emily Kornman, Daniel Vance, Blair Casey, Steve M.
Gleason, Philip Q. Nelson, Michael P. Brenner
- Abstract summary: We present SpeakFaster, consisting of large language models (LLMs) and a co-designed user interface for text entry in a highly-abbreviated form.
Pilot study with 19 non-AAC participants typing on a mobile device by hand demonstrated gains in motor savings in line with the offline simulation.
Lab and field testing on two eye-gaze typing users with amyotrophic lateral sclerosis (ALS) demonstrated text-entry rates 29-60% faster than traditional baselines.
- Score: 17.715162857028595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Finding ways to accelerate text input for individuals with profound motor
impairments has been a long-standing area of research. Closing the speed gap
for augmentative and alternative communication (AAC) devices such as
eye-tracking keyboards is important for improving the quality of life for such
individuals. Recent advances in neural networks for natural language pose new
opportunities for rethinking strategies and user interfaces for enhanced text
entry for AAC users. In this paper, we present SpeakFaster, which combines
large language models (LLMs) with a co-designed user interface for text entry
in a highly abbreviated form, saving 57% more motor actions than traditional
predictive keyboards in offline simulation. A pilot study with 19
non-AAC participants typing on a mobile device by hand demonstrated gains in
motor savings in line with the offline simulation, while introducing relatively
small effects on overall typing speed. Lab and field testing on two eye-gaze
typing users with amyotrophic lateral sclerosis (ALS) demonstrated text-entry
rates 29-60% faster than traditional baselines, due to significant savings of
expensive keystrokes achieved through phrase and word predictions from
context-aware LLMs. These findings provide a strong foundation for further
exploration of substantially-accelerated text communication for motor-impaired
users and demonstrate a direction for applying LLMs to text-based user
interfaces.
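The listing does not include source code; the following Python sketch is a rough, hypothetical illustration of the core idea described in the abstract, context-aware abbreviation expansion: the user types only the initial letters of each intended word and an LLM proposes full-phrase expansions conditioned on the conversation so far. The llm_complete callable, the prompt wording, and the filtering heuristic are illustrative assumptions, not the authors' implementation.

    # Rough illustrative sketch (not the authors' code) of context-aware
    # abbreviation expansion: the user types only the initial letters of each
    # intended word (e.g. "iwtg" for "i want to go") and an LLM proposes
    # full-phrase expansions conditioned on the conversation context.
    # `llm_complete` is a hypothetical callable wrapping whatever LLM is used.

    from typing import Callable, List


    def expand_abbreviation(
        abbreviation: str,
        conversation_context: List[str],
        llm_complete: Callable[[str], str],
        num_options: int = 5,
    ) -> List[str]:
        """Ask an LLM for candidate phrase expansions of a word-initial abbreviation."""
        prompt = (
            "Conversation so far:\n"
            + "\n".join(conversation_context)
            + f"\n\nThe next speaker typed the abbreviation '{abbreviation}', where "
            "each letter is the first letter of one word. "
            f"List {num_options} likely full phrases, one per line."
        )
        reply = llm_complete(prompt)
        # Keep only candidates whose word-initial letters actually match the abbreviation.
        candidates = [line.lstrip("0123456789.- ").strip() for line in reply.splitlines()]
        return [
            c for c in candidates
            if c and "".join(w[0].lower() for w in c.split()) == abbreviation.lower()
        ][:num_options]


    # Hypothetical usage: expand "iwtg" given the partner's last utterance.
    # options = expand_abbreviation("iwtg", ["Partner: Are you coming to the park?"], my_llm)

In this framing, the motor savings come from the user entering one keystroke per word rather than one per character, with the LLM and conversational context resolving the ambiguity.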
Related papers
- Exploring Mobile Touch Interaction with Large Language Models [26.599610206222142]
We propose to control Large Language Models via touch gestures performed directly on the text.
Results demonstrate that touch-based control of LLMs is both feasible and user-friendly.
This work lays the foundation for further research into gesture-based interaction with LLMs on touch devices.
arXiv Detail & Related papers (2025-02-11T15:17:00Z)
- Efficient Driving Behavior Narration and Reasoning on Edge Device Using Large Language Models [16.532357621144342]
Large language models (LLMs) can describe driving scenes and behaviors with a level of accuracy similar to human perception.
We propose a driving behavior narration and reasoning framework that applies LLMs to edge devices.
Our experiments show that LLMs deployed on edge devices can achieve satisfactory response speeds.
arXiv Detail & Related papers (2024-09-30T15:03:55Z)
- Enabling Real-Time Conversations with Minimal Training Costs [61.80370154101649]
This paper presents a new duplex decoding approach that enhances large language models with duplex ability, requiring minimal training.
Experimental results indicate that our proposed method significantly enhances the naturalness and human-likeness of user-AI interactions with minimal training costs.
arXiv Detail & Related papers (2024-09-18T06:27:26Z)
- Modulating Language Model Experiences through Frictions [56.17593192325438]
Over-consumption of language model outputs risks propagating unchecked errors in the short-term and damaging human capabilities for critical thinking in the long-term.
We propose selective frictions for language model experiences, inspired by behavioral science interventions, to dampen misuse.
arXiv Detail & Related papers (2024-06-24T16:31:11Z)
- Learning Generalizable Human Motion Generator with Reinforcement Learning [95.62084727984808]
Text-driven human motion generation is one of the vital tasks in computer-aided content creation.
Existing methods often overfit specific motion expressions in the training data, hindering their ability to generalize.
We present InstructMotion, which incorporates the trial-and-error paradigm of reinforcement learning for generalizable human motion generation.
arXiv Detail & Related papers (2024-05-24T13:29:12Z)
- Embedded Named Entity Recognition using Probing Classifiers [10.573861741540853]
EMBER enables streaming named entity recognition in decoder-only language models without fine-tuning them.
We show that EMBER maintains high token generation rates, with only a negligible decrease in speed of around 1%.
We make our code and data available online, including a toolkit for training, testing, and deploying efficient token classification models.
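(A minimal, hypothetical sketch of this probing-classifier idea appears after the related-papers list below.)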
arXiv Detail & Related papers (2024-03-18T12:58:16Z)
- TLControl: Trajectory and Language Control for Human Motion Synthesis [68.09806223962323]
We present TLControl, a novel method for realistic human motion synthesis.
It incorporates both low-level Trajectory and high-level Language semantics controls.
It is practical for interactive and high-quality animation generation.
arXiv Detail & Related papers (2023-11-28T18:54:16Z)
- Dialogue-based generation of self-driving simulation scenarios using Large Language Models [14.86435467709869]
Simulation is an invaluable tool for developing and evaluating controllers for self-driving cars.
Current simulation frameworks are driven by highly-specialist domain specific languages.
There is often a gap between a concise English utterance and the executable code that captures the user's intent.
arXiv Detail & Related papers (2023-10-26T13:07:01Z)
- Typing on Any Surface: A Deep Learning-based Method for Real-Time Keystroke Detection in Augmented Reality [4.857109990499532]
Mid-air keyboard interfaces, wireless keyboards, and voice input either suffer from poor ergonomic design and limited accuracy, or are simply embarrassing to use in public.
This paper proposes and validates a deep-learning-based approach that enables AR applications to accurately predict keystrokes from the user-perspective RGB video stream.
A two-stage model, combining an off-the-shelf hand landmark extractor and a novel adaptive Convolutional Recurrent Neural Network (C-RNN), was trained.
arXiv Detail & Related papers (2023-08-31T23:58:25Z)
- Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding [78.71529237748018]
Grounding temporal video segments described in natural language queries effectively and efficiently is a crucial capability needed in vision-and-language fields.
Most existing approaches adopt elaborately designed cross-modal interaction modules to improve the grounding performance.
We propose a commonsense-aware cross-modal alignment framework, which incorporates commonsense-guided visual and text representations into a complementary common space.
arXiv Detail & Related papers (2022-04-04T13:07:05Z)
- X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback [83.95599156217945]
We focus on assistive typing applications in which a user cannot operate a keyboard, but can supply other inputs.
Standard methods train a model on a fixed dataset of user inputs, then deploy a static interface that does not learn from its mistakes.
We investigate a simple idea that would enable such interfaces to improve over time, with minimal additional effort from the user.
arXiv Detail & Related papers (2022-03-04T00:07:20Z)
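The EMBER entry above describes probing classifiers for streaming named entity recognition in decoder-only language models without fine-tuning them. The sketch below is a rough, hypothetical illustration of that general probing idea, not the EMBER implementation: a small linear probe maps each newly generated token's frozen hidden state to an entity tag. The tag set, dimensions, and choice of layer are illustrative assumptions.

    # Illustrative sketch only (not the EMBER code): a linear probe over the
    # frozen hidden states of a decoder-only language model, applied to each
    # newly generated token so entity tags can be produced in a streaming
    # fashion without fine-tuning the underlying model.

    import torch
    import torch.nn as nn

    ENTITY_TAGS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]  # illustrative tag set


    class TokenProbe(nn.Module):
        """Linear classifier over per-token hidden states of a frozen LM."""

        def __init__(self, hidden_dim: int, num_tags: int = len(ENTITY_TAGS)):
            super().__init__()
            self.classifier = nn.Linear(hidden_dim, num_tags)

        def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
            # hidden_states: (seq_len, hidden_dim) taken from one layer of the LM
            return self.classifier(hidden_states)  # (seq_len, num_tags) logits


    def tag_new_token(probe: TokenProbe, new_hidden_state: torch.Tensor) -> str:
        """Classify only the most recently generated token's hidden state."""
        with torch.no_grad():
            logits = probe(new_hidden_state.unsqueeze(0))
        return ENTITY_TAGS[int(logits.argmax(dim=-1))]

Because only the probe's parameters are trained, the language model's generation speed is essentially unchanged, which matches the roughly 1% slowdown reported in the entry above.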
This list is automatically generated from the titles and abstracts of the papers on this site.