Probing Language Models' Gesture Understanding for Enhanced Human-AI
Interaction
- URL: http://arxiv.org/abs/2401.17858v1
- Date: Wed, 31 Jan 2024 14:19:03 GMT
- Title: Probing Language Models' Gesture Understanding for Enhanced Human-AI
Interaction
- Authors: Philipp Wicke
- Abstract summary: This project aims to investigate the interaction between Large Language Models and non-verbal communication, specifically focusing on gestures.
The proposal sets out a plan to examine the proficiency of LLMs in deciphering both explicit and implicit non-verbal cues within textual prompts.
To assess LLMs' comprehension of gestures, experiments are planned, evaluating their ability to simulate human behaviour in order to replicate psycholinguistic experiments.
- Score: 6.216023343793143
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rise of Large Language Models (LLMs) has affected various disciplines
that got beyond mere text generation. Going beyond their textual nature, this
project proposal aims to investigate the interaction between LLMs and
non-verbal communication, specifically focusing on gestures. The proposal sets
out a plan to examine the proficiency of LLMs in deciphering both explicit and
implicit non-verbal cues within textual prompts and their ability to associate
these gestures with various contextual factors. The research proposes to test
established psycholinguistic study designs to construct a comprehensive dataset
that pairs textual prompts with detailed gesture descriptions, encompassing
diverse regional variations, and semantic labels. To assess LLMs' comprehension
of gestures, experiments are planned, evaluating their ability to simulate
human behaviour in order to replicate psycholinguistic experiments. These
experiments consider cultural dimensions and measure the agreement between
LLM-identified gestures and the dataset, shedding light on the models'
contextual interpretation of non-verbal cues (e.g. gestures).
Related papers
- Investigating Expert-in-the-Loop LLM Discourse Patterns for Ancient Intertextual Analysis [0.0]
The study demonstrates that large language models can detect direct quotations, allusions, and echoes between texts.
The model struggles with long query passages and the inclusion of false intertextual dependences.
The expert-in-the-loop methodology presented offers a scalable approach for intertextual research.
arXiv Detail & Related papers (2024-09-03T13:23:11Z) - LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments [70.91258869156353]
We introduce LangSuitE, a versatile and simulation-free testbed featuring 6 representative embodied tasks in textual embodied worlds.
Compared with previous LLM-based testbeds, LangSuitE offers adaptability to diverse environments without multiple simulation engines.
We devise a novel chain-of-thought (CoT) schema, EmMem, which summarizes embodied states w.r.t. history information.
arXiv Detail & Related papers (2024-06-24T03:36:29Z) - Think from Words(TFW): Initiating Human-Like Cognition in Large Language
Models Through Think from Words for Japanese Text-level Classification [0.0]
"Think from Words" (TFW) initiates the comprehension process at the word level and then extends it to encompass the entire text.
"TFW with Extra word-level information" (TFW Extra) augmenting comprehension with additional word-level data.
Our findings shed light on the impact of various word-level information types on LLMs' text comprehension.
arXiv Detail & Related papers (2023-12-06T12:34:46Z) - Addressing the Blind Spots in Spoken Language Processing [4.626189039960495]
We argue that understanding human communication requires a more holistic approach that goes beyond textual or spoken words to include non-verbal elements.
We propose the development of universal automatic gesture segmentation and transcription models to transcribe these non-verbal cues into textual form.
arXiv Detail & Related papers (2023-09-06T10:29:25Z) - AI Text-to-Behavior: A Study In Steerability [0.0]
The research explores the steerability of Large Language Models (LLMs)
We quantitatively gauged the model's responsiveness to tailored prompts using a behavioral psychology framework called OCEAN.
Our findings underscore GPT's versatility and ability to discern and adapt to nuanced instructions.
arXiv Detail & Related papers (2023-08-07T18:14:24Z) - BabySLM: language-acquisition-friendly benchmark of self-supervised
spoken language models [56.93604813379634]
Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.
We propose a language-acquisition-friendly benchmark to probe spoken language models at the lexical and syntactic levels.
We highlight two exciting challenges that need to be addressed for further progress: bridging the gap between text and speech and between clean speech and in-the-wild speech.
arXiv Detail & Related papers (2023-06-02T12:54:38Z) - Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial.
We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments.
The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z) - Natural Language Decompositions of Implicit Content Enable Better Text
Representations [56.85319224208865]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account.
We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed.
Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z) - An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z) - Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with
Controllable Perturbations [2.041108289731398]
Recent research has adopted a new experimental field centered around the concept of text perturbations.
Recent research has revealed that shuffled word order has little to no impact on the downstream performance of Transformer-based language models.
arXiv Detail & Related papers (2021-09-28T20:15:29Z) - SPLAT: Speech-Language Joint Pre-Training for Spoken Language
Understanding [61.02342238771685]
Spoken language understanding requires a model to analyze input acoustic signal to understand its linguistic content and make predictions.
Various pre-training methods have been proposed to learn rich representations from large-scale unannotated speech and text.
We propose a novel semi-supervised learning framework, SPLAT, to jointly pre-train the speech and language modules.
arXiv Detail & Related papers (2020-10-05T19:29:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.