Addressing the Blind Spots in Spoken Language Processing
- URL: http://arxiv.org/abs/2309.06572v1
- Date: Wed, 6 Sep 2023 10:29:25 GMT
- Title: Addressing the Blind Spots in Spoken Language Processing
- Authors: Amit Moryossef
- Abstract summary: We argue that understanding human communication requires a more holistic approach that goes beyond textual or spoken words to include non-verbal elements.
We propose the development of universal automatic gesture segmentation and transcription models to transcribe these non-verbal cues into textual form.
- Score: 4.626189039960495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper explores the critical but often overlooked role of non-verbal
cues, including co-speech gestures and facial expressions, in human
communication and their implications for Natural Language Processing (NLP). We
argue that understanding human communication requires a more holistic approach
that goes beyond textual or spoken words to include non-verbal elements.
Borrowing from advances in sign language processing, we propose the development
of universal automatic gesture segmentation and transcription models to
transcribe these non-verbal cues into textual form. Such a methodology aims to
bridge the blind spots in spoken language understanding, enhancing the scope
and applicability of NLP models. Through motivating examples, we demonstrate
the limitations of relying solely on text-based models. We propose a
computationally efficient and flexible approach for incorporating non-verbal
cues, which can seamlessly integrate with existing NLP pipelines. We conclude
by calling upon the research community to contribute to the development of
universal transcription methods and to validate their effectiveness in
capturing the complexities of real-world, multi-modal interactions.
Related papers
- Nonverbal Interaction Detection [83.40522919429337]
This work addresses a new challenge of understanding human nonverbal interaction in social contexts.
We contribute a novel large-scale dataset, called NVI, which is meticulously annotated to include bounding boxes for humans and corresponding social groups.
Second, we establish a new task NVI-DET for nonverbal interaction detection, which is formalized as identifying triplets in the form individual, group, interaction> from images.
Third, we propose a nonverbal interaction detection hypergraph (NVI-DEHR), a new approach that explicitly models high-order nonverbal interactions using hypergraphs.
arXiv Detail & Related papers (2024-07-11T02:14:06Z) - Can LLMs Understand the Implication of Emphasized Sentences in Dialogue? [64.72966061510375]
Emphasis is a crucial component in human communication, which indicates the speaker's intention and implication beyond pure text in dialogue.
This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis.
We evaluate various Large Language Models (LLMs), both open-source and commercial, to measure their performance in understanding emphasis.
arXiv Detail & Related papers (2024-06-16T20:41:44Z) - Interpretability of Language Models via Task Spaces [14.543168558734001]
We present an alternative approach to interpret language models (LMs)
We focus on the quality of LM processing, with a focus on their language abilities.
We construct 'linguistic task spaces' that shed light on the connections LMs draw between language phenomena.
arXiv Detail & Related papers (2024-06-10T16:34:30Z) - Interactive Natural Language Processing [67.87925315773924]
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP.
This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept.
arXiv Detail & Related papers (2023-05-22T17:18:29Z) - On the Role of Emergent Communication for Social Learning in Multi-Agent
Reinforcement Learning [0.0]
Social learning uses cues from experts to align heterogeneous policies, reduce sample complexity, and solve partially observable tasks.
This paper proposes an unsupervised method based on the information bottleneck to capture both referential complexity and task-specific utility.
arXiv Detail & Related papers (2023-02-28T03:23:27Z) - Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs.
We employ domain-adaptive training strategies to help the model adapt to the dialogue domains.
Experimental results show that our method substantially boosts the strong PrLM baselines in four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z) - An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z) - Color Overmodification Emerges from Data-Driven Learning and Pragmatic
Reasoning [53.088796874029974]
We show that speakers' referential expressions depart from communicative ideals in ways that help illuminate the nature of pragmatic language use.
By adopting neural networks as learning agents, we show that overmodification is more likely with environmental features that are infrequent or salient.
arXiv Detail & Related papers (2022-05-18T18:42:43Z) - Bridging between Cognitive Processing Signals and Linguistic Features
via a Unified Attentional Network [25.235060468310696]
We propose a data-driven method to investigate the relationship between cognitive processing signals and linguistic features.
We present a unified attentional framework that is composed of embedding, attention, encoding and predicting layers.
The proposed framework can be used to detect a wide range of linguistic features with a single cognitive dataset.
arXiv Detail & Related papers (2021-12-16T12:25:11Z) - Towards Transparent Interactive Semantic Parsing via Step-by-Step
Correction [17.000283696243564]
We investigate an interactive semantic parsing framework that explains the predicted logical form step by step in natural language.
We focus on question answering over knowledge bases (KBQA) as an instantiation of our framework.
Our experiments show that the interactive framework with human feedback has the potential to greatly improve overall parse accuracy.
arXiv Detail & Related papers (2021-10-15T20:11:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.