The time scale of redundancy between prosody and linguistic context
- URL: http://arxiv.org/abs/2503.11630v2
- Date: Tue, 18 Mar 2025 20:01:03 GMT
- Title: The time scale of redundancy between prosody and linguistic context
- Authors: Tamar I. Regev, Chiebuka Ohams, Shaylee Xie, Lukas Wolf, Evelina Fedorenko, Alex Warstadt, Ethan G. Wilcox, Tiago Pimentel
- Abstract summary: Previous studies have shown that prosodic features exhibit significant redundancy with both past and future words. We find that this scale differs for past and future words. Our results highlight the role of prosody in shaping efficient communication.
- Score: 22.04241078302997
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In spoken language, speakers transmit information not only using words, but also via a rich array of non-verbal signals, which include prosody -- the auditory features of speech. However, previous studies have shown that prosodic features exhibit significant redundancy with both past and future words. Here, we examine the time scale of this relationship: How many words in the past (or future) contribute to predicting prosody? We find that this scale differs for past and future words. Prosody's redundancy with past words extends across approximately 3-8 words, whereas redundancy with future words is limited to just 1-2 words. These findings indicate that the prosody-future relationship reflects local word dependencies or short-scale processes such as next word prediction, while the prosody-past relationship unfolds over a longer time scale. The latter suggests that prosody serves to emphasize earlier information that may be challenging for listeners to process given limited cognitive resources in real-time communication. Our results highlight the role of prosody in shaping efficient communication.
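To make the window-size question concrete, the sketch below shows one way such an analysis could be set up: fit a predictor of a prosodic feature from the embeddings of the k preceding (or following) words and compare held-out performance across k. This is a minimal illustration, not the authors' pipeline; the synthetic data, the ridge-regression proxy, and the built-in past-only dependence are all assumptions made for the example.

```python
# Minimal sketch (assumed setup, not the paper's pipeline): probe how well a
# k-word context window predicts a prosodic feature, sweeping k separately
# for past and future words. Synthetic data stands in for real embeddings
# and prosodic measurements.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_words, emb_dim = 5000, 32

# Stand-in word embeddings and a prosodic feature (e.g., duration or f0)
# that, by construction, depends only on a few *preceding* words -- a toy
# analogue of the asymmetry reported in the abstract.
emb = rng.normal(size=(n_words, emb_dim))
weights = rng.normal(size=emb_dim)
prosody = rng.normal(scale=1.0, size=n_words)  # unexplained variability
for lag, strength in enumerate([0.8, 0.5, 0.3, 0.15], start=1):
    prosody[lag:] += strength * (emb[:-lag] @ weights)

def window_features(embeddings, k, direction):
    """Concatenate embeddings of the k previous (or next) words for each word."""
    shift = 1 if direction == "past" else -1
    return np.concatenate(
        [np.roll(embeddings, shift * lag, axis=0) for lag in range(1, k + 1)],
        axis=1,
    )

for direction in ("past", "future"):
    print(direction)
    for k in (1, 2, 4, 8):
        X = window_features(emb, k, direction)
        # Cross-validated R^2 as a rough proxy for how much prosodic variance
        # the k-word window accounts for.
        r2 = cross_val_score(Ridge(alpha=1.0), X, prosody, cv=5, scoring="r2").mean()
        print(f"  k={k:>2}: R^2 = {r2:.3f}")
```

In the actual study the context representations and information estimates come from large language models applied to real speech corpora; the sketch only conveys the logic of sweeping the context-window size separately for past and future words.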
Related papers
- Speech perception: a model of word recognition [0.0]
We present a model of speech perception which takes into account effects of correlations between sounds.
Words in this model correspond to the attractors of a suitably chosen descent dynamics.
We examine the decryption of short and long words in the presence of mishearings.
arXiv Detail & Related papers (2024-10-24T09:41:47Z)
- Quantifying the redundancy between prosody and text [67.07817268372743]
We use large language models to estimate how much information is redundant between prosody and the words themselves.
We find a high degree of redundancy between the information carried by the words and prosodic information across several prosodic features.
Still, we observe that prosodic features cannot be fully predicted from text, suggesting that prosody carries information above and beyond the words (a minimal surprisal sketch appears after this list).
arXiv Detail & Related papers (2023-11-28T21:15:24Z)
- Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study [68.88536866933038]
Speech signals, typically sampled at rates in the tens of thousands per second, contain redundancies.
Recent investigations proposed the use of discrete speech units derived from self-supervised learning representations.
Applying various methods, such as de-duplication and subword modeling, can further compress the speech sequence length.
arXiv Detail & Related papers (2023-09-27T17:21:13Z)
- Neighboring Words Affect Human Interpretation of Saliency Explanations [65.29015910991261]
Word-level saliency explanations are often used to communicate feature-attribution in text-based models.
Recent studies found that superficial factors such as word length can distort human interpretation of the communicated saliency scores.
We investigate how the marking of a word's neighboring words affects the explainee's perception of the word's importance in the context of a saliency explanation.
arXiv Detail & Related papers (2023-05-04T09:50:25Z)
- Short-Term Word-Learning in a Dynamically Changing Environment [63.025297637716534]
We show how to supplement an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.
We demonstrate significant improvements in the detection rate of new words with only a minor increase in false alarms.
arXiv Detail & Related papers (2022-03-29T10:05:39Z)
- Disambiguatory Signals are Stronger in Word-initial Positions [48.18148856974974]
We point out the confounds in existing methods for comparing the informativeness of segments early in the word versus later in the word.
We find evidence across hundreds of languages that indeed there is a cross-linguistic tendency to front-load information in words.
arXiv Detail & Related papers (2021-02-03T18:19:16Z)
- Speakers Fill Lexical Semantic Gaps with Context [65.08205006886591]
We operationalise the lexical ambiguity of a word as the entropy of meanings it can take.
We find significant correlations between our estimate of ambiguity and the number of synonyms a word has in WordNet.
This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.
arXiv Detail & Related papers (2020-10-05T17:19:10Z)
- Prosody leaks into the memories of words [2.309770674164469]
The average predictability (aka informativity) of a word in context has been shown to condition word duration.
This study extends past work in two directions; it investigates informativity effects in another large language, Mandarin Chinese.
Results indicate that words with low informativity have shorter durations, replicating the effect found in English.
arXiv Detail & Related papers (2020-05-29T17:58:33Z)
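Both the main abstract and the "Quantifying the redundancy between prosody and text" entry above rely on language-model probability estimates to quantify how predictable words are from their context. The sketch below, referenced from that entry, shows the kind of per-word surprisal computation such analyses typically start from. It is a hedged illustration: GPT-2 via Hugging Face transformers and the example sentence are stand-ins, not the cited papers' exact models or data.

```python
# Illustrative sketch (assumed setup): per-token surprisal from a small causal
# language model -- the kind of contextual-predictability estimate that
# LLM-based prosody/redundancy analyses build on.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

sentence = "Prosody can emphasize earlier words that are hard to process."
enc = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    logits = model(**enc).logits  # (1, seq_len, vocab_size)

# Surprisal of token t given the preceding tokens: -log2 p(token_t | context).
log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
targets = enc["input_ids"][0, 1:]
nats = -log_probs[torch.arange(targets.size(0)), targets]
bits = nats / torch.log(torch.tensor(2.0))

for tok, s in zip(targets.tolist(), bits.tolist()):
    print(f"{tokenizer.decode(tok)!r:>14}  {s:6.2f} bits")
```

A full redundancy analysis would relate estimates like these to measured prosodic features (e.g., pitch, energy, duration) across past and future context windows, rather than printing them for a single sentence.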
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.