Quantifying the redundancy between prosody and text
- URL: http://arxiv.org/abs/2311.17233v1
- Date: Tue, 28 Nov 2023 21:15:24 GMT
- Title: Quantifying the redundancy between prosody and text
- Authors: Lukas Wolf, Tiago Pimentel, Evelina Fedorenko, Ryan Cotterell, Alex Warstadt, Ethan Wilcox, Tamar Regev
- Abstract summary: We use large language models to estimate how much information is redundant between prosody and the words themselves.
We find a high degree of redundancy between the information carried by the words and prosodic information across several prosodic features.
Still, we observe that prosodic features cannot be fully predicted from text, suggesting that prosody carries information above and beyond the words.
- Score: 67.07817268372743
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prosody -- the suprasegmental component of speech, including pitch, loudness,
and tempo -- carries critical aspects of meaning. However, the relationship
between the information conveyed by prosody vs. by the words themselves remains
poorly understood. We use large language models (LLMs) to estimate how much
information is redundant between prosody and the words themselves. Using a
large spoken corpus of English audiobooks, we extract prosodic features aligned
to individual words and test how well they can be predicted from LLM
embeddings, compared to non-contextual word embeddings. We find a high degree
of redundancy between the information carried by the words and prosodic
information across several prosodic features, including intensity, duration,
pauses, and pitch contours. Furthermore, a word's prosodic information is
redundant with both the word itself and the context preceding as well as
following it. Still, we observe that prosodic features cannot be fully
predicted from text, suggesting that prosody carries information above and
beyond the words. Along with this paper, we release a general-purpose data
processing pipeline for quantifying the relationship between linguistic
information and extra-linguistic features.
Related papers
- Information Theory of Meaningful Communication [0.0]
In Shannon's seminal paper, the entropy of printed English, treated as a stationary process, was estimated to be roughly 1 bit per character.
In this study, we show that one can leverage recently developed large language models to quantify information communicated in meaningful narratives in terms of bits of meaning per clause (a minimal sketch follows this list).
arXiv Detail & Related papers (2024-11-19T18:51:23Z)
- Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information [68.89000132126536]
This work proposes to use inter-utterance linguistic information to improve the performance of prosodic structure prediction (PSP).
Our method achieves better F1 scores in predicting prosodic word (PW), prosodic phrase (PPH), and intonational phrase (IPH).
arXiv Detail & Related papers (2023-08-31T09:19:15Z)
- Neighboring Words Affect Human Interpretation of Saliency Explanations [65.29015910991261]
Word-level saliency explanations are often used to communicate feature-attribution in text-based models.
Recent studies found that superficial factors such as word length can distort human interpretation of the communicated saliency scores.
We investigate how the marking of a word's neighboring words affects the explainee's perception of the word's importance in the context of a saliency explanation.
arXiv Detail & Related papers (2023-05-04T09:50:25Z)
- Evaluation of Automatically Constructed Word Meaning Explanations [0.0]
We present a new tool that derives explanations automatically based on collective information from very large corpora.
We show that the presented approach allows the creation of explanations that contain data useful for understanding the word's meaning in approximately 90% of cases.
arXiv Detail & Related papers (2023-02-27T09:47:55Z)
- Pareto Probing: Trading Off Accuracy for Complexity [87.09294772742737]
We argue for a probe metric that reflects the fundamental trade-off between probe complexity and performance.
Our experiments with dependency parsing reveal a wide gap in syntactic knowledge between contextual and non-contextual representations.
arXiv Detail & Related papers (2020-10-05T17:27:31Z)
- Speakers Fill Lexical Semantic Gaps with Context [65.08205006886591]
We operationalise the lexical ambiguity of a word as the entropy of meanings it can take (a toy computation follows this list).
We find significant correlations between our estimate of ambiguity and the number of synonyms a word has in WordNet.
This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.
arXiv Detail & Related papers (2020-10-05T17:19:10Z)
- Paragraph-level Commonsense Transformers with Recurrent Memory [77.4133779538797]
We train a discourse-aware model that incorporates paragraph-level information to generate coherent commonsense inferences from narratives.
Our results show that PARA-COMET outperforms the sentence-level baselines, particularly in generating inferences that are both coherent and novel.
arXiv Detail & Related papers (2020-10-04T05:24:12Z)
- Prosody leaks into the memories of words [2.309770674164469]
The average predictability (aka informativity) of a word in context has been shown to condition word duration.
This study extends past work in two directions; it investigates informativity effects in another large language, Mandarin Chinese.
Results indicated that words with low informativity have shorter durations, replicating the effect found in English.
arXiv Detail & Related papers (2020-05-29T17:58:33Z)
- Heaps' law and Heaps functions in tagged texts: Evidences of their linguistic relevance [0.0]
We study the relationship between vocabulary size and text length in a corpus of 75 literary works in English.
We analyze the progressive appearance of new words of each tag along each individual text (a minimal Heaps-law fit is sketched below).
arXiv Detail & Related papers (2020-01-07T17:05:16Z)