Form and meaning co-determine the realization of tone in Taiwan Mandarin spontaneous speech: the case of Tone 3 sandhi
- URL: http://arxiv.org/abs/2408.15747v1
- Date: Wed, 28 Aug 2024 12:25:45 GMT
- Title: Form and meaning co-determine the realization of tone in Taiwan Mandarin spontaneous speech: the case of Tone 3 sandhi
- Authors: Yuxin Lu, Yu-Ying Chuang, R. Harald Baayen
- Abstract summary: In Standard Chinese, Tone 3 (the dipping tone) becomes Tone 2 (rising tone) when followed by another Tone 3.
Previous studies have noted that this sandhi process may be incomplete, in the sense that the assimilated Tone 3 is still distinct from a true Tone 2.
The present study investigates the pitch contours of two-character words with T2-T3 and T3-T3 tone patterns in spontaneous Taiwan Mandarin conversations.
- Score: 1.7723990552388866
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Standard Chinese, Tone 3 (the dipping tone) becomes Tone 2 (rising tone) when followed by another Tone 3. Previous studies have noted that this sandhi process may be incomplete, in the sense that the assimilated Tone 3 is still distinct from a true Tone 2. While Mandarin Tone 3 sandhi has been widely studied using carefully controlled laboratory speech (Xu, 1997) and more formal registers of Beijing Mandarin (Yuan and Chen, 2014), less is known about its realization in spontaneous speech and about the effect of contextual factors on tonal realization. The present study investigates the pitch contours of two-character words with T2-T3 and T3-T3 tone patterns in spontaneous Taiwan Mandarin conversations. Our analysis makes use of the Generalized Additive Mixed Model (GAMM; Wood, 2017) to examine fundamental frequency (f0) contours as a function of normalized time. We consider various factors known to influence pitch contours, including gender, speaking rate, speaker, neighboring tones, word position, and bigram probability, as well as two novel predictors: word and word sense (Chuang et al., 2024). Our analyses revealed that in spontaneous Taiwan Mandarin, T3-T3 words become indistinguishable from T2-T3 words, indicating complete sandhi, once the strong effect of word (or word sense) is taken into account. For our data, the shape of f0 contours is not co-determined by word frequency. In contrast, the effect of word meaning on f0 contours is robust, as strong as the effect of adjacent tones, and is present for both T2-T3 and T3-T3 words.
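The modeling idea in the abstract can be made concrete as follows: f0 is regressed on a smooth of normalized time, with tone pattern (T2-T3 vs. T3-T3), word, and the other predictors entering as further terms. GAMMs of this kind are typically fit in R with mgcv (Wood, 2017); the sketch below is only a rough Python analogue using the pygam package, with a hypothetical input file and column names, and with simple factor terms standing in for the random effects of the paper's actual models. It illustrates the structure of such a model, not the authors' analysis.

```python
# A minimal sketch, not the authors' code or model specification.
import pandas as pd
from pygam import LinearGAM, s, f

# Hypothetical long-format data: one row per f0 sample of a two-character word token.
df = pd.read_csv("tone_contours.csv")  # assumed columns: f0, norm_time, tone_pattern, word

# pygam expects numeric feature columns, so encode the factors as integer codes.
df["tone_code"] = df["tone_pattern"].astype("category").cat.codes  # e.g. T2-T3 -> 0, T3-T3 -> 1
df["word_code"] = df["word"].astype("category").cat.codes

X = df[["norm_time", "tone_code", "word_code"]].to_numpy()
y = df["f0"].to_numpy()

# s(0): reference smooth of f0 over normalized time.
# s(0, by=1): with tone_code coded 0/1, a difference smooth for the T3-T3 pattern.
# f(1): intercept shift for tone pattern.
# f(2): per-word intercepts, a crude stand-in for the word (or word-sense) effect.
gam = LinearGAM(s(0) + s(0, by=1) + f(1) + f(2)).fit(X, y)
gam.summary()
```

In the paper's framing, complete sandhi corresponds to the tone-pattern terms contributing nothing once the word (or word-sense) terms are in the model.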
Related papers
- The realization of tones in spontaneous spoken Taiwan Mandarin: a corpus-based survey and theory-driven computational modeling [1.7723990552388866]
This study investigates the tonal realization of Mandarin disyllabic words with all 20 possible combinations of two tones.
The results show that meaning in context and phonetic realization are far more entangled than standard linguistic theory predicts.
arXiv Detail & Related papers (2025-03-29T17:39:55Z)
- A corpus-based investigation of pitch contours of monosyllabic words in conversational Taiwan Mandarin [3.072340427031969]
We analyze the F0 contours of 3824 tokens of 63 different word types in a spontaneous Taiwan Mandarin corpus.
We show that the tonal context substantially modifies a word's canonical tone.
We also show that word, and even more so, word sense, co-determine words' F0 contours.
arXiv Detail & Related papers (2024-09-12T09:51:56Z)
- Word-specific tonal realizations in Mandarin [0.9249657468385781]
This study shows that tonal realization is also partially determined by words' meanings.
We first show, on the basis of a Taiwan corpus of spontaneous conversations, that word type is a stronger predictor of pitch realization than all the previously established word-form related predictors combined.
We then proceed to show, using computational modeling with context-specific word embeddings, that token-specific pitch contours predict word type with 50% accuracy on held-out data (a minimal classification sketch appears after this list).
arXiv Detail & Related papers (2024-05-11T13:00:35Z)
- MuLan: A Study of Fact Mutability in Language Models [50.626787909759976]
Trustworthy language models ideally identify mutable facts as such and process them accordingly.
We create MuLan, a benchmark for evaluating the ability of English language models to anticipate the time-contingency of facts.
arXiv Detail & Related papers (2024-04-03T19:47:33Z)
- Improving TTS for Shanghainese: Addressing Tone Sandhi via Word Segmentation [0.0]
Tone sandhi, which applies to all multi-syllabic words in Shanghainese, is key to natural-sounding speech.
Recent work on Shanghainese TTS (text-to-speech) such as Apple's VoiceOver has shown poor performance with tone sandhi.
I show that word segmentation during text preprocessing can improve the quality of tone sandhi production in TTS models.
arXiv Detail & Related papers (2023-07-30T10:50:18Z)
- Not wacky vs. definitely wacky: A study of scalar adverbs in pretrained language models [0.0]
Modern pretrained language models, such as BERT, RoBERTa, and GPT-3, hold the promise of performing better on logical tasks than classic static word embeddings.
We investigate the extent to which BERT, RoBERTa, GPT-2, and GPT-3 exhibit general, human-like knowledge of these common scalar adverbs.
We find that despite capturing some aspects of logical meaning, the models fall far short of human performance.
arXiv Detail & Related papers (2023-05-25T18:56:26Z)
- Cross-strait Variations on Two Near-synonymous Loanwords xie2shang1 and tan2pan4: A Corpus-based Comparative Study [2.6194322370744305]
This study investigates cross-strait variation in two near-synonymous Chinese loanwords, i.e., xie2shang1 and tan2pan4.
Through a comparative analysis, the study found some distributional, eventual, and contextual similarities and differences across Taiwan and Mainland Mandarin.
arXiv Detail & Related papers (2022-10-09T04:10:58Z)
- Controllable Accented Text-to-Speech Synthesis [76.80549143755242]
We propose a neural TTS architecture that allows us to control the accent and its intensity during inference.
This is the first study of accented TTS synthesis with explicit intensity control.
arXiv Detail & Related papers (2022-09-22T06:13:07Z)
- Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech [88.22544315633687]
Polyphone disambiguation aims to capture accurate pronunciation knowledge from natural text sequences for reliable Text-to-speech systems.
We propose Dict-TTS, a semantic-aware generative text-to-speech model with an online website dictionary.
Experimental results in three languages show that our model outperforms several strong baseline models in terms of pronunciation accuracy.
arXiv Detail & Related papers (2022-06-05T10:50:34Z)
- TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation [61.564874831498145]
TranSpeech is a speech-to-speech translation model with bilateral perturbation.
We establish a non-autoregressive S2ST technique, which repeatedly masks and predicts unit choices.
TranSpeech shows a significant improvement in inference latency, enabling a speedup of up to 21.4x over the autoregressive technique.
arXiv Detail & Related papers (2022-05-25T06:34:14Z)
- AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style [111.89762723159677]
We develop AdaSpeech 3, an adaptive TTS system that fine-tunes a well-trained reading-style TTS model for spontaneous-style speech.
AdaSpeech 3 synthesizes speech with natural filled pauses (FPs) and rhythms in spontaneous styles, and achieves much better MOS and SMOS scores than previous adaptive TTS systems.
arXiv Detail & Related papers (2021-07-06T10:40:45Z)
- Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS [74.11899135025503]
We extend the Tacotron-based speech synthesis framework to explicitly model the prosodic phrase breaks.
We show that our proposed training scheme consistently improves the voice quality for both Chinese and Mongolian systems.
arXiv Detail & Related papers (2020-08-11T07:57:29Z)
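As a concrete illustration of the held-out word-type prediction mentioned under "Word-specific tonal realizations in Mandarin" above: the sketch below is not that paper's model (which relies on context-specific word embeddings); it only shows the general setup of classifying word type from fixed-length, token-specific f0 contour vectors and scoring on held-out tokens. The input files, feature layout, and classifier choice are hypothetical.

```python
# A minimal sketch of held-out word-type classification from f0 contours.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Hypothetical inputs: per-token f0 contours resampled to 10 normalized time
# points, plus the word-type label of each token.
X = np.load("contours.npy")  # shape (n_tokens, 10)
y = np.load("words.npy")     # shape (n_tokens,)

# Hold out 20% of the tokens for evaluation.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# A simple multinomial classifier standing in for the embedding-based model of the paper.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print(f"held-out word-type accuracy: {clf.score(X_test, y_test):.2f}")
```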
This list is automatically generated from the titles and abstracts of the papers on this site.