A Character-level Span-based Model for Mandarin Prosodic Structure
Prediction
- URL: http://arxiv.org/abs/2203.16922v1
- Date: Thu, 31 Mar 2022 09:47:08 GMT
- Authors: Xueyuan Chen, Changhe Song, Yixuan Zhou, Zhiyong Wu, Changbin Chen,
Zhongqin Wu, Helen Meng
- Abstract summary: We propose a span-based Mandarin prosodic structure prediction model to obtain an optimal prosodic structure tree.
Rich linguistic features are provided by a Chinese character-level BERT and sent to an encoder with a self-attention architecture.
The proposed method can predict prosodic labels of different levels at the same time and accomplish the process directly from Chinese characters.
- Score: 36.90699361223442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The accuracy of prosodic structure prediction is crucial to the naturalness
of synthesized speech in Mandarin text-to-speech systems, but it is currently limited
by the widely used sequence-to-sequence framework and by error accumulation from
preceding word segmentation. In this paper, we propose a span-based Mandarin
prosodic structure prediction model that obtains an optimal prosodic structure
tree, which can be converted into the corresponding prosodic label sequence. Instead
of requiring prior word segmentation, rich linguistic features are provided by a
Chinese character-level BERT and fed to an encoder with a self-attention
architecture. On top of this, span representations and label scoring are used to
describe all possible prosodic structure trees, each of which has a corresponding
score. A bottom-up CKY-style algorithm is then used to find the tree with the
highest score for a given sentence. The proposed method predicts prosodic labels of
different levels at the same time and operates directly on Chinese characters in an
end-to-end manner. Experimental results on two real-world datasets demonstrate
the superior performance of our span-based method over all
sequence-to-sequence baseline approaches.
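The span-scoring and bottom-up CKY-style decoding described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the label set (`PW`/`PPH`/`IPH`) and the toy scoring function are assumptions, and in the paper the span scores would come from a BERT-based self-attention encoder rather than a hand-written function.

```python
# Illustrative sketch of bottom-up CKY-style decoding over character spans.
# Each span [i, j) receives its best prosodic label score, and longer spans
# additionally choose the best split point, so the chart fills in O(n^3).

PROSODIC_LABELS = ["PW", "PPH", "IPH"]  # prosodic word / phrase / intonational phrase

def decode(n, span_label_score):
    """Return (score, tree) of the highest-scoring binary tree over characters 0..n-1.

    span_label_score(i, j, label) -> float is the model's score for
    assigning `label` to the character span [i, j).
    A tree node is (label, i, j, children).
    """
    best = {}  # (i, j) -> (score, tree)
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Best label for this span.
            label, label_score = max(
                ((l, span_label_score(i, j, l)) for l in PROSODIC_LABELS),
                key=lambda x: x[1],
            )
            if length == 1:
                best[i, j] = (label_score, (label, i, j, []))
                continue
            # Best split point: combine the two highest-scoring subtrees.
            split_score, children = max(
                (
                    (best[i, k][0] + best[k, j][0], [best[i, k][1], best[k, j][1]])
                    for k in range(i + 1, j)
                ),
                key=lambda x: x[0],
            )
            best[i, j] = (label_score + split_score, (label, i, j, children))
    return best[0, n]

# Toy scores favouring a split after character 2 of a 4-character sentence.
def toy_score(i, j, label):
    return 1.0 if (i, j) in {(0, 2), (2, 4), (0, 4)} and label == "PW" else 0.0

score, tree = decode(4, toy_score)
```

With these toy scores the decoder recovers the tree that splits the sentence after character 2, since that split maximizes the summed span scores; in the actual model the same dynamic program would run over learned scores.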
Related papers
- Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure [11.184330703168893]
This paper proposes modeling latent internal structures within words in Chinese.
A constrained Eisner algorithm is implemented to ensure the compatibility of character-level trees.
A detailed analysis reveals that a coarse-to-fine parsing strategy empowers the model to predict more linguistically plausible intra-word structures.
arXiv Detail & Related papers (2024-06-06T06:23:02Z)
- Explicit Syntactic Guidance for Neural Text Generation [45.60838824233036]
Generative Grammar suggests that humans generate natural language texts by learning language grammar.
We propose a syntax-guided generation schema, which generates the sequence guided by a constituency parse tree in a top-down direction.
Experiments on paraphrase generation and machine translation show that the proposed method outperforms autoregressive baselines.
arXiv Detail & Related papers (2023-06-20T12:16:31Z)
- Joint Chinese Word Segmentation and Span-based Constituency Parsing [11.080040070201608]
This work proposes a method for joint Chinese word segmentation and span-based constituency parsing by adding extra labels to individual Chinese characters on the parse trees.
Experiments show that the proposed algorithm outperforms recent models for joint segmentation and constituency parsing on CTB 5.1.
arXiv Detail & Related papers (2022-11-03T08:19:00Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore utilising higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- Discontinuous Grammar as a Foreign Language [0.7412445894287709]
We extend the framework of sequence-to-sequence models for constituent parsing.
We design several novel linearizations that can fully produce discontinuities.
For the first time, we test a sequence-to-sequence model on the main discontinuous benchmarks.
arXiv Detail & Related papers (2021-10-20T08:58:02Z)
- Syntactic representation learning for neural network based TTS with syntactic parse tree traversal [49.05471750563229]
We propose a syntactic representation learning method based on syntactic parse tree to automatically utilize the syntactic structure information.
Experimental results demonstrate the effectiveness of our proposed approach.
For sentences with multiple syntactic parse trees, prosodic differences can be clearly perceived in the synthesized speech.
arXiv Detail & Related papers (2020-12-13T05:52:07Z)
- Compositional Generalization via Semantic Tagging [81.24269148865555]
We propose a new decoding framework that preserves the expressivity and generality of sequence-to-sequence models.
We show that the proposed approach consistently improves compositional generalization across model architectures, domains, and semantic formalisms.
arXiv Detail & Related papers (2020-10-22T15:55:15Z)
- Span-based Semantic Parsing for Compositional Generalization [53.24255235340056]
SpanBasedSP predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input.
On GeoQuery, SCAN and CLOSURE, SpanBasedSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization.
arXiv Detail & Related papers (2020-09-13T16:42:18Z)
- Tree-structured Attention with Hierarchical Accumulation [103.47584968330325]
"Hierarchical Accumulation" encodes parse tree structures into self-attention at constant time complexity.
Our approach outperforms SOTA methods in four IWSLT translation tasks and the WMT'14 English-German translation task.
arXiv Detail & Related papers (2020-02-19T08:17:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.