GraphTTS: graph-to-sequence modelling in neural text-to-speech
- URL: http://arxiv.org/abs/2003.01924v1
- Date: Wed, 4 Mar 2020 07:44:55 GMT
- Title: GraphTTS: graph-to-sequence modelling in neural text-to-speech
- Authors: Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Jing Xiao
- Abstract summary: This paper leverages the graph-to-sequence method in neural text-to-speech (GraphTTS)
It maps the graph embedding of the input sequence to spectrograms.
The encoder of GraphTTS can be applied as a graph auxiliary encoder (GAE) to analyse prosody information from the semantic structure of texts.
- Score: 34.54061333255853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper leverages the graph-to-sequence method in neural text-to-speech
(GraphTTS), which maps the graph embedding of the input sequence to
spectrograms. The graphical inputs consist of node and edge representations
constructed from input texts. The encoding of these graphical inputs
incorporates syntax information by a GNN encoder module. Besides, applying the
encoder of GraphTTS as a graph auxiliary encoder (GAE) can analyse prosody
information from the semantic structure of texts. This removes the manual
selection of reference audio and makes prosody modelling an end-to-end
procedure. Experimental analysis shows that GraphTTS outperforms
state-of-the-art sequence-to-sequence models by 0.24 in Mean Opinion Score
(MOS). GAE can automatically adjust the pauses, ventilation, and tones of
synthesised audio. These findings may offer inspiration to researchers
working on improving prosody in speech synthesis.
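To make the graph-to-sequence idea concrete, the sketch below builds a toy graph over input tokens (nodes are words; edges here are simple sequential links standing in for the paper's syntax-derived edges) and runs one round of mean-aggregation message passing, a minimal GNN encoder. This is an illustrative assumption of the general technique, not the GraphTTS architecture itself; the actual model's node/edge construction and encoder are specified in the paper.

```python
import numpy as np

def text_to_graph(words):
    """Toy graph over input tokens: each word is a node, and edges
    connect adjacent words (a stand-in for syntactic edges)."""
    n = len(words)
    adj = np.zeros((n, n))
    for i in range(n - 1):
        adj[i, i + 1] = adj[i + 1, i] = 1.0  # undirected sequential edge
    adj += np.eye(n)  # self-loops so each node keeps its own features
    return adj

def gnn_layer(node_feats, adj):
    """One round of mean-aggregation message passing (a minimal GNN)."""
    deg = adj.sum(axis=1, keepdims=True)
    return np.tanh((adj @ node_feats) / deg)

rng = np.random.default_rng(0)
words = ["the", "quick", "brown", "fox"]
feats = rng.normal(size=(len(words), 8))  # toy node embeddings
adj = text_to_graph(words)
graph_embedding = gnn_layer(feats, adj)   # per-node contextual embeddings
print(graph_embedding.shape)  # (4, 8)
```

In the full model, embeddings like these would feed an attention-based decoder that emits spectrogram frames.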
Related papers
- Explanation Graph Generation via Generative Pre-training over Synthetic
Graphs [6.25568933262682]
Explanation graph generation is a significant task that aims to produce explanation graphs in response to user input.
Current research commonly fine-tunes a text-based pre-trained language model on a small downstream dataset that is annotated with labeled graphs.
We propose a novel pre-trained framework, EG3P (Explanation Graph Generation via Generative Pre-training over synthetic graphs), for the explanation graph generation task.
arXiv Detail & Related papers (2023-06-01T13:20:22Z) - KENGIC: KEyword-driven and N-Gram Graph based Image Captioning [0.988326119238361]
Keyword-driven and N-gram Graph based approach for Image Captioning (KENGIC)
The model is designed to form a directed graph by connecting nodes through overlapping n-grams as found in a given text corpus.
Analysis of this approach could also shed light on the generation process behind current top performing caption generators trained in the paired setting.
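The n-gram graph construction KENGIC describes can be sketched as follows: nodes are n-grams from a corpus, and a directed edge links two n-grams when they overlap, i.e. consecutive n-grams share n-1 tokens. This is a minimal illustrative sketch under that reading of the abstract, not the paper's exact implementation.

```python
from collections import defaultdict

def build_ngram_graph(corpus_sentences, n=2):
    """Directed graph whose nodes are n-grams; an edge links two n-grams
    when they appear consecutively, so they overlap in n-1 tokens."""
    edges = defaultdict(set)
    for sent in corpus_sentences:
        toks = sent.split()
        grams = [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
        for a, b in zip(grams, grams[1:]):
            edges[a].add(b)  # b starts with the last n-1 tokens of a
    return edges

corpus = ["a dog runs fast", "a dog barks loudly"]
g = build_ngram_graph(corpus)
print(sorted(g[("a", "dog")]))  # [('dog', 'barks'), ('dog', 'runs')]
```

Caption generation would then amount to searching paths through such a graph that connect the detected keywords.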
arXiv Detail & Related papers (2023-02-07T19:48:55Z) - Graph-to-Text Generation with Dynamic Structure Pruning [19.37474618180399]
We propose a Structure-Aware Cross-Attention (SACA) mechanism to re-encode the input graph representation conditioning on the newly generated context.
We achieve new state-of-the-art results on two graph-to-text datasets, LDC2020T02 and ENT-DESC, with only a minor increase in computational cost.
arXiv Detail & Related papers (2022-09-15T12:48:10Z) - Learning Graphon Autoencoders for Generative Graph Modeling [91.32624399902755]
Graphon is a nonparametric model that generates graphs with arbitrary sizes and can be induced from graphs easily.
We propose a novel framework called graphon autoencoder to build an interpretable and scalable graph generative model.
A linear graphon factorization model works as a decoder, leveraging the latent representations to reconstruct the induced graphons.
arXiv Detail & Related papers (2021-05-29T08:11:40Z) - Structural Information Preserving for Graph-to-Text Generation [59.00642847499138]
The task of graph-to-text generation aims at producing sentences that preserve the meaning of input graphs.
We propose to tackle this problem by leveraging richer training signals that can guide our model for preserving input information.
Experiments on two benchmarks for graph-to-text generation show the effectiveness of our approach over a state-of-the-art baseline.
arXiv Detail & Related papers (2021-02-12T20:09:01Z) - Promoting Graph Awareness in Linearized Graph-to-Text Generation [72.83863719868364]
We study the ability of linearized models to encode local graph structures.
Our findings motivate solutions to enrich the quality of models' implicit graph encodings.
We find that these denoising scaffolds lead to substantial improvements in downstream generation in low-resource settings.
arXiv Detail & Related papers (2020-12-31T18:17:57Z) - GraphPB: Graphical Representations of Prosody Boundary in Speech
Synthesis [23.836992815219904]
This paper introduces a graphical representation approach of prosody boundary (GraphPB) in the task of Chinese speech synthesis.
The nodes of the graph embedding are formed by prosodic words, and the edges are formed by the other prosodic boundaries.
Two techniques are proposed to embed sequential information into the graph-to-sequence text-to-speech model.
arXiv Detail & Related papers (2020-12-03T03:34:05Z) - GraphSpeech: Syntax-Aware Graph Attention Network For Neural Speech
Synthesis [79.1885389845874]
Transformer-based end-to-end text-to-speech synthesis (TTS) is one such successful implementation.
We propose a novel neural TTS model, denoted as GraphSpeech, that is formulated under the graph neural network framework.
Experiments show that GraphSpeech consistently outperforms the Transformer TTS baseline in terms of spectrum and prosody rendering of utterances.
arXiv Detail & Related papers (2020-10-23T14:14:06Z) - Graph-to-Sequence Neural Machine Translation [79.0617920270817]
We propose a graph-based self-attention network (SAN) NMT model called Graph-Transformer.
Subgraphs are put into different groups according to their orders, and each group of subgraphs reflects a different level of dependency between words.
Our method can effectively boost the Transformer with an improvement of 1.1 BLEU points on WMT14 English-German dataset and 1.0 BLEU points on IWSLT14 German-English dataset.
arXiv Detail & Related papers (2020-09-16T06:28:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.