Neural Grapheme-to-Phoneme Conversion with Pre-trained Grapheme Models
- URL: http://arxiv.org/abs/2201.10716v1
- Date: Wed, 26 Jan 2022 02:49:56 GMT
- Title: Neural Grapheme-to-Phoneme Conversion with Pre-trained Grapheme Models
- Authors: Lu Dong, Zhi-Qiang Guo, Chao-Hong Tan, Ya-Jun Hu, Yuan Jiang and
Zhen-Hua Ling
- Abstract summary: This paper proposes a pre-trained grapheme model called grapheme BERT (GBERT).
GBERT is built by self-supervised training on a large, language-specific word list with only grapheme information.
Two approaches are developed to incorporate GBERT into the state-of-the-art Transformer-based G2P model.
- Score: 35.60380484684335
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural network models have achieved state-of-the-art performance on
grapheme-to-phoneme (G2P) conversion. However, their performance relies on
large-scale pronunciation dictionaries, which may not be available for many
languages. Inspired by the success of the pre-trained language model BERT, this
paper proposes a pre-trained grapheme model called grapheme BERT (GBERT), which
is built by self-supervised training on a large, language-specific word list
with only grapheme information. Furthermore, two approaches are developed to
incorporate GBERT into the state-of-the-art Transformer-based G2P model, i.e.,
fine-tuning GBERT or fusing GBERT into the Transformer model by attention.
Experimental results on the Dutch, Serbo-Croatian, Bulgarian and Korean
datasets of the SIGMORPHON 2021 G2P task confirm the effectiveness of our
GBERT-based G2P models under both medium-resource and low-resource data
conditions.
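The listing above contains no code; as a rough illustration of the core idea described in the abstract, here is a minimal PyTorch sketch of BERT-style masked-grapheme pre-training on a word list that contains only spelling information. The toy word list, model sizes, and all identifiers are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: BERT-style masked-grapheme pre-training on a
# grapheme-only word list. Sizes, vocabulary handling, and the toy word
# list are assumptions, not the configuration used in the paper.
import random
import torch
import torch.nn as nn

words = ["appel", "boek", "fiets", "gracht", "huis"]      # toy stand-in for a large word list
chars = sorted({c for w in words for c in w})
PAD, MASK = 0, 1
stoi = {c: i + 2 for i, c in enumerate(chars)}            # 0 = PAD, 1 = MASK
vocab_size, max_len = len(stoi) + 2, max(len(w) for w in words)

class GraphemeEncoder(nn.Module):
    """Tiny Transformer encoder over grapheme (character) sequences."""
    def __init__(self, d_model=64, nhead=4, nlayers=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d_model, padding_idx=PAD)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward=128,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, nlayers)
        self.head = nn.Linear(d_model, vocab_size)         # predicts the masked graphemes

    def forward(self, ids):
        pos = torch.arange(ids.size(1)).unsqueeze(0).expand_as(ids)
        h = self.encoder(self.emb(ids) + self.pos(pos),
                         src_key_padding_mask=(ids == PAD))
        return self.head(h)

def make_batch(p_mask=0.3):
    """Encode the words, randomly mask graphemes, keep labels only at masked slots."""
    ids = torch.full((len(words), max_len), PAD, dtype=torch.long)
    labels = torch.full((len(words), max_len), -100, dtype=torch.long)  # -100 = ignored by the loss
    for i, w in enumerate(words):
        for j, c in enumerate(w):
            ids[i, j] = stoi[c]
            if random.random() < p_mask:
                labels[i, j] = stoi[c]
                ids[i, j] = MASK
    return ids, labels

model = GraphemeEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(ignore_index=-100)
for step in range(100):                                    # toy pre-training loop
    ids, labels = make_batch()
    if (labels == -100).all():                             # nothing masked this round
        continue
    loss = loss_fn(model(ids).reshape(-1, vocab_size), labels.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```

Per the abstract, the first integration route would fine-tune such a pre-trained encoder as part of the Transformer G2P model, while the second would fuse its hidden states into the G2P model through attention; the sketch above covers only the pre-training stage.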
Related papers
- A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
arXiv Detail & Related papers (2024-06-19T22:30:08Z) - RecurrentGemma: Moving Past Transformers for Efficient Open Language Models [103.59785165735727]
- RecurrentGemma: Moving Past Transformers for Efficient Open Language Models [103.59785165735727]
We introduce RecurrentGemma, a family of open language models using Google's novel Griffin architecture.
Griffin combines linear recurrences with local attention to achieve excellent performance on language.
We provide two sizes of models, containing 2B and 9B parameters, and provide pre-trained and instruction tuned variants for both.
arXiv Detail & Related papers (2024-04-11T15:27:22Z) - Multilingual Translation via Grafting Pre-trained Language Models [12.787188625198459]
We propose Graformer to graft separately pre-trained (masked) language models for machine translation.
With monolingual data for pre-training and parallel data for grafting training, we make maximal use of both types of data.
arXiv Detail & Related papers (2021-09-11T10:57:45Z) - Stage-wise Fine-tuning for Graph-to-Text Generation [25.379346921398326]
Graph-to-text generation has benefited from pre-trained language models (PLMs) in achieving better performance than structured graph encoders.
We propose a structured graph-to-text model with a two-step fine-tuning mechanism which first fine-tunes the model on Wikipedia before adapting it to graph-to-text generation.
arXiv Detail & Related papers (2021-05-17T17:15:29Z) - Paraphrastic Representations at Scale [134.41025103489224]
We release trained models for English, Arabic, German, French, Spanish, Russian, Turkish, and Chinese.
We train these models on large amounts of data, achieving significantly improved performance over the original papers.
arXiv Detail & Related papers (2021-04-30T16:55:28Z) - PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS [27.20479869682578]
PnG BERT is a new encoder model for neural TTS.
It can be pre-trained on a large text corpus in a self-supervised manner.
arXiv Detail & Related papers (2021-03-28T06:24:00Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z) - Unsupervised Paraphrase Generation using Pre-trained Language Models [0.0]
OpenAI's GPT-2 is notable for its capability to generate fluent, well-formulated, grammatically consistent text.
We leverage this generation capability of GPT-2 to generate paraphrases without any supervision from labelled data.
Our experiments show that paraphrases generated with our model are of good quality, are diverse, and improve downstream task performance when used for data augmentation.
arXiv Detail & Related papers (2020-06-09T19:40:19Z) - CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via
Cycle Training [63.11444020743543]
Deep learning models for graph-to-text (G2T) and text-to-graph (T2G) conversion suffer from scarce training data.
We present CycleGT, an unsupervised training method that can bootstrap from non-parallel graph and text data and iteratively back-translate between the two forms, as sketched below.
arXiv Detail & Related papers (2020-06-08T15:59:00Z) - Abstractive Text Summarization based on Language Model Conditioning and
- Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling [4.525267347429154]
We train a Transformer-based neural model conditioned on the BERT language model.
In addition, we propose a new method, BERT-windowing, which allows chunk-wise processing of texts longer than the BERT window size; a minimal sketch of this idea follows this entry.
The results of our models are compared to a baseline and to state-of-the-art models on the CNN/Daily Mail dataset.
arXiv Detail & Related papers (2020-03-29T14:00:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.