Position-Invariant Truecasing with a Word-and-Character Hierarchical
Recurrent Neural Network
- URL: http://arxiv.org/abs/2108.11943v1
- Date: Thu, 26 Aug 2021 17:54:35 GMT
- Title: Position-Invariant Truecasing with a Word-and-Character Hierarchical
Recurrent Neural Network
- Authors: Hao Zhang and You-Chi Cheng and Shankar Kumar and Mingqing Chen and
Rajiv Mathews
- Abstract summary: We propose a fast, accurate and compact two-level hierarchical word-and-character-based recurrent neural network model.
We also address the problem of truecasing while ignoring token positions in the sentence.
- Score: 10.425277173548212
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Truecasing is the task of restoring the correct case (uppercase or lowercase)
of noisy text generated either by an automatic system for speech recognition or
machine translation or by humans. It improves the performance of downstream NLP
tasks such as named entity recognition and language modeling. We propose a
fast, accurate and compact two-level hierarchical word-and-character-based
recurrent neural network model, the first of its kind for this problem. Using
sequence distillation, we also address the problem of truecasing while ignoring
token positions in the sentence, i.e. in a position-invariant manner.
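To make the architecture concrete, here is a minimal sketch of a two-level word-and-character hierarchical RNN truecaser, assuming PyTorch; the layer sizes, module names, and the per-character case labels (e.g. 0 = lower, 1 = upper) are illustrative assumptions rather than the authors' exact configuration.

```python
# Minimal sketch of a two-level word-and-character hierarchical RNN truecaser.
# Assumes PyTorch; sizes and the per-character case labels are illustrative.
import torch
import torch.nn as nn

class HierarchicalTruecaser(nn.Module):
    def __init__(self, n_chars, char_dim=32, char_hidden=64, word_hidden=128, n_labels=2):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # Lower level: a character LSTM builds one vector per word from its characters.
        self.char_encoder = nn.LSTM(char_dim, char_hidden, batch_first=True, bidirectional=True)
        # Upper level: a word LSTM contextualizes the word vectors across the sentence.
        self.word_encoder = nn.LSTM(2 * char_hidden, word_hidden, batch_first=True, bidirectional=True)
        # Per-character classifier: case decision from the character embedding plus
        # the word's sentence-level context.
        self.classifier = nn.Linear(char_dim + 2 * word_hidden, n_labels)

    def forward(self, char_ids):
        # char_ids: (batch, n_words, n_chars) ids of the *lowercased* input, 0-padded.
        b, w, c = char_ids.shape
        chars = self.char_emb(char_ids)                                # (b, w, c, char_dim)
        _, (h, _) = self.char_encoder(chars.reshape(b * w, c, -1))
        word_repr = torch.cat([h[0], h[1]], dim=-1).view(b, w, -1)     # (b, w, 2*char_hidden)
        word_ctx, _ = self.word_encoder(word_repr)                     # (b, w, 2*word_hidden)
        word_ctx = word_ctx.unsqueeze(2).expand(b, w, c, word_ctx.size(-1))
        return self.classifier(torch.cat([chars, word_ctx], dim=-1))   # (b, w, c, n_labels)
```

In use, the network consumes lowercased character ids and scores a case decision for every character from word- and sentence-level context; the position-invariant behavior described in the abstract comes from sequence distillation rather than from the architecture itself.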
Related papers
- LM-assisted keyword biasing with Aho-Corasick algorithm for Transducer-based ASR [3.841280537264271]
We propose a light on-the-fly method to improve automatic speech recognition performance.
We combine a bias list of named entities with a word-level n-gram language model, using shallow fusion based on the Aho-Corasick string matching algorithm.
We achieve up to 21.6% relative improvement in the general word error rate with no practical difference in the inverse real-time factor.
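As a dependency-free illustration of the matching step described above (a hypothetical sketch, not the paper's implementation; the example phrases are invented), the automaton below indexes the bias phrases with failure links and reports every entity that ends at each token, so matched spans can be rewarded during shallow-fusion decoding.

```python
# Illustrative Aho-Corasick automaton over a token-level bias list of entities.
from collections import deque

class AhoCorasick:
    def __init__(self, phrases):
        self.goto, self.fail, self.out = [{}], [0], [set()]
        for phrase in phrases:                       # build the trie
            s = 0
            for tok in phrase:
                if tok not in self.goto[s]:
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append(set())
                    self.goto[s][tok] = len(self.goto) - 1
                s = self.goto[s][tok]
            self.out[s].add(tuple(phrase))
        q = deque(self.goto[0].values())             # BFS to set failure links
        while q:
            s = q.popleft()
            for tok, nxt in self.goto[s].items():
                q.append(nxt)
                f = self.fail[s]
                while f and tok not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(tok, 0)
                self.out[nxt] |= self.out[self.fail[nxt]]

    def matches(self, tokens):
        s, hits = 0, []
        for i, tok in enumerate(tokens):
            while s and tok not in self.goto[s]:
                s = self.fail[s]
            s = self.goto[s].get(tok, 0)
            for phrase in self.out[s]:
                hits.append((i - len(phrase) + 1, phrase))   # (start index, matched entity)
        return hits

ac = AhoCorasick([("new", "york"), ("york",)])
print(ac.matches("i love new york".split()))
# reports ('new', 'york') starting at index 2 and ('york',) starting at index 3
```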
arXiv Detail & Related papers (2024-09-20T13:53:37Z)
- Self-consistent context aware conformer transducer for speech recognition [0.06008132390640294]
We introduce a novel neural network module that adeptly handles recursive data flow in neural network architectures.
Our method notably improves the accuracy of recognizing rare words without adversely affecting the word error rate for common vocabulary.
Our findings reveal that the combination of both approaches can improve the accuracy of detecting rare words by as much as 4.5 times.
arXiv Detail & Related papers (2024-02-09T18:12:11Z)
- Mapping of attention mechanisms to a generalized Potts model [50.91742043564049]
We show that training a neural network is exactly equivalent to solving the inverse Potts problem by the so-called pseudo-likelihood method.
We also compute the generalization error of self-attention in a model scenario analytically using the replica method.
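For reference, the pseudo-likelihood method mentioned above replaces the intractable Potts likelihood with a product of single-site conditionals; the form below uses standard notation for a generalized Potts model with fields h_i and couplings J_ij (assumed here, not quoted from the paper).

```latex
% Pseudo-likelihood objective for a generalized Potts model; Q is the number of states.
\mathcal{L}_{\mathrm{PL}}(J, h) = \sum_{i=1}^{N} \log P(s_i \mid s_{\setminus i}), \qquad
P(s_i \mid s_{\setminus i}) =
  \frac{\exp\big(h_i(s_i) + \sum_{j \neq i} J_{ij}(s_i, s_j)\big)}
       {\sum_{q=1}^{Q} \exp\big(h_i(q) + \sum_{j \neq i} J_{ij}(q, s_j)\big)}
```

Maximizing this objective over the couplings and fields is the "inverse Potts" estimate that the summary identifies with training the attention weights.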
arXiv Detail & Related papers (2023-04-14T16:32:56Z)
- Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z)
- Word Order Matters when you Increase Masking [70.29624135819884]
We study the effect of removing position encodings on the pre-training objective itself, to test whether models can reconstruct position information from co-occurrences alone.
We find that the necessity of position information increases with the amount of masking, and that masked language models without position encodings are not able to reconstruct this information on the task.
arXiv Detail & Related papers (2022-11-08T18:14:04Z)
- Speaker Embedding-aware Neural Diarization: a Novel Framework for Overlapped Speech Diarization in the Meeting Scenario [51.5031673695118]
We reformulate overlapped speech diarization as a single-label prediction problem.
We propose the speaker embedding-aware neural diarization (SEND) system.
arXiv Detail & Related papers (2022-03-18T06:40:39Z)
- Capitalization Normalization for Language Modeling with an Accurate and Efficient Hierarchical RNN Model [12.53710938104476]
We propose a fast, accurate and compact two-level hierarchical word-and-character-based recurrent neural network model.
We use the truecaser to normalize user-generated text in a Federated Learning framework for language modeling.
arXiv Detail & Related papers (2022-02-16T16:21:53Z)
- Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance [55.10864476206503]
We investigate the use of quantized vectors to model the latent linguistic embedding.
By enforcing different policies over the latent spaces during training, we are able to obtain a latent linguistic embedding.
Our experiments show that the voice cloning system built with vector quantization has only a small degradation in terms of perceptive evaluations.
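As a generic illustration of the quantized-vector idea above (a VQ-VAE-style straight-through lookup; the codebook size, dimensions, and names are assumptions, not the paper's system):

```python
# Generic straight-through vector quantizer: each continuous latent frame is
# snapped to its nearest codebook entry; illustrative sizes, not the paper's.
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=256, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):                                # z: (batch, time, dim)
        flat = z.reshape(-1, z.size(-1))
        dist = torch.cdist(flat, self.codebook.weight)   # distance to every code
        idx = dist.argmin(dim=-1).view(z.shape[:-1])     # nearest code per frame
        q = self.codebook(idx)                           # quantized latents
        # Straight-through estimator: forward pass uses q, gradients flow to z.
        return z + (q - z).detach(), idx
```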
arXiv Detail & Related papers (2021-06-25T07:51:35Z)
- Neural Syntactic Preordering for Controlled Paraphrase Generation [57.5316011554622]
Our work uses syntactic transformations to softly "reorder" the source sentence and guide our neural paraphrasing model.
First, given an input sentence, we derive a set of feasible syntactic rearrangements using an encoder-decoder model.
Next, we use each proposed rearrangement to produce a sequence of position embeddings, which encourages our final encoder-decoder paraphrase model to attend to the source words in a particular order.
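A minimal sketch of the position-embedding trick just described (hypothetical names and sizes; this is not the paper's model): the encoder input adds position embeddings indexed by the proposed source order instead of the left-to-right surface positions.

```python
# Sketch: encode a source sentence with position embeddings that follow a proposed
# reordering rather than the surface order; model sizes and names are assumptions.
import torch
import torch.nn as nn

d_model, vocab, max_len = 64, 1000, 128
tok_emb = nn.Embedding(vocab, d_model)
pos_emb = nn.Embedding(max_len, d_model)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)

src = torch.tensor([[5, 17, 42, 9]])        # source token ids (batch of 1)
order = torch.tensor([[2, 0, 1, 3]])        # desired position of each source token
x = tok_emb(src) + pos_emb(order)           # positions encode the rearrangement
out = encoder(x)                            # attention is nudged toward the new order
print(out.shape)                            # torch.Size([1, 4, 64])
```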
arXiv Detail & Related papers (2020-05-05T09:02:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.