Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems
- URL: http://arxiv.org/abs/2003.09024v1
- Date: Thu, 19 Mar 2020 21:24:45 GMT
- Title: Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems
- Authors: Nikolay Malkovsky, Vladimir Bataev, Dmitrii Sviridkin, Natalia
Kizhaeva, Aleksandr Laptev, Ildar Valiev, Oleg Petrov
- Abstract summary: The problem of out-of-vocabulary (OOV) words is typical for any speech recognition system.
One popular approach to covering OOVs is to use subword units rather than words.
In this paper we explore different existing methods for this solution at both the graph construction and search method levels.
- Score: 54.49880724137688
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of out-of-vocabulary (OOV) words is typical for any speech
recognition system. Hybrid systems are usually constructed to recognize a fixed
set of words and can rarely include all the words that will be encountered
while the system is in use. One popular approach to covering OOVs is
to use subword units rather than words. Such a system can potentially recognize
any previously unseen word if the word can be constructed from the available
subword units, but it can also recognize non-existent words. The other popular
approach is to modify the HMM part of the system so that it can be easily and
effectively expanded with a custom set of words we want to add to the system. In
this paper we explore different existing methods for this solution at both the
graph construction and search method levels. We also present novel vocabulary
expansion techniques that solve some common internal subroutine problems
regarding recognition graph processing.
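The subword idea in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it only checks, by dynamic programming, whether a word can be assembled from a given set of subword units. The function name `segment` and the toy inventory `UNITS` are hypothetical and chosen for illustration only.

```python
from typing import List, Optional, Set


def segment(word: str, subwords: Set[str]) -> Optional[List[str]]:
    """Return one segmentation of `word` into units from `subwords`, or None."""
    # best[i] holds a segmentation of word[:i], or None if none was found.
    best: List[Optional[List[str]]] = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            prefix = best[start]
            if prefix is not None and word[start:end] in subwords:
                best[end] = prefix + [word[start:end]]
                break
    return best[len(word)]


# Hypothetical toy inventory, not taken from the paper.
UNITS = {"re", "cog", "niz", "er", "un", "s"}

print(segment("recognizer", UNITS))  # covered by the units
print(segment("unrecogs", UNITS))    # a non-existent word is covered too
print(segment("speech", UNITS))      # not coverable with these units
```

The second call shows the caveat the abstract mentions: a subword inventory also admits strings that are not real words, so a real system would rely on the language model and search procedure to constrain them.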
Related papers
- Short-Term Word-Learning in a Dynamically Changing Environment [63.025297637716534]
We show how to supplement an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.
We demonstrate significant improvements in the detection rate of new words with only a minor increase in false alarms.
arXiv Detail & Related papers (2022-03-29T10:05:39Z)
- Spell my name: keyword boosted speech recognition [25.931897154065663]
Uncommon words such as names and technical terminology are important for understanding conversations in context.
We propose a simple but powerful ASR decoding method that can better recognise these uncommon keywords.
The method boosts the probabilities of given keywords in a beam search based on acoustic model predictions.
We demonstrate the effectiveness of our method on the LibriSpeech test sets and also internal data of real-world conversations.
arXiv Detail & Related papers (2021-10-06T14:16:57Z) - A Comparison of Methods for OOV-word Recognition on a New Public Dataset [0.0]
We propose using the CommonVoice dataset to create test sets for languages with a high out-of-vocabulary ratio.
We then evaluate, within the context of a hybrid ASR system, how much better subword models are at recognizing OOVs.
We propose a new method for modifying a subword-based language model so as to better recognize OOV-words.
arXiv Detail & Related papers (2021-07-16T19:39:30Z) - LexSubCon: Integrating Knowledge from Lexical Resources into Contextual
Embeddings for Lexical Substitution [76.615287796753]
We introduce LexSubCon, an end-to-end lexical substitution framework based on contextual embedding models.
This is achieved by combining contextual information with knowledge from structured lexical resources.
Our experiments show that LexSubCon outperforms previous state-of-the-art methods on LS07 and CoInCo benchmark datasets.
arXiv Detail & Related papers (2021-07-11T21:25:56Z) - Instant One-Shot Word-Learning for Context-Specific Neural
Sequence-to-Sequence Speech Recognition [62.997667081978825]
We present an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.
In this paper we demonstrate that through this mechanism our system is able to recognize more than 85% of newly added words that it previously failed to recognize.
arXiv Detail & Related papers (2021-07-05T21:08:34Z) - Fake it Till You Make it: Self-Supervised Semantic Shifts for
Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z) - Accurate Word Representations with Universal Visual Guidance [55.71425503859685]
This paper proposes a visual representation method to explicitly enhance conventional word embedding with multiple-aspect senses from visual guidance.
We build a small-scale word-image dictionary from a multimodal seed dataset where each word corresponds to diverse related images.
Experiments on 12 natural language understanding and machine translation tasks further verify the effectiveness and the generalization capability of the proposed approach.
arXiv Detail & Related papers (2020-12-30T09:11:50Z) - SubICap: Towards Subword-informed Image Captioning [37.42085521950802]
We decompose words into smaller constituent units, 'subwords', and represent captions as a sequence of subwords instead of words.
Our captioning system improves various metric scores, with a training vocabulary size approximately 90% less than the baseline.
arXiv Detail & Related papers (2020-12-24T06:10:36Z)