A Discriminative Entity-Aware Language Model for Virtual Assistants
- URL: http://arxiv.org/abs/2106.11292v1
- Date: Mon, 21 Jun 2021 17:50:28 GMT
- Title: A Discriminative Entity-Aware Language Model for Virtual Assistants
- Authors: Mandana Saebi, Ernest Pusateri, Aaksha Meghawat, Christophe Van Gysel
- Abstract summary: High-quality automatic speech recognition (ASR) is essential for virtual assistants (VAs) to work well.
In this work, we start from the observation that many ASR errors on named entities are inconsistent with real-world knowledge.
We extend previous discriminative n-gram language modeling approaches to incorporate real-world knowledge from a Knowledge Graph.
- Score: 4.2854663014000876
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-quality automatic speech recognition (ASR) is essential for virtual
assistants (VAs) to work well. However, ASR often performs poorly on VA
requests containing named entities. In this work, we start from the observation
that many ASR errors on named entities are inconsistent with real-world
knowledge. We extend previous discriminative n-gram language modeling
approaches to incorporate real-world knowledge from a Knowledge Graph (KG),
using features that capture entity type-entity and entity-entity relationships.
We apply our model through an efficient lattice rescoring process, achieving
relative sentence error rate reductions of more than 25% on some synthesized
test sets covering less popular entities, with minimal degradation on a
uniformly sampled VA test set.
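The core idea lends itself to a small illustration. The sketch below is not the authors' implementation: it rescores a toy n-best list (standing in for the paper's lattice rescoring) using hand-set weights on the two feature types the abstract names, entity type-entity and entity-entity. The toy knowledge graph, the weights, and the "play <song> by <artist>" request shape are all illustrative assumptions.
```python
# Toy sketch, not the authors' implementation: rescore ASR hypotheses
# with knowledge-graph (KG) features. In the paper the feature weights
# are learned discriminatively and applied over full lattices.

# Toy KG as (head, relation, tail) triples, plus entity types.
KG = {("the beatles", "performs", "hey jude"),
      ("the beatles", "performs", "let it be")}
ENTITY_TYPES = {"the beatles": "artist", "hey jude": "song", "let it be": "song"}

W_TYPE_ENTITY = 0.5    # weight: entity has the type its slot expects
W_ENTITY_ENTITY = 1.0  # weight: entity pair is connected in the KG

def kg_score(artist: str, song: str) -> float:
    """Entity-aware feature score for a 'play <song> by <artist>' request."""
    score = 0.0
    if ENTITY_TYPES.get(artist) == "artist":   # entity type-entity feature
        score += W_TYPE_ENTITY
    if ENTITY_TYPES.get(song) == "song":
        score += W_TYPE_ENTITY
    if (artist, "performs", song) in KG:       # entity-entity feature
        score += W_ENTITY_ENTITY
    return score

def rescore(nbest):
    """nbest: list of (baseline_score, artist, song); pick the best one."""
    return max(nbest, key=lambda h: h[0] + kg_score(h[1], h[2]))

# The recognizer slightly prefers the acoustically plausible "hey dude",
# but the KG evidence flips the decision to the real song title.
nbest = [(-1.0, "the beatles", "hey dude"),
         (-1.2, "the beatles", "hey jude")]
print(rescore(nbest))  # -> (-1.2, 'the beatles', 'hey jude')
```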
Related papers
- Continuously Learning New Words in Automatic Speech Recognition [56.972851337263755]
We propose a self-supervised continual learning approach to recognize new words.
We use a memory-enhanced Automatic Speech Recognition model from previous work.
We show that with this approach, recognition performance on the new words improves as they occur more frequently.
arXiv Detail & Related papers (2024-01-09T10:39:17Z)
- Improved Contextual Recognition In Automatic Speech Recognition Systems By Semantic Lattice Rescoring [4.819085609772069]
We propose a novel approach for enhancing contextual recognition within ASR systems via semantic lattice processing.
Our solution combines Hidden Markov Model-Gaussian Mixture Model (HMM-GMM) systems with Deep Neural Network (DNN) models for better accuracy.
We demonstrate the effectiveness of our proposed framework on the LibriSpeech dataset with empirical analyses.
arXiv Detail & Related papers (2023-10-14T23:16:05Z)
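As a rough illustration of the lattice rescoring idea in the entry above (not the paper's HMM-GMM/DNN pipeline or its semantic scores), the sketch below runs a Viterbi-style best-path search over a toy word lattice, combining each arc's acoustic score with an external language-model score. The lattice, the unigram "LM", and the interpolation weight are assumptions.
```python
import math

# Toy unigram "LM"; the paper's semantic scores would take its place.
LM = {"play": 0.1, "hey": 0.05, "jude": 0.02, "dude": 0.001}

def lm_score(word: str) -> float:
    return math.log(LM.get(word, 1e-6))

# Lattice arcs: (from_node, to_node, word, acoustic_log_score).
# Node ids are topologically ordered, so sorting arcs by source node
# lets us sweep the DAG once, Viterbi-style.
ARCS = [(0, 1, "play", -0.5),
        (1, 2, "hey", -0.7),
        (2, 3, "dude", -1.0),   # acoustically slightly better
        (2, 3, "jude", -1.1)]   # wins after LM rescoring

def best_path(arcs, start=0, end=3, lm_weight=0.5):
    best = {start: (0.0, [])}   # node -> (score, best word sequence)
    for u, v, word, ac in sorted(arcs):
        if u not in best:
            continue
        score = best[u][0] + ac + lm_weight * lm_score(word)
        if v not in best or score > best[v][0]:
            best[v] = (score, best[u][1] + [word])
    return best[end]

print(best_path(ARCS))  # picks "play hey jude"
```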
- HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
LLMs with a reasonable prompt can, through their generative capability, even correct tokens that are missing from the N-best list.
arXiv Detail & Related papers (2023-09-27T14:44:10Z)
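A minimal sketch of the N-best correction setup this benchmark targets: format the hypotheses into a prompt and ask an LLM for a corrected transcription. The prompt wording and the `llm_complete` stub are placeholders, not the benchmark's actual interface.
```python
def build_prompt(nbest: list[str]) -> str:
    """Format an N-best list into an error-correction prompt."""
    hyps = "\n".join(f"{i + 1}. {h}" for i, h in enumerate(nbest))
    return ("Below are N-best hypotheses from a speech recognizer for one "
            "utterance. Combine evidence across them and output a corrected "
            "transcription.\n" + hyps + "\nCorrected transcription:")

def llm_complete(prompt: str) -> str:
    # Placeholder: plug in whichever LLM client you use.
    raise NotImplementedError

nbest = ["play hey dude by the beatles",
         "play hey jude by the beetles",
         "lay hey jude by the beatles"]
print(build_prompt(nbest))
# corrected = llm_complete(build_prompt(nbest))
```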
- Record Deduplication for Entity Distribution Modeling in ASR Transcripts [0.0]
We use record deduplication to retrieve 95% of misrecognized entities.
When used for contextual biasing, our method shows an estimated 5% relative word error rate reduction.
arXiv Detail & Related papers (2023-06-09T20:42:11Z)
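A hedged sketch of the general idea in the entry above: treat an ASR-transcribed entity string as a probable duplicate of its closest canonical catalog entry by string similarity. The catalog, threshold, and use of difflib are illustrative assumptions; the paper's matching signals may differ.
```python
from difflib import SequenceMatcher

CATALOG = ["Hey Jude", "Let It Be", "Yesterday"]  # canonical entity records

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def dedupe(asr_entity: str, threshold: float = 0.75):
    """Return the catalog record the ASR string most likely duplicates."""
    best = max(CATALOG, key=lambda c: similarity(asr_entity, c))
    return best if similarity(asr_entity, best) >= threshold else None

print(dedupe("hey dude"))   # -> 'Hey Jude': misrecognition recovered
print(dedupe("free bird"))  # -> None: no close catalog match
```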
- Space-Efficient Representation of Entity-centric Query Language Models [8.712427362992237]
We introduce a deterministic approximation to probabilistic grammars that avoids the explicit expansion of non-terminals at model creation time.
We obtain a 10% relative word error rate improvement on long-tail entity queries compared to a similarly-sized n-gram model.
arXiv Detail & Related papers (2022-06-29T19:59:50Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite being trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that identify oversensitivity- and overstability-inducing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
- Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data.
We propose a noise-robust learning scheme comprising a new loss function and a noisy-label removal step.
Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z)
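The entry above does not spell out its specific loss, so it is not reproduced here; as a stand-in from the same noise-robust family, the sketch below implements generalized cross entropy (GCE, Zhang & Sabuncu 2018), which stays bounded on confidently wrong (likely mislabeled) examples where standard cross entropy blows up.
```python
import math

def gce_loss(p: float, q: float = 0.7) -> float:
    """GCE: L_q(p) = (1 - p^q) / q, where p is the model's probability of
    the (possibly noisy) distant label. q -> 0 recovers cross entropy;
    q = 1 gives MAE, which is maximally noise-tolerant."""
    return (1.0 - p ** q) / q

def cross_entropy(p: float) -> float:
    return -math.log(p)

# On likely-mislabeled examples (low p), GCE grows far more slowly than
# cross entropy, so noisy distant labels dominate training less.
for p in (0.9, 0.1, 0.01):
    print(f"p={p:5}:  CE={cross_entropy(p):6.2f}  GCE={gce_loss(p):5.2f}")
```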
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 combinations of encoder architectures and linguistic features, trained on two datasets.
We find that the bias induced by the architecture and the inclusion of linguistic features are clearly expressed in the probing task performance.
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
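The probing methodology in the entry above is easy to sketch: freeze an encoder's representations and fit a simple classifier to predict a linguistic property; above-chance accuracy suggests the property is encoded. The random stand-in representations and the synthetic binary property below are assumptions, not the paper's data.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, dim = 200, 32
reps = rng.normal(size=(n, dim))        # frozen encoder outputs (stand-in)
labels = (reps[:, 0] > 0).astype(int)   # a property linearly encoded in dim 0

# Train the probe on one split, evaluate on a held-out split.
probe = LogisticRegression(max_iter=1000).fit(reps[:150], labels[:150])
print("probe accuracy:", probe.score(reps[150:], labels[150:]))
```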
- Joint Contextual Modeling for ASR Correction and Language Understanding [60.230013453699975]
We propose multi-task neural approaches to perform contextual language correction on ASR outputs jointly with language understanding (LU).
We show that the error rates of off-the-shelf ASR and downstream LU systems can be reduced significantly, by 14% relative, with joint models trained on small amounts of in-domain data.
arXiv Detail & Related papers (2020-01-28T22:09:25Z)