Adaptation Algorithms for Neural Network-Based Speech Recognition: An
Overview
- URL: http://arxiv.org/abs/2008.06580v2
- Date: Sun, 28 Feb 2021 19:41:45 GMT
- Title: Adaptation Algorithms for Neural Network-Based Speech Recognition: An
Overview
- Authors: Peter Bell, Joachim Fainberg, Ondrej Klejch, Jinyu Li, Steve Renals,
Pawel Swietojanski
- Abstract summary: We present a structured overview of adaptation algorithms for neural network-based speech recognition.
The overview characterizes adaptation algorithms as based on embeddings, model parameter adaptation, or data augmentation.
We present a meta-analysis of the performance of speech recognition adaptation algorithms, based on relative error rate reductions as reported in the literature.
- Score: 43.12352697785169
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a structured overview of adaptation algorithms for neural
network-based speech recognition, considering both hybrid hidden Markov model /
neural network systems and end-to-end neural network systems, with a focus on
speaker adaptation, domain adaptation, and accent adaptation. The overview
characterizes adaptation algorithms as based on embeddings, model parameter
adaptation, or data augmentation. We present a meta-analysis of the performance
of speech recognition adaptation algorithms, based on relative error rate
reductions as reported in the literature.
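To make the embedding-based family concrete, here is a minimal sketch in which an utterance-level speaker embedding (such as an i-vector or x-vector) is concatenated to every acoustic frame before the acoustic model. The model, layer sizes, and tensor shapes are illustrative assumptions, not any specific system from the overview.
```python
import torch
import torch.nn as nn

class EmbeddingAdaptedAcousticModel(nn.Module):
    """Acoustic model conditioned on a fixed-length speaker embedding,
    one of the three adaptation families the overview describes.
    All dimensions are illustrative assumptions."""
    def __init__(self, feat_dim=80, spk_dim=100, hidden=512, n_states=2000):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + spk_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_states),
        )

    def forward(self, feats, spk_emb):
        # feats: (batch, frames, feat_dim); spk_emb: (batch, spk_dim)
        # Broadcast the utterance-level embedding to every frame.
        emb = spk_emb.unsqueeze(1).expand(-1, feats.size(1), -1)
        return self.net(torch.cat([feats, emb], dim=-1))

model = EmbeddingAdaptedAcousticModel()
logits = model(torch.randn(4, 200, 80), torch.randn(4, 100))
print(logits.shape)  # torch.Size([4, 200, 2000])
```
The other two families would instead update a subset of the model's parameters on adaptation data, or augment the training data to cover the target speaker, domain, or accent.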
Related papers
- Neural Speech and Audio Coding [19.437080345021105]
The paper explores the integration of model-based and data-driven approaches within the realm of neural speech and audio coding systems.
It introduces a neural network-based signal enhancer designed to post-process existing codecs' output.
The paper examines the use of psychoacoustically calibrated loss functions to train end-to-end neural audio codecs.
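As a rough illustration of the post-processing idea, the sketch below applies a small convolutional network to a codec's decoded waveform and predicts a residual correction. The architecture and signal dimensions are assumptions for illustration, not the paper's enhancer.
```python
import torch
import torch.nn as nn

class PostFilter(nn.Module):
    """Toy neural enhancer applied after an existing codec's decoder.
    Layer sizes are illustrative assumptions."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(channels, 1, kernel_size=9, padding=4),
        )

    def forward(self, decoded):
        # decoded: (batch, 1, samples); predict a residual correction
        return decoded + self.body(decoded)

codec_output = torch.randn(2, 1, 16000)  # assumed 1 s of 16 kHz audio
enhanced = PostFilter()(codec_output)
```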
arXiv Detail & Related papers (2024-08-13T15:13:21Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
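One plausible reading of "computational graphs of parameters" is sketched below: an MLP's neurons become graph nodes and its weights become edge features, which a graph neural network could then process. The encoding details here are assumptions, not the paper's construction.
```python
import torch
import torch.nn as nn

def mlp_to_graph(mlp):
    # Nodes are neurons (indexed layer by layer); edge features are the
    # connecting weights. Biases and activations are omitted for brevity.
    layers = [m for m in mlp if isinstance(m, nn.Linear)]
    sizes = [layers[0].in_features] + [l.out_features for l in layers]
    offsets = [sum(sizes[:i]) for i in range(len(sizes))]
    edges, edge_feats = [], []
    for li, layer in enumerate(layers):
        w = layer.weight.detach()  # shape: (out_features, in_features)
        for o in range(w.size(0)):
            for i in range(w.size(1)):
                edges.append((offsets[li] + i, offsets[li + 1] + o))
                edge_feats.append(w[o, i].item())
    return edges, torch.tensor(edge_feats)

mlp = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
edges, feats = mlp_to_graph(mlp)
print(len(edges), feats.shape)  # 18 torch.Size([18])
```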
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
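The mechanism can be illustrated with a gradient hook that rescales a convolution kernel's gradients per spatial position. The scaling mask below is an arbitrary assumption; the paper derives its scalings in a principled way.
```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

# Assumed mask: emphasize the kernel centre over the borders. This only
# illustrates the mechanism of per-position gradient rescaling.
scale = torch.tensor([[0.5, 0.5, 0.5],
                      [0.5, 2.0, 0.5],
                      [0.5, 0.5, 0.5]])

# Broadcasts over the (out_channels, in_channels, 3, 3) weight gradient.
conv.weight.register_hook(lambda g: g * scale)

out = conv(torch.randn(1, 3, 16, 16)).sum()
out.backward()  # gradients now carry the spatial rescaling
```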
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
- Training neural networks with structured noise improves classification and generalization [0.0]
We show how adding structure to noisy training data can substantially improve the algorithm's performance.
We also prove that the so-called Hebbian Unlearning rule coincides with the training-with-noise algorithm when noise is maximal.
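For context, the classical setting is a Hopfield-style network. The sketch below shows Hebbian learning followed by one unlearning step (an anti-Hebbian update at a relaxed state), which per the summary coincides with training-with-noise in the maximal-noise limit; the network size and learning rate are illustrative assumptions.
```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 100, 5
patterns = rng.choice([-1, 1], size=(P, N))

# Hebbian learning of the stored patterns.
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0.0)

# One unlearning step: relax from a random state toward an attractor,
# then apply an anti-Hebbian update there.
s = rng.choice([-1, 1], size=N).astype(float)
for _ in range(50):  # synchronous relaxation
    s = np.sign(W @ s + 1e-12)
eps = 0.01 / N
W -= eps * np.outer(s, s)
np.fill_diagonal(W, 0.0)
```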
arXiv Detail & Related papers (2023-02-26T22:10:23Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
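The contrast the paper draws can be mimicked with a toy data model: locally correlated inputs that are either Gaussian or made non-Gaussian by a pointwise nonlinearity. The covariance and nonlinearity below are assumptions for illustration, not the paper's exact data models.
```python
import numpy as np

rng = np.random.default_rng(1)
D = 64
# Covariance with local correlations (exponential decay, length scale 4).
idx = np.arange(D)
C = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 4.0)
L = np.linalg.cholesky(C + 1e-6 * np.eye(D))

def sample(n, gaussian=True):
    z = L @ rng.standard_normal((D, n))  # locally correlated Gaussian inputs
    # A pointwise nonlinearity makes the inputs non-Gaussian while keeping
    # their local correlation structure.
    return z if gaussian else np.sign(z)

x_gauss = sample(1000)            # control: Gaussian inputs
x_nongauss = sample(1000, False)  # non-Gaussian, higher-order local structure
```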
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Full-Reference Speech Quality Estimation with Attentional Siamese Neural Networks [0.0]
We present a full-reference speech quality prediction model with a deep learning approach.
The model determines a feature representation of the reference and the degraded signal through a siamese recurrent convolutional network.
The resulting features are then used to align the signals with an attention mechanism and are finally combined to estimate the overall speech quality.
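A minimal sketch of the described pipeline: a shared (siamese) encoder for both signals, attention-based alignment of reference frames to degraded frames, and a pooled quality regressor. The paper uses a recurrent convolutional encoder; this sketch substitutes a plain LSTM, and all sizes are assumptions.
```python
import torch
import torch.nn as nn

class SiameseQualityModel(nn.Module):
    """Sketch: shared encoder, attention alignment, quality head.
    All layer sizes are illustrative assumptions."""
    def __init__(self, n_mels=40, hidden=64):
        super().__init__()
        self.encoder = nn.LSTM(n_mels, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.head = nn.Sequential(nn.Linear(2 * hidden, 64), nn.ReLU(),
                                  nn.Linear(64, 1))

    def forward(self, ref, deg):
        # ref, deg: (batch, frames, n_mels); one encoder for both (siamese)
        r, _ = self.encoder(ref)
        d, _ = self.encoder(deg)
        # Align reference frames to degraded frames with attention.
        aligned, _ = self.attn(query=d, key=r, value=r)
        pooled = torch.cat([d, aligned], dim=-1).mean(dim=1)
        return self.head(pooled)  # predicted overall quality score

score = SiameseQualityModel()(torch.randn(2, 120, 40), torch.randn(2, 130, 40))
```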
arXiv Detail & Related papers (2021-05-03T12:38:25Z)
- Sparse Mixture of Local Experts for Efficient Speech Enhancement [19.645016575334786]
We investigate a deep learning approach for speech denoising through an efficient ensemble of specialist neural networks.
By splitting up the speech denoising task into non-overlapping subproblems, we are able to improve denoising performance while also reducing computational complexity.
Our findings demonstrate that a fine-tuned ensemble network is able to exceed the speech denoising capabilities of a generalist network.
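The ensemble idea can be sketched as hard routing: a gating network picks one specialist per input, so only that expert runs. The expert and gate architectures below are assumptions, and the paper's split into subproblems may differ.
```python
import torch
import torch.nn as nn

class SparseLocalExperts(nn.Module):
    """Sketch of a sparse ensemble of specialist denoisers.
    Expert and gate architectures are illustrative assumptions."""
    def __init__(self, feat_dim=257, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(feat_dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                          nn.Linear(256, feat_dim), nn.Sigmoid())
            for _ in range(n_experts))

    def forward(self, noisy):
        # noisy: (batch, feat_dim) spectral frames
        choice = self.gate(noisy).argmax(dim=-1)  # hard, sparse routing
        mask = torch.stack([self.experts[int(c)](x)
                            for x, c in zip(noisy, choice)])
        return noisy * mask                       # masked enhancement

out = SparseLocalExperts()(torch.rand(8, 257))
```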
arXiv Detail & Related papers (2020-05-16T23:23:22Z)
- AutoSpeech: Neural Architecture Search for Speaker Recognition [108.69505815793028]
We propose the first neural architecture search approach for speaker recognition tasks, named AutoSpeech.
Our algorithm first identifies the optimal operation combination in a neural cell and then derives a CNN model by stacking the neural cell multiple times.
Results demonstrate that the derived CNN architectures significantly outperform current speaker recognition systems based on VGG-M, ResNet-18, and ResNet-34 backbones, while enjoying lower model complexity.
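The second stage, deriving a CNN by stacking the discovered cell, might look like the sketch below. The fixed cell stands in for whatever operation combination the search finds, and the channel count and speaker count (VoxCeleb1's 1,251 speakers is assumed here) are illustrative.
```python
import torch
import torch.nn as nn

class Cell(nn.Module):
    """Stand-in for a searched cell; the real cell's operation
    combination comes from the search, this one is fixed for illustration."""
    def __init__(self, channels):
        super().__init__()
        self.op = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU())

    def forward(self, x):
        return x + self.op(x)  # residual combination of the cell's ops

def derive_cnn(channels=32, n_cells=8, n_speakers=1251):
    """Stack the discovered cell repeatedly to obtain the final CNN."""
    return nn.Sequential(
        nn.Conv2d(1, channels, 3, padding=1),
        *[Cell(channels) for _ in range(n_cells)],
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(channels, n_speakers))

logits = derive_cnn()(torch.randn(2, 1, 64, 64))  # spectrogram-like input
```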
arXiv Detail & Related papers (2020-05-07T02:53:47Z)
- Parallelization Techniques for Verifying Neural Networks [52.917845265248744]
We introduce an algorithm that iteratively splits the verification problem and explore two partitioning strategies.
We also introduce a highly parallelizable pre-processing algorithm that uses the neuron activation phases to simplify the neural network verification problems.
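An input-domain partitioning scheme in this spirit can be sketched as recursive bisection with a cheap per-subproblem bound. The interval bound propagation below stands in for the actual verifier, and the checked property is an arbitrary assumption; the sub-boxes could be verified in parallel.
```python
import numpy as np

def interval_bounds(W1, b1, W2, b2, lo, hi):
    """Crude interval bound propagation through a 2-layer ReLU net,
    standing in for the per-subproblem verifier."""
    c, r = (lo + hi) / 2, (hi - lo) / 2
    c1, r1 = W1 @ c + b1, np.abs(W1) @ r
    lo1, hi1 = np.maximum(c1 - r1, 0), np.maximum(c1 + r1, 0)  # ReLU bounds
    c2, r2 = (lo1 + hi1) / 2, (hi1 - lo1) / 2
    return W2 @ c2 + b2 - np.abs(W2) @ r2, W2 @ c2 + b2 + np.abs(W2) @ r2

def verify_by_splitting(W1, b1, W2, b2, lo, hi, depth=4):
    """Bisect the widest input dimension and verify sub-boxes
    independently (each branch is trivially parallelizable)."""
    out_lo, _ = interval_bounds(W1, b1, W2, b2, lo, hi)
    if out_lo[0] > 0:   # assumed property: first output stays positive
        return True
    if depth == 0:
        return False    # inconclusive at this granularity
    d = int(np.argmax(hi - lo))
    mid = (lo[d] + hi[d]) / 2
    lo2, hi1 = lo.copy(), hi.copy()
    hi1[d], lo2[d] = mid, mid
    return (verify_by_splitting(W1, b1, W2, b2, lo, hi1, depth - 1) and
            verify_by_splitting(W1, b1, W2, b2, lo2, hi, depth - 1))

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 4)), np.zeros(8)
W2, b2 = rng.standard_normal((2, 8)), np.array([5.0, 0.0])
print(verify_by_splitting(W1, b1, W2, b2, -0.1 * np.ones(4), 0.1 * np.ones(4)))
```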
arXiv Detail & Related papers (2020-04-17T20:21:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.