PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for
End-to-End ASR
- URL: http://arxiv.org/abs/2005.09824v1
- Date: Wed, 20 May 2020 02:10:21 GMT
- Title: PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for
End-to-End ASR
- Authors: Yiwen Shao, Yiming Wang, Daniel Povey, Sanjeev Khudanpur
- Abstract summary: PyChain is an implementation of end-to-end lattice-free maximum mutual information (LF-MMI) training for the so-called chain models in the Kaldi automatic speech recognition (ASR) toolkit.
Unlike other PyTorch and Kaldi based ASR toolkits, PyChain is designed to be as flexible and light-weight as possible.
- Score: 65.20342293605472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present PyChain, a fully parallelized PyTorch implementation of end-to-end
lattice-free maximum mutual information (LF-MMI) training for the so-called
\emph{chain models} in the Kaldi automatic speech recognition (ASR) toolkit.
Unlike other PyTorch and Kaldi based ASR toolkits, PyChain is designed to be as
flexible and light-weight as possible so that it can be easily plugged into new
ASR projects, or other existing PyTorch-based ASR tools, as exemplified
respectively by a new project PyChain-example, and Espresso, an existing
end-to-end ASR toolkit. PyChain's efficiency and flexibility are demonstrated
through such novel features as full GPU training on numerator/denominator
graphs, and support for unequal length sequences. Experiments on the WSJ
dataset show that with simple neural networks and commonly used machine
learning techniques, PyChain can achieve competitive results that are
comparable to Kaldi and better than other end-to-end ASR systems.
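The LF-MMI objective maximizes the log-probability of the numerator graph (constrained to the reference transcript) relative to the denominator graph (encoding all competing label sequences). A minimal pure-Python sketch of that per-utterance computation follows; the toy graphs and helper names are illustrative only, not PyChain's actual API:

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_logprob(graph, log_emissions):
    """Total log-probability of all paths through a small acceptor graph.

    graph: (arcs, n_states, start_state, final_states), where each arc is
           (src, dst, pdf_id); log_emissions[t][pdf_id] is the network's
           frame-level log-likelihood for pdf_id at frame t.
    """
    arcs, n_states, start, finals = graph
    alpha = [-math.inf] * n_states
    alpha[start] = 0.0
    for frame in log_emissions:
        new_alpha = [-math.inf] * n_states
        for src, dst, pdf in arcs:
            if alpha[src] > -math.inf:
                new_alpha[dst] = logsumexp([new_alpha[dst],
                                            alpha[src] + frame[pdf]])
        alpha = new_alpha
    return logsumexp([alpha[s] for s in finals])

def lfmmi_loss(num_graph, den_graph, log_emissions):
    """LF-MMI loss for one utterance: -(log p_numerator - log p_denominator)."""
    return -(forward_logprob(num_graph, log_emissions)
             - forward_logprob(den_graph, log_emissions))

# Toy setup: 2 pdf-ids, 2 frames of (already-log) network outputs.
log_emissions = [[math.log(0.7), math.log(0.3)],
                 [math.log(0.6), math.log(0.4)]]
# Numerator graph: a linear chain forcing the pdf sequence (0, 0).
num_graph = ([(0, 1, 0), (1, 2, 0)], 3, 0, [2])
# Denominator graph: one state with self-loops, allowing any pdf at any frame.
den_graph = ([(0, 0, 0), (0, 0, 1)], 1, 0, [0])
loss = lfmmi_loss(num_graph, den_graph, log_emissions)  # -log(0.42) ≈ 0.8675
```

In PyChain the analogous forward computation runs batched on the GPU over sparse numerator and denominator graphs (including unequal-length sequences in one batch); this scalar sketch only illustrates the objective being optimized.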
Related papers
- shapiq: Shapley Interactions for Machine Learning [21.939393765684827]
We introduce shapiq, an open-source Python package that unifies state-of-the-art algorithms to efficiently compute Shapley Values (SVs) and Shapley Interactions (SIs).
For practitioners, shapiq is able to explain and visualize any-order feature interactions in predictions of models, including vision transformers, language models, as well as XGBoost and LightGBM with TreeShap-IQ.
arXiv Detail & Related papers (2024-10-02T15:16:53Z)
- MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer [66.71930982549028]
Vision-Language Transformers (VLTs) have shown great success recently, but are accompanied by heavy computation costs.
We propose a novel framework named Multimodal Alignment-Guided Dynamic Token Pruning (MADTP) for accelerating various VLTs.
arXiv Detail & Related papers (2024-03-05T14:13:50Z) - Sig-Networks Toolkit: Signature Networks for Longitudinal Language
Modelling [14.619019557308807]
We present an open-source, pip installable toolkit, Sig-Networks, for longitudinal language modelling.
A central focus is the incorporation of Signature-based Neural Network models, which have recently shown success in temporal tasks.
We release the Toolkit as a PyTorch package with an introductory video, Git repositories for preprocessing and modelling including sample notebooks on the modeled NLP tasks.
arXiv Detail & Related papers (2023-12-06T14:34:30Z) - PyPOTS: A Python Toolbox for Data Mining on Partially-Observed Time
Series [0.0]
PyPOTS is an open-source Python library dedicated to data mining and analysis on partially-observed time series.
It provides easy access to diverse algorithms categorized into four tasks: imputation, classification, clustering, and forecasting.
arXiv Detail & Related papers (2023-05-30T07:57:05Z) - Interpretable Machine Learning for Science with PySR and
SymbolicRegression.jl [0.0]
PySR is an open-source library for practical symbolic regression.
It is built on a high-performance distributed back-end, a flexible search algorithm, and interfaces with several deep learning packages.
In describing this software, we also introduce a new benchmark, "EmpiricalBench," to quantify the applicability of symbolic regression algorithms in science.
arXiv Detail & Related papers (2023-05-02T16:31:35Z) - Continual Inference: A Library for Efficient Online Inference with Deep
Neural Networks in PyTorch [97.03321382630975]
Continual Inference is a Python library for implementing Continual Inference Networks (CINs) in PyTorch.
We offer a comprehensive introduction to CINs and their implementation in practice, and provide best-practices and code examples for composing complex modules for modern Deep Learning.
arXiv Detail & Related papers (2022-04-07T13:03:09Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Existing approaches, however, do not supply the procedures and pipelines needed to actually deploy machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor frameworks and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z) - ExKaldi-RT: A Real-Time Automatic Speech Recognition Extension Toolkit
of Kaldi [7.9019242334556745]
This paper describes ExKaldi-RT, an online ASR toolkit built on Kaldi and implemented in Python.
ExKaldi-RT provides tools for building a real-time audio stream pipeline, extracting acoustic features, transmitting packets over a remote connection, estimating acoustic probabilities with a neural network, and performing online decoding.
arXiv Detail & Related papers (2021-04-03T12:16:19Z) - PyHealth: A Python Library for Health Predictive Models [53.848478115284195]
PyHealth is an open-source Python toolbox for developing various predictive models on healthcare data.
The data preprocessing module enables the transformation of complex healthcare datasets into machine learning friendly formats.
The predictive modeling module provides more than 30 machine learning models, including established ensemble trees and deep neural network-based approaches.
arXiv Detail & Related papers (2021-01-11T22:02:08Z) - Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic
Circuits [99.59941892183454]
We propose Einsum Networks (EiNets), a novel implementation design for PCs.
At their core, EiNets combine a large number of arithmetic operations in a single monolithic einsum-operation.
We show that the implementation of Expectation-Maximization (EM) can be simplified for PCs, by leveraging automatic differentiation.
arXiv Detail & Related papers (2020-04-13T23:09:15Z)
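EiNets' central trick, evaluating many sum and product nodes of a probabilistic circuit (PC) as one monolithic contraction, can be sketched in plain Python, with explicit loops standing in for the single fused einsum; the circuit structure and numbers below are illustrative only:

```python
from itertools import product

def circuit_prob(weights, leaf_probs, assignment):
    """Evaluate a tiny probabilistic circuit: one sum (mixture) node over
    product nodes of independent leaf distributions,

        p(x) = sum_k weights[k] * prod_d leaf_probs[k][d][x[d]].

    In an EiNet this whole sum-product contraction is executed as a single
    einsum on the GPU; the loops here just make the structure explicit.
    """
    total = 0.0
    for k, w in enumerate(weights):
        term = w
        for d, x in enumerate(assignment):
            term *= leaf_probs[k][d][x]
        total += term
    return total

# Two mixture components over two binary variables (illustrative tables).
weights = [0.4, 0.6]
leaf_probs = [
    [[0.9, 0.1], [0.2, 0.8]],  # component 0: Bernoulli tables for x0, x1
    [[0.3, 0.7], [0.5, 0.5]],  # component 1
]
# A smooth, decomposable circuit is tractably normalized:
total_mass = sum(circuit_prob(weights, leaf_probs, a)
                 for a in product([0, 1], repeat=2))  # ≈ 1.0
```

The tractability of this exact summation over all assignments is what lets EM updates be driven directly by automatic differentiation, as the paper describes.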
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.