LAST: Scalable Lattice-Based Speech Modelling in JAX
- URL: http://arxiv.org/abs/2304.13134v1
- Date: Tue, 25 Apr 2023 20:25:37 GMT
- Title: LAST: Scalable Lattice-Based Speech Modelling in JAX
- Authors: Ke Wu, Ehsan Variani, Tom Bagby, Michael Riley
- Abstract summary: We introduce LAST, a LAttice-based Speech Transducer library in JAX.
LAST implements differentiable weighted finite state automaton (WFSA) algorithms needed for training and inference that scale to a large WFSA, such as a recognition lattice over an entire utterance.
We describe a suite of generally applicable techniques employed in LAST to address the resulting scalability challenges, and demonstrate their effectiveness with benchmarks on TPUv3 and V100 GPU.
- Score: 11.682949982063477
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce LAST, a LAttice-based Speech Transducer library in JAX. With an
emphasis on flexibility, ease-of-use, and scalability, LAST implements
differentiable weighted finite state automaton (WFSA) algorithms needed for
training & inference that scale to a large WFSA such as a recognition lattice
over the entire utterance. Despite these WFSA algorithms being well-known in
the literature, new challenges arise from performance characteristics of modern
architectures, and from nuances in automatic differentiation. We describe a
suite of generally applicable techniques employed in LAST to address these
challenges, and demonstrate their effectiveness with benchmarks on TPUv3 and
V100 GPU.
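As a rough illustration of the kind of primitive such a library is built on (a hypothetical sketch, not the actual LAST API), the following JAX snippet computes a differentiable log-semiring shortest distance (forward score) over a dense lattice; its gradient with respect to the arc weights yields the expected arc posteriors that lattice-based training objectives need.

```python
# Hypothetical sketch (not the actual LAST API): a differentiable
# log-semiring shortest distance ("forward" score) over a dense
# T-step lattice with S states.
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

def forward_log_weight(arc_weights, init, final):
    """Log total weight of all paths through a dense lattice.

    arc_weights: [T, S, S] log weight of the arc i -> j at each step
    init:        [S] initial state log weights
    final:       [S] final state log weights
    """
    def step(alpha, w_t):
        # alpha'[j] = logsumexp_i(alpha[i] + w_t[i, j])
        return logsumexp(alpha[:, None] + w_t, axis=0), None
    alpha, _ = jax.lax.scan(step, init, arc_weights)
    return logsumexp(alpha + final)

# Automatic differentiation of the forward score gives the expected
# arc posteriors, the quantity lattice-based training relies on.
arc_posteriors = jax.grad(forward_log_weight)
```

Because the computation is expressed with `scan` and `logsumexp`, it JIT-compiles and batches with `vmap` on TPU/GPU; the techniques described in the paper address the further numerical and memory issues that arise at full-lattice scale.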
Related papers
- IDEAL: Leveraging Infinite and Dynamic Characterizations of Large Language Models for Query-focused Summarization [59.06663981902496]
Query-focused summarization (QFS) aims to produce summaries that answer particular questions of interest, enabling greater user control and personalization.
We investigate two indispensable characteristics that LLM-based QFS models should harness: Lengthy Document Summarization and Efficiently Fine-grained Query-LLM Alignment.
These innovations pave the way for broader application and accessibility in the field of QFS technology.
arXiv Detail & Related papers (2024-07-15T07:14:56Z) - Structural Pruning of Pre-trained Language Models via Neural Architecture Search [7.833790713816726]
Pre-trained language models (PLMs) mark the state of the art for natural language understanding tasks when fine-tuned on labeled data.
This paper explores neural architecture search (NAS) for structural pruning to find sub-parts of the fine-tuned network that offer an optimal efficiency trade-off.
arXiv Detail & Related papers (2024-05-03T17:34:57Z) - Slax: A Composable JAX Library for Rapid and Flexible Prototyping of Spiking Neural Networks [0.19427883580687189]
We introduce Slax, a JAX-based library designed to accelerate SNN algorithm design.
Slax provides optimized implementations of diverse training algorithms, allowing direct performance comparison.
arXiv Detail & Related papers (2024-04-08T18:15:13Z) - ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching [9.884452250478216]
We propose ALISA, a novel algorithm-system co-design solution to address the challenges imposed by KV caching.
On the algorithm level, ALISA prioritizes tokens that are most important in generating a new token via a Sparse Window Attention (SWA) algorithm.
On the system level, ALISA employs three-phase token-level dynamic scheduling and optimizes the trade-off between caching and recomputation.
arXiv Detail & Related papers (2024-03-26T01:46:34Z) - A General Framework for Learning from Weak Supervision [93.89870459388185]
This paper introduces a general framework for learning from weak supervision (GLWS) with a novel algorithm.
Central to GLWS is an Expectation-Maximization (EM) formulation, adeptly accommodating various weak supervision sources.
We also present an advanced algorithm that significantly simplifies the EM computational demands.
arXiv Detail & Related papers (2024-02-02T21:48:50Z) - Global Knowledge Calibration for Fast Open-Vocabulary Segmentation [124.74256749281625]
We introduce a text diversification strategy that generates a set of synonyms for each training category.
We also employ a text-guided knowledge distillation method to preserve the generalizable knowledge of CLIP.
Our proposed model achieves robust generalization performance across various datasets.
arXiv Detail & Related papers (2023-03-16T09:51:41Z) - Gradient Backpropagation based Feature Attribution to Enable Explainable-AI on the Edge [1.7338677787507768]
In this work, we analyze the dataflow of gradient backpropagation based feature attribution algorithms to determine the resource overhead required over inference.
We develop a High-Level Synthesis (HLS) based FPGA design that is targeted for edge devices and supports three feature attribution algorithms.
Our design methodology demonstrates a pathway to repurpose inference accelerators to support feature attribution with minimal overhead, thereby enabling real-time XAI on the edge.
arXiv Detail & Related papers (2022-10-19T22:58:59Z) - SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition [49.42625022146008]
We present the advantages of applying SRU++ to ASR tasks by comparing it with Conformer across multiple ASR benchmarks.
Specifically, our analysis shows that SRU++ can surpass Conformer on long-form speech input by a large margin.
arXiv Detail & Related papers (2021-10-11T19:23:50Z) - Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in the presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
arXiv Detail & Related papers (2021-03-01T21:14:33Z) - Prior Guided Feature Enrichment Network for Few-Shot Segmentation [64.91560451900125]
State-of-the-art semantic segmentation methods require sufficient labeled data to achieve good results.
Few-shot segmentation is proposed to tackle this problem by learning a model that quickly adapts to new classes with a few labeled support samples.
These frameworks still face the challenge of reduced generalization ability on unseen classes due to inappropriate use of high-level semantic information.
arXiv Detail & Related papers (2020-08-04T10:41:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.