A Token-Wise Beam Search Algorithm for RNN-T
- URL: http://arxiv.org/abs/2302.14357v2
- Date: Thu, 5 Oct 2023 22:34:13 GMT
- Title: A Token-Wise Beam Search Algorithm for RNN-T
- Authors: Gil Keren
- Abstract summary: We present a decoding beam search algorithm that batches the joint network calls across a segment of time steps.
In addition, aggregating emission probabilities over a segment may be seen as a better approximation to finding the most likely model output.
- Score: 3.682821163882332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Standard Recurrent Neural Network Transducers (RNN-T) decoding algorithms for
speech recognition are iterating over the time axis, such that one time step is
decoded before moving on to the next time step. Those algorithms result in a
large number of calls to the joint network, which were shown in previous work
to be an important factor that reduces decoding speed. We present a decoding
beam search algorithm that batches the joint network calls across a segment of
time steps, which results in 20%-96% decoding speedups consistently across all
models and settings experimented with. In addition, aggregating emission
probabilities over a segment may be seen as a better approximation to finding
the most likely model output, causing our algorithm to improve oracle word
error rate by up to 11% relative as the segment size increases, and to slightly
improve general word error rate.
Related papers
- Unfolded proximal neural networks for robust image Gaussian denoising [7.018591019975253]
We propose a unified framework to build PNNs for the Gaussian denoising task, based on both the dual-FB and the primal-dual Chambolle-Pock algorithms.
We also show that accelerated versions of these algorithms enable skip connections in the associated NN layers.
arXiv Detail & Related papers (2023-08-06T15:32:16Z) - Efficient k-NN Search with Cross-Encoders using Adaptive Multi-Round CUR
Decomposition [77.4863142882136]
Cross-encoder models are prohibitively expensive for direct k-nearest neighbor (k-NN) search.
We propose ADACUR, a method that adaptively, iteratively, and efficiently minimizes the approximation error for the practically important top-k neighbors.
arXiv Detail & Related papers (2023-05-04T17:01:17Z) - Rapid Person Re-Identification via Sub-space Consistency Regularization [51.76876061721556]
Person Re-Identification (ReID) matches pedestrians across disjoint cameras.
Existing ReID methods adopting real-value feature descriptors have achieved high accuracy, but they are low in efficiency due to the slow Euclidean distance computation.
We propose a novel Sub-space Consistency Regularization (SCR) algorithm that can speed up the ReID procedure by 0.25$ times.
arXiv Detail & Related papers (2022-07-13T02:44:05Z) - OMPQ: Orthogonal Mixed Precision Quantization [64.59700856607017]
Mixed precision quantization takes advantage of hardware's multiple bit-width arithmetic operations to unleash the full potential of network quantization.
We propose to optimize a proxy metric, the concept of networkity, which is highly correlated with the loss of the integer programming.
This approach reduces the search time and required data amount by orders of magnitude, with little compromise on quantization accuracy.
arXiv Detail & Related papers (2021-09-16T10:59:33Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z) - Learning to Accelerate Heuristic Searching for Large-Scale Maximum
Weighted b-Matching Problems in Online Advertising [51.97494906131859]
Bipartite b-matching is fundamental in algorithm design, and has been widely applied into economic markets, labor markets, etc.
Existing exact and approximate algorithms usually fail in such settings due to either requiring intolerable running time or too much computation resource.
We propose textttNeuSearcher which leverages the knowledge learned from previously instances to solve new problem instances.
arXiv Detail & Related papers (2020-05-09T02:48:23Z) - Channel Assignment in Uplink Wireless Communication using Machine
Learning Approach [54.012791474906514]
This letter investigates a channel assignment problem in uplink wireless communication systems.
Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints.
Due to high computational complexity, machine learning approaches are employed to obtain computational efficient solutions.
arXiv Detail & Related papers (2020-01-12T15:54:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.