Efficient Encoders for Streaming Sequence Tagging
- URL: http://arxiv.org/abs/2301.09244v1
- Date: Mon, 23 Jan 2023 02:20:39 GMT
- Title: Efficient Encoders for Streaming Sequence Tagging
- Authors: Ayush Kaushal, Aditya Gupta, Shyam Upadhyay, Manaal Faruqui
- Abstract summary: A naive application of state-of-the-art bidirectional encoders for streaming sequence tagging would require encoding each token from scratch for each new token in an incremental streaming input (like transcribed speech)
The lack of re-usability of previous computation leads to a higher number of Floating Point Operations (or FLOPs) and higher number of unnecessary label flips.
We present a Hybrid with Adaptive Restart (HEAR) that addresses these issues while maintaining the performance of bidirectional encoders over the offline (or complete) inputs.
- Score: 13.692806815196077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A naive application of state-of-the-art bidirectional encoders for streaming
sequence tagging would require encoding each token from scratch for each new
token in an incremental streaming input (like transcribed speech). The lack of
re-usability of previous computation leads to a higher number of Floating Point
Operations (or FLOPs) and higher number of unnecessary label flips. Increased
FLOPs consequently lead to higher wall-clock time and increased label flipping
leads to poorer streaming performance. In this work, we present a Hybrid
Encoder with Adaptive Restart (HEAR) that addresses these issues while
maintaining the performance of bidirectional encoders over the offline (or
complete) inputs while improving performance on streaming (or incomplete)
inputs. HEAR has a Hybrid unidirectional-bidirectional encoder architecture to
perform sequence tagging, along with an Adaptive Restart Module (ARM) to
selectively guide the restart of bidirectional portion of the encoder. Across
four sequence tagging tasks, HEAR offers FLOP savings in streaming settings
upto 71.1% and also outperforms bidirectional encoders for streaming
predictions by upto +10% streaming exact match.
Related papers
- Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks [53.550782959908524]
We introduce a new configuration for encoder-decoder models that improves efficiency on structured output and decomposable tasks.
Our method, prompt-in-decoder (PiD), encodes the input once and decodes the output in parallel, boosting both training and inference efficiency.
arXiv Detail & Related papers (2024-03-19T19:27:23Z) - Streaming Sequence Transduction through Dynamic Compression [55.0083843520833]
We introduce STAR (Stream Transduction with Anchor Representations), a novel Transformer-based model designed for efficient sequence-to-sequence transduction over streams.
STAR dynamically segments input streams to create compressed anchor representations, achieving nearly lossless compression (12x) in Automatic Speech Recognition (ASR)
STAR demonstrates superior segmentation and latency-quality trade-offs in simultaneous speech-to-text tasks, optimizing latency, memory footprint, and quality.
arXiv Detail & Related papers (2024-02-02T06:31:50Z) - DEED: Dynamic Early Exit on Decoder for Accelerating Encoder-Decoder
Transformer Models [22.276574156358084]
We build a multi-exit encoder-decoder transformer model which is trained with deep supervision so that each of its decoder layers is capable of generating plausible predictions.
We show our approach can reduce overall inference latency by 30%-60% with comparable or even higher accuracy compared to baselines.
arXiv Detail & Related papers (2023-11-15T01:01:02Z) - Streaming parallel transducer beam search with fast-slow cascaded
encoders [23.416682253435837]
Streaming and non-streaming ASR for RNN Transducers can be unified by cascading causal and non-causal encoders.
We propose a novel parallel time-synchronous beam search algorithm for transducers that decodes from fast-slow encoders.
arXiv Detail & Related papers (2022-03-29T17:29:39Z) - Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input [54.82369261350497]
We propose a CTC-enhanced NAR transformer, which generates target sequence by refining predictions of the CTC module.
Experimental results show that our method outperforms all previous NAR counterparts and achieves 50x faster decoding speed than a strong AR baseline with only 0.0 0.3 absolute CER degradation on Aishell-1 and Aishell-2 datasets.
arXiv Detail & Related papers (2020-10-28T15:00:09Z) - Cascaded encoders for unifying streaming and non-streaming ASR [68.62941009369125]
This work presents cascaded encoders for building a single E2E ASR model that can operate in both these modes simultaneously.
A single decoder then learns to decode either using the output of the streaming or the non-streaming encoder.
Results show that this model achieves similar word error rates (WER) as a standalone streaming model when operating in streaming mode, and obtains 10% -- 27% relative improvement when operating in non-streaming mode.
arXiv Detail & Related papers (2020-10-27T20:59:50Z) - Fast Interleaved Bidirectional Sequence Generation [90.58793284654692]
We introduce a decoder that generates target words from the left-to-right and right-to-left directions simultaneously.
We show that we can easily convert a standard architecture for unidirectional decoding into a bidirectional decoder.
Our interleaved bidirectional decoder (IBDecoder) retains the model simplicity and training efficiency of the standard Transformer.
arXiv Detail & Related papers (2020-10-27T17:38:51Z) - On Sparsifying Encoder Outputs in Sequence-to-Sequence Models [90.58793284654692]
We take Transformer as the testbed and introduce a layer of gates in-between the encoder and the decoder.
The gates are regularized using the expected value of the sparsity-inducing L0penalty.
We investigate the effects of this sparsification on two machine translation and two summarization tasks.
arXiv Detail & Related papers (2020-04-24T16:57:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.