Recurrent autoencoder with sequence-aware encoding
- URL: http://arxiv.org/abs/2009.07349v3
- Date: Fri, 15 Jan 2021 14:22:43 GMT
- Title: Recurrent autoencoder with sequence-aware encoding
- Authors: Robert Susik
- Abstract summary: We propose an autoencoder architecture with sequence-aware encoding, which employs a 1D convolutional layer to improve its performance.
We show that the proposed solution outperforms the standard RAE and that the training process is an order of magnitude faster.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent Neural Networks (RNN) have received a vast amount of attention in the
last decade. Recently, the architectures of Recurrent AutoEncoders (RAE) have found many
applications in practice. An RAE can extract semantically valuable information, called
the context, which represents a latent space useful for further processing. Nevertheless,
recurrent autoencoders are hard to train, and the training process takes a long time.
In this paper, we propose an autoencoder architecture with sequence-aware encoding,
which employs a 1D convolutional layer to reduce model training time. We prove that the
recurrent autoencoder with sequence-aware encoding outperforms a standard RAE in terms of
training speed in most cases. The preliminary results show that the proposed solution
dominates the standard RAE and that the training process is an order of magnitude faster.
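A minimal sketch of the idea described above, assuming PyTorch: the encoder's hidden states for all time steps are compressed by a 1D convolution into a single context, which the decoder then unrolls into a reconstruction. The GRU cells and all layer sizes are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn

# Recurrent autoencoder whose context is produced by a 1D convolution over the
# whole sequence of encoder hidden states (sequence-aware encoding).
class ConvRecurrentAutoencoder(nn.Module):
    def __init__(self, n_features=1, hidden_size=64, context_size=32, seq_len=100):
        super().__init__()
        self.seq_len = seq_len
        self.encoder_rnn = nn.GRU(n_features, hidden_size, batch_first=True)
        # Convolve over the time axis of all encoder states instead of
        # keeping only the last hidden state.
        self.conv = nn.Conv1d(hidden_size, context_size, kernel_size=seq_len)
        self.decoder_rnn = nn.GRU(context_size, hidden_size, batch_first=True)
        self.output = nn.Linear(hidden_size, n_features)

    def forward(self, x):                              # x: (batch, seq_len, n_features)
        states, _ = self.encoder_rnn(x)                # (batch, seq_len, hidden_size)
        context = self.conv(states.transpose(1, 2))    # (batch, context_size, 1)
        context = context.transpose(1, 2)              # (batch, 1, context_size)
        dec_in = context.repeat(1, self.seq_len, 1)    # same context at every step
        dec_states, _ = self.decoder_rnn(dec_in)
        return self.output(dec_states)                 # reconstruction of the input

x = torch.randn(8, 100, 1)                             # toy batch of sequences
model = ConvRecurrentAutoencoder()
loss = nn.functional.mse_loss(model(x), x)
loss.backward()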
Related papers
- Hierarchical Attention Encoder Decoder [2.4366811507669115]
Autoregressive models can generate complex and novel sequences and have many real-world applications.
These models must generate outputs autoregressively, which becomes time-consuming when dealing with long sequences.
We propose a model based on the Hierarchical Recurrent Decoder architecture.
arXiv Detail & Related papers (2023-06-01T18:17:23Z) - CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z) - KRNet: Towards Efficient Knowledge Replay [50.315451023983805]
Knowledge replay techniques have been widely used in tasks such as continual learning and continuous domain adaptation.
We propose a novel and efficient knowledge recording network (KRNet) which directly maps an arbitrary sample identity number to the corresponding datum.
Our KRNet requires significantly less storage cost for the latent codes and can be trained without the encoder sub-network.
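A rough sketch of the core idea summarized above (not the paper's KRNet implementation): a small network maps a sample identity number directly to a latent code, so replay needs no encoder sub-network. All sizes below are illustrative assumptions.

import torch
import torch.nn as nn

class IdToLatent(nn.Module):
    def __init__(self, num_samples=10_000, id_dim=16, latent_dim=128):
        super().__init__()
        self.id_embedding = nn.Embedding(num_samples, id_dim)  # identity number -> vector
        self.mlp = nn.Sequential(
            nn.Linear(id_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, sample_ids):          # (batch,) integer identity numbers
        return self.mlp(self.id_embedding(sample_ids))

# Training regresses the stored latent codes from their identity numbers,
# so the codes themselves need not be kept around for replay.
net = IdToLatent()
ids = torch.randint(0, 10_000, (32,))
target_codes = torch.randn(32, 128)         # dummy codes standing in for real latents
loss = nn.functional.mse_loss(net(ids), target_codes)
loss.backward()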
arXiv Detail & Related papers (2022-05-23T08:34:17Z) - TCTN: A 3D-Temporal Convolutional Transformer Network for Spatiotemporal Predictive Learning [1.952097552284465]
We propose an algorithm named 3D-temporal convolutional transformer (TCTN), where a transformer-based encoder with temporal convolutional layers is employed to capture short-term and long-term dependencies.
Our proposed algorithm is easy to implement and can be trained much faster than RNN-based methods thanks to the parallel mechanism of the Transformer.
arXiv Detail & Related papers (2021-12-02T10:05:01Z) - Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates [59.678108707409606]
We propose Fast-MD, a fast MD model that generates hidden intermediates (HI) by non-autoregressive decoding based on connectionist temporal classification (CTC) outputs, followed by an ASR decoder.
Fast-MD achieved about 2x and 4x faster decoding than the naïve MD model on GPU and CPU, respectively, with comparable translation quality.
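To make the CTC step above concrete, here is standard CTC best-path decoding (collapse repeated labels, drop blanks), which is what allows the hidden intermediates to be produced non-autoregressively; this generic routine is only an illustration, not the Fast-MD model itself.

def ctc_greedy_decode(frame_label_ids, blank_id=0):
    # Collapse a per-frame argmax label sequence into an output label sequence.
    decoded, previous = [], None
    for label in frame_label_ids:
        if label != blank_id and label != previous:
            decoded.append(label)
        previous = label
    return decoded

# Frames [a, a, blank, a, b, b, blank] collapse to [a, a, b].
print(ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0]))   # [1, 1, 2]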
arXiv Detail & Related papers (2021-09-27T05:21:30Z) - Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring [83.32560748324667]
This article describes an efficient end-to-end speech translation (E2E-ST) framework based on non-autoregressive (NAR) models.
We propose a unified NAR E2E-ST framework called Orthros, which has an NAR decoder and an auxiliary shallow AR decoder on top of the shared encoder.
arXiv Detail & Related papers (2021-09-09T16:50:16Z) - Less is More: Pre-training a Strong Siamese Encoder Using a Weak Decoder [75.84152924972462]
Many real-world applications use Siamese networks to efficiently match text sequences at scale.
This paper pre-trains language models dedicated to sequence matching in Siamese architectures.
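A minimal sketch of the Siamese matching setup this summary refers to, with a toy bag-of-embeddings encoder standing in for the pre-trained language model: one shared encoder embeds both sequences and a similarity score compares them.

import torch
import torch.nn as nn

class SiameseMatcher(nn.Module):
    def __init__(self, vocab_size=30_000, dim=128):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab_size, dim)   # shared by both sides

    def forward(self, left_ids, right_ids):
        left, right = self.encoder(left_ids), self.encoder(right_ids)
        return nn.functional.cosine_similarity(left, right)

matcher = SiameseMatcher()
a = torch.randint(0, 30_000, (4, 12))   # batch of token-id sequences
b = torch.randint(0, 30_000, (4, 12))
print(matcher(a, b).shape)              # one similarity score per pair -> torch.Size([4])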
arXiv Detail & Related papers (2021-02-18T08:08:17Z) - Streaming automatic speech recognition with the transformer model [59.58318952000571]
We propose a transformer-based end-to-end system for streaming ASR.
We apply time-restricted self-attention for the encoder and triggered attention for the encoder-decoder attention mechanism.
Our proposed streaming transformer architecture achieves 2.8% and 7.2% WER for the "clean" and "other" test data of LibriSpeech.
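The time-restricted self-attention mentioned above can be sketched as an attention mask that limits each frame to a fixed left and right context; the context sizes below are arbitrary assumptions, not the paper's settings.

import torch

def time_restricted_mask(seq_len, left_context, right_context):
    # True marks the key positions a query frame is allowed to attend to.
    positions = torch.arange(seq_len)
    offset = positions[None, :] - positions[:, None]   # key index minus query index
    return (offset >= -left_context) & (offset <= right_context)

# Row t shows which frames frame t may attend to, e.g. row 3 -> frames 1..4.
print(time_restricted_mask(seq_len=6, left_context=2, right_context=1).int())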
arXiv Detail & Related papers (2020-01-08T18:58:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.