Recurrent autoencoder with sequence-aware encoding
- URL: http://arxiv.org/abs/2009.07349v3
- Date: Fri, 15 Jan 2021 14:22:43 GMT
- Title: Recurrent autoencoder with sequence-aware encoding
- Authors: Robert Susik
- Abstract summary: We propose an autoencoder architecture with sequence-aware encoding, which employs a 1D convolutional layer to improve its performance.
We show that the proposed solution outperforms the standard RAE and that the training process is an order of magnitude faster.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent Neural Networks (RNN) have received a vast amount of attention in the
last decade. Recently, the architectures of Recurrent AutoEncoders (RAE) have found many
applications in practice. An RAE can extract semantically valuable information, called
the context, which represents a latent space useful for further processing. Nevertheless,
recurrent autoencoders are hard to train, and the training process takes a long time.
In this paper, we propose an autoencoder architecture with sequence-aware encoding,
which employs a 1D convolutional layer to reduce model training time. We prove that the
recurrent autoencoder with sequence-aware encoding outperforms a standard RAE in terms of
training speed in most cases. The preliminary results show that the proposed solution
dominates the standard RAE and that the training process is an order of magnitude faster.
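A minimal sketch of the idea described above, assuming PyTorch: the encoder's hidden states for all time steps are compressed by a 1D convolution into a single context, which the decoder then unrolls into a reconstruction. The GRU cells and all layer sizes are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn

# Recurrent autoencoder whose context is produced by a 1D convolution over the
# whole sequence of encoder hidden states (sequence-aware encoding).
class ConvRecurrentAutoencoder(nn.Module):
    def __init__(self, n_features=1, hidden_size=64, context_size=32, seq_len=100):
        super().__init__()
        self.seq_len = seq_len
        self.encoder_rnn = nn.GRU(n_features, hidden_size, batch_first=True)
        # Convolve over the time axis of all encoder states instead of
        # keeping only the last hidden state.
        self.conv = nn.Conv1d(hidden_size, context_size, kernel_size=seq_len)
        self.decoder_rnn = nn.GRU(context_size, hidden_size, batch_first=True)
        self.output = nn.Linear(hidden_size, n_features)

    def forward(self, x):                              # x: (batch, seq_len, n_features)
        states, _ = self.encoder_rnn(x)                # (batch, seq_len, hidden_size)
        context = self.conv(states.transpose(1, 2))    # (batch, context_size, 1)
        context = context.transpose(1, 2)              # (batch, 1, context_size)
        dec_in = context.repeat(1, self.seq_len, 1)    # same context at every step
        dec_states, _ = self.decoder_rnn(dec_in)
        return self.output(dec_states)                 # reconstruction of the input

x = torch.randn(8, 100, 1)                             # toy batch of sequences
model = ConvRecurrentAutoencoder()
loss = nn.functional.mse_loss(model(x), x)
loss.backward()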
Related papers
- Hierarchical Attention Encoder Decoder [2.4366811507669115]
Autoregressive models can generate complex and novel sequences and have many real-world applications.
These models must generate outputs autoregressively, which becomes time-consuming when dealing with long sequences.
We propose a model based on the Hierarchical Recurrent Decoder architecture.
arXiv Detail & Related papers (2023-06-01T18:17:23Z) - CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z) - KRNet: Towards Efficient Knowledge Replay [50.315451023983805]
Knowledge replay techniques have been widely used in tasks such as continual learning and continuous domain adaptation.
We propose a novel and efficient knowledge recording network (KRNet) which directly maps an arbitrary sample identity number to the corresponding datum.
Our KRNet requires significantly less storage cost for the latent codes and can be trained without the encoder sub-network.
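A rough sketch of the core idea summarized above (not the paper's KRNet implementation): a small network maps a sample identity number directly to a latent code, so replay needs no encoder sub-network. All sizes below are illustrative assumptions.

import torch
import torch.nn as nn

class IdToLatent(nn.Module):
    def __init__(self, num_samples=10_000, id_dim=16, latent_dim=128):
        super().__init__()
        self.id_embedding = nn.Embedding(num_samples, id_dim)  # identity number -> vector
        self.mlp = nn.Sequential(
            nn.Linear(id_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, sample_ids):          # (batch,) integer identity numbers
        return self.mlp(self.id_embedding(sample_ids))

# Training regresses the stored latent codes from their identity numbers,
# so the codes themselves need not be kept around for replay.
net = IdToLatent()
ids = torch.randint(0, 10_000, (32,))
target_codes = torch.randn(32, 128)         # dummy codes standing in for real latents
loss = nn.functional.mse_loss(net(ids), target_codes)
loss.backward()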
arXiv Detail & Related papers (2022-05-23T08:34:17Z) - TCTN: A 3D-Temporal Convolutional Transformer Network for Spatiotemporal Predictive Learning [1.952097552284465]
We propose an algorithm named 3D-temporal convolutional transformer (TCTN), where a transformer-based encoder with temporal convolutional layers is employed to capture short-term and long-term dependencies.
Our proposed algorithm is easy to implement and can be trained much faster than RNN-based methods thanks to the parallel mechanism of the Transformer.
arXiv Detail & Related papers (2021-12-02T10:05:01Z) - Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates [59.678108707409606]
We propose Fast-MD, a fast MD model that generates hidden intermediates (HI) by non-autoregressive decoding based on connectionist temporal classification (CTC) outputs, followed by an ASR decoder.
Fast-MD achieved about 2x and 4x faster decoding than the naïve MD model on GPU and CPU, respectively, with comparable translation quality.
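To make the CTC step above concrete, here is standard CTC best-path decoding (collapse repeated labels, drop blanks), which is what allows the hidden intermediates to be produced non-autoregressively; this generic routine is only an illustration, not the Fast-MD model itself.

def ctc_greedy_decode(frame_label_ids, blank_id=0):
    # Collapse a per-frame argmax label sequence into an output label sequence.
    decoded, previous = [], None
    for label in frame_label_ids:
        if label != blank_id and label != previous:
            decoded.append(label)
        previous = label
    return decoded

# Frames [a, a, blank, a, b, b, blank] collapse to [a, a, b].
print(ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0]))   # [1, 1, 2]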
arXiv Detail & Related papers (2021-09-27T05:21:30Z) - Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring [83.32560748324667]
This article describes an efficient end-to-end speech translation (E2E-ST) framework based on non-autoregressive (NAR) models.
We propose a unified NAR E2E-ST framework called Orthros, which has an NAR decoder and an auxiliary shallow AR decoder on top of the shared encoder.
arXiv Detail & Related papers (2021-09-09T16:50:16Z) - Less is More: Pre-training a Strong Siamese Encoder Using a Weak Decoder [75.84152924972462]
Many real-world applications use Siamese networks to efficiently match text sequences at scale.
This paper pre-trains language models dedicated to sequence matching in Siamese architectures.
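A minimal sketch of the Siamese matching setup this summary refers to, with a toy bag-of-embeddings encoder standing in for the pre-trained language model: one shared encoder embeds both sequences and a similarity score compares them.

import torch
import torch.nn as nn

class SiameseMatcher(nn.Module):
    def __init__(self, vocab_size=30_000, dim=128):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab_size, dim)   # shared by both sides

    def forward(self, left_ids, right_ids):
        left, right = self.encoder(left_ids), self.encoder(right_ids)
        return nn.functional.cosine_similarity(left, right)

matcher = SiameseMatcher()
a = torch.randint(0, 30_000, (4, 12))   # batch of token-id sequences
b = torch.randint(0, 30_000, (4, 12))
print(matcher(a, b).shape)              # one similarity score per pair -> torch.Size([4])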
arXiv Detail & Related papers (2021-02-18T08:08:17Z) - Streaming automatic speech recognition with the transformer model [59.58318952000571]
We propose a transformer-based end-to-end system for streaming ASR.
We apply time-restricted self-attention for the encoder and triggered attention for the encoder-decoder attention mechanism.
Our proposed streaming transformer architecture achieves 2.8% and 7.2% WER for the "clean" and "other" test data of LibriSpeech.
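The time-restricted self-attention mentioned above can be sketched as an attention mask that limits each frame to a fixed left and right context; the context sizes below are arbitrary assumptions, not the paper's settings.

import torch

def time_restricted_mask(seq_len, left_context, right_context):
    # True marks the key positions a query frame is allowed to attend to.
    positions = torch.arange(seq_len)
    offset = positions[None, :] - positions[:, None]   # key index minus query index
    return (offset >= -left_context) & (offset <= right_context)

# Row t shows which frames frame t may attend to, e.g. row 3 -> frames 1..4.
print(time_restricted_mask(seq_len=6, left_context=2, right_context=1).int())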
arXiv Detail & Related papers (2020-01-08T18:58:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.