Transition based Graph Decoder for Neural Machine Translation
- URL: http://arxiv.org/abs/2101.12640v1
- Date: Fri, 29 Jan 2021 15:20:45 GMT
- Title: Transition based Graph Decoder for Neural Machine Translation
- Authors: Leshem Choshen, Omri Abend
- Abstract summary: We propose a general Transformer-based approach for tree and graph decoding based on generating a sequence of transitions.
We show improved performance over the standard Transformer decoder, as well as over ablated versions of the model.
- Score: 41.7284715234202
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While a number of works have shown gains from incorporating source-side symbolic syntactic and semantic structure into neural machine translation (NMT), far fewer have addressed the decoding of such structure.
We propose a general Transformer-based approach for tree and graph decoding based on generating a sequence of transitions, inspired by a similar RNN-based approach of Dyer et al. (2016).
Experiments using the proposed decoder with Universal Dependencies syntax on English-German, German-English and English-Russian show improved performance over the standard Transformer decoder, as well as over ablated versions of the model. All code implementing the presented models will be released upon acceptance.
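As a rough illustration of the transition-based idea (not the authors' exact transition set), the sketch below replays a sequence of arc-standard-style transitions to recover a dependency tree; in the paper's setting, a Transformer decoder would predict such transitions autoregressively. The action names and the toy sentence are assumptions for illustration.

```python
# A minimal sketch of decoding a tree from a transition sequence.
# The arc-standard actions (SHIFT / LEFT_ARC / RIGHT_ARC) and the example
# sentence are illustrative assumptions, not the paper's exact system.

def transitions_to_tree(tokens, transitions):
    """Replay a transition sequence and return dependency arcs (head, dependent)."""
    stack, buffer, arcs = [], list(range(len(tokens))), []
    for action in transitions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT_ARC":           # top of stack becomes head of the word below it
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif action == "RIGHT_ARC":          # word below the top becomes head of the top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs

tokens = ["she", "reads", "books"]
# Target tree: reads -> she, reads -> books
actions = ["SHIFT", "SHIFT", "LEFT_ARC", "SHIFT", "RIGHT_ARC"]
print(transitions_to_tree(tokens, actions))   # [(1, 0), (1, 2)]
```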
Related papers
- Neural Machine Translation with Dynamic Graph Convolutional Decoder [32.462919670070654]
We propose an end-to-end translation architecture from the (graph & sequence) structural inputs to the (graph & sequence) outputs, where the target translation and its corresponding syntactic graph are jointly modeled and generated.
We conduct extensive experiments on five widely acknowledged translation benchmarks, verifying our proposal achieves consistent improvements over baselines and other syntax-aware variants.
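As a loose sketch of one building block such a graph convolutional decoder might use, the snippet below applies a single graph-convolution step over the states of a partially generated target graph; the layer shape and the mean-aggregation scheme are assumptions, not the paper's actual design.

```python
# A single graph-convolution step over a (partial) target graph; sizes,
# self-loops, and mean aggregation are illustrative assumptions.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h: (nodes, dim) decoder states, adj: (nodes, nodes) 0/1 adjacency
        a = adj + torch.eye(adj.size(0))            # add self-loops
        deg = a.sum(-1, keepdim=True)               # node degrees
        return torch.relu(self.proj(a @ h / deg))   # mean-aggregate neighbours, then project

h = torch.randn(4, 16)                 # 4 generated target tokens so far
adj = torch.zeros(4, 4)
adj[0, 1] = adj[1, 0] = 1              # one syntactic edge between tokens 0 and 1
print(GCNLayer(16)(h, adj).shape)      # torch.Size([4, 16])
```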
arXiv Detail & Related papers (2023-05-28T11:58:07Z)
- Transforming Visual Scene Graphs to Image Captions [69.13204024990672]
We propose to transform Scene Graphs (TSG) into more descriptive captions.
In TSG, we apply multi-head attention (MHA) to design the Graph Neural Network (GNN) for embedding scene graphs.
In TSG, each expert in the mixture-of-experts decoder is built on MHA and discriminates among the graph embeddings to generate different kinds of words.
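A rough sketch of using MHA as a GNN layer over a scene graph is given below: each node attends only to its graph neighbours via an attention mask. The shapes, the toy adjacency, and the neighbour-only masking are illustrative assumptions.

```python
# Multi-head attention restricted to graph neighbours, as a GNN-style layer.
# Dimensions and the example adjacency are assumptions for illustration.
import torch
import torch.nn as nn

nodes = torch.randn(1, 5, 32)                    # (batch, num_nodes, dim) scene-graph node embeddings
adj = torch.eye(5, dtype=torch.bool)             # allow self-attention
adj[0, 1] = adj[1, 0] = True                     # plus one example edge

mha = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
mask = ~adj                                      # True = this position is NOT attended to
out, _ = mha(nodes, nodes, nodes, attn_mask=mask)
print(out.shape)                                 # torch.Size([1, 5, 32])
```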
arXiv Detail & Related papers (2023-05-03T15:18:37Z)
- Is Encoder-Decoder Redundant for Neural Machine Translation? [44.37101354412253]
The encoder-decoder architecture is still the de facto neural network architecture for state-of-the-art models.
In this work, we experiment with bilingual translation, translation with additional target monolingual data, and multilingual translation.
This alternative approach performs on par with the baseline encoder-decoder Transformer, suggesting that an encoder-decoder architecture might be redundant for neural machine translation.
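A minimal sketch of this encoder-free alternative, under the common assumption that source and target are concatenated into one sequence and modelled by a single causal Transformer (the special tokens and toy vocabulary below are illustrative):

```python
# Source and target concatenated into one sequence for a single causal
# Transformer; special tokens and sizes are assumptions for illustration.
import torch
import torch.nn as nn

SEP, BOS, EOS = 1, 2, 3
src = [10, 11, 12]                       # source token ids
tgt = [20, 21]                           # target token ids
seq = torch.tensor([src + [SEP] + [BOS] + tgt + [EOS]])   # one joint sequence

emb = nn.Embedding(100, 32)
layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
model = nn.TransformerEncoder(layer, num_layers=2)

x = emb(seq)                             # (1, len, 32)
causal = torch.triu(torch.full((seq.size(1), seq.size(1)), float("-inf")), diagonal=1)
h = model(x, mask=causal)                # each position sees only its prefix
print(h.shape)                           # torch.Size([1, 8, 32])
```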
arXiv Detail & Related papers (2022-10-21T08:33:55Z)
- Transformer with Tree-order Encoding for Neural Program Generation [8.173517923612426]
We introduce a tree-based positional encoding and a shared natural-language subword vocabulary for Transformers.
Our findings suggest that employing a tree-based positional encoding in combination with a shared natural-language subword vocabulary improves generation performance over sequential positional encodings.
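As an illustration of one common form of tree positional encoding (assumed here; not necessarily the paper's exact scheme), each node can be indexed by its root-to-node path of child positions, padded to a fixed depth:

```python
# Path-based tree positions: a common tree positional-encoding variant,
# assumed for illustration; real models would embed these index vectors.

def tree_positions(tree, path=(), max_depth=4):
    """tree = (label, [children]); returns {label: padded child-index path}.
    Assumes the tree is no deeper than max_depth."""
    label, children = tree
    positions = {label: list(path) + [0] * (max_depth - len(path))}
    for i, child in enumerate(children, start=1):
        positions.update(tree_positions(child, path + (i,), max_depth))
    return positions

# Tiny abstract-syntax-tree-like example.
ast = ("assign", [("name", []), ("add", [("const_1", []), ("const_2", [])])])
for node, pos in tree_positions(ast).items():
    print(node, pos)
# assign -> [0, 0, 0, 0], name -> [1, 0, 0, 0], add -> [2, 0, 0, 0],
# const_1 -> [2, 1, 0, 0], const_2 -> [2, 2, 0, 0]
```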
arXiv Detail & Related papers (2022-05-30T12:27:48Z)
- Sentence Bottleneck Autoencoders from Transformer Language Models [53.350633961266375]
We build a sentence-level autoencoder from a pretrained, frozen transformer language model.
We adapt the masked language modeling objective as a generative, denoising one, while only training a sentence bottleneck and a single-layer modified transformer decoder.
We demonstrate that the sentence representations discovered by our model achieve better quality than previous methods that extract representations from pretrained transformers on text similarity tasks, style transfer, and single-sentence classification tasks in the GLUE benchmark, while using fewer parameters than large pretrained models.
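A rough sketch of the bottleneck idea, with assumed shapes: hidden states from a frozen language model are pooled into a single vector, and only the bottleneck projections and a one-layer decoder would be trained to denoise the sentence.

```python
# Sentence bottleneck over frozen LM states; widths and mean pooling are
# assumptions, not necessarily the paper's configuration.
import torch
import torch.nn as nn

hidden = torch.randn(1, 12, 768)               # frozen LM states for one 12-token sentence
bottleneck = nn.Linear(768, 256)               # trainable sentence bottleneck
to_decoder = nn.Linear(256, 768)               # expand back to the decoder width

sentence_vec = bottleneck(hidden.mean(dim=1))  # (1, 256): a single sentence representation
dec_layer = nn.TransformerDecoderLayer(d_model=768, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(dec_layer, num_layers=1)

memory = to_decoder(sentence_vec).unsqueeze(1) # (1, 1, 768): one "memory" slot
tgt = torch.randn(1, 12, 768)                  # embedded (masked) input tokens to denoise
print(decoder(tgt, memory).shape)              # torch.Size([1, 12, 768])
```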
arXiv Detail & Related papers (2021-08-31T19:39:55Z)
- Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [63.46694853953092]
Swin-Unet is a Unet-like pure Transformer for medical image segmentation.
Tokenized image patches are fed into a Transformer-based U-shaped Encoder-Decoder architecture.
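For orientation, a minimal sketch of the patch-tokenization step that feeds such a pure-Transformer model (the 4x4 patch size and 96-dimensional embedding are assumptions in the style of Swin):

```python
# Turning an image into a sequence of patch tokens; patch size and embedding
# width are illustrative assumptions.
import torch
import torch.nn as nn

img = torch.randn(1, 3, 224, 224)                        # (batch, channels, H, W)
patch_embed = nn.Conv2d(3, 96, kernel_size=4, stride=4)  # 4x4 patches -> 96-dim tokens

tokens = patch_embed(img)                                # (1, 96, 56, 56)
tokens = tokens.flatten(2).transpose(1, 2)               # (1, 56*56, 96): a sequence of patch tokens
print(tokens.shape)                                      # torch.Size([1, 3136, 96])
```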
arXiv Detail & Related papers (2021-05-12T09:30:26Z)
- Context- and Sequence-Aware Convolutional Recurrent Encoder for Neural Machine Translation [2.729898906885749]
Existing models use recurrent neural networks to construct the encoder and decoder modules.
In alternative research, the recurrent networks were substituted by convolutional neural networks for capturing the syntactic structure in the input sentence.
We combine the strengths of both approaches by proposing a convolutional-recurrent encoder for capturing the context information.
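A small sketch of such a convolutional-recurrent encoder, with assumed layer sizes: a 1D convolution captures local context and a GRU then models the sequence.

```python
# Convolution for local context, GRU for sequential context; layer sizes are
# illustrative assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class ConvRecurrentEncoder(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.rnn = nn.GRU(dim, dim, batch_first=True)

    def forward(self, x):                    # x: (batch, seq_len, dim) embedded tokens
        local = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        outputs, _ = self.rnn(local)         # contextualise the convolved features
        return outputs

print(ConvRecurrentEncoder()(torch.randn(2, 10, 64)).shape)   # torch.Size([2, 10, 64])
```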
arXiv Detail & Related papers (2021-01-11T17:03:52Z)
- GRET: Global Representation Enhanced Transformer [85.58930151690336]
Transformer, based on the encoder-decoder framework, has achieved state-of-the-art performance on several natural language generation tasks.
We propose a novel global representation enhanced Transformer (GRET) to explicitly model global representation in the Transformer network.
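As a loose sketch of injecting an explicit global representation (the mean-pooling and gating below are assumptions, not GRET's exact mechanism):

```python
# Pool encoder states into one global vector and gate it into every decoder
# position; the pooling and gating choices are illustrative assumptions.
import torch
import torch.nn as nn

enc = torch.randn(1, 9, 64)                 # encoder states for one source sentence
dec = torch.randn(1, 5, 64)                 # decoder states for the partial target

global_vec = enc.mean(dim=1, keepdim=True)  # (1, 1, 64) global source representation
gate = nn.Linear(128, 64)

fused = torch.sigmoid(gate(torch.cat([dec, global_vec.expand_as(dec)], dim=-1)))
dec = dec + fused * global_vec              # inject global information into every position
print(dec.shape)                            # torch.Size([1, 5, 64])
```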
arXiv Detail & Related papers (2020-02-24T07:37:17Z)
- Bi-Decoder Augmented Network for Neural Machine Translation [108.3931242633331]
We propose a novel Bi-Decoder Augmented Network (BiDAN) for the neural machine translation task.
Since each decoder transforms the representations of the input text into its corresponding language, jointly training with two target ends can give the shared encoder the potential to produce a language-independent semantic space.
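A minimal sketch of the bi-decoder layout, with assumed sizes: one shared encoder feeds two decoders, one generating the target language and one reconstructing the source, so both losses shape the shared representation.

```python
# Shared encoder, two decoders (translation + source reconstruction); the
# module sizes and the toy inputs are illustrative assumptions.
import torch
import torch.nn as nn

d = 64
enc_layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
shared_encoder = nn.TransformerEncoder(enc_layer, num_layers=2)

decoder_tgt = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=d, nhead=4, batch_first=True), num_layers=2
)                                                       # source -> target language
decoder_src = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=d, nhead=4, batch_first=True), num_layers=2
)                                                       # source -> source (auxiliary)

src = torch.randn(1, 7, d)                              # embedded source sentence
memory = shared_encoder(src)
out_tgt = decoder_tgt(torch.randn(1, 6, d), memory)     # translate
out_src = decoder_src(torch.randn(1, 7, d), memory)     # reconstruct; both losses train the encoder
print(out_tgt.shape, out_src.shape)                     # torch.Size([1, 6, 64]) torch.Size([1, 7, 64])
```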
arXiv Detail & Related papers (2020-01-14T02:05:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.