Neural Machine Translation with Dynamic Graph Convolutional Decoder
- URL: http://arxiv.org/abs/2305.17698v1
- Date: Sun, 28 May 2023 11:58:07 GMT
- Title: Neural Machine Translation with Dynamic Graph Convolutional Decoder
- Authors: Lei Li, Kai Fan, Lingyu Yang, Hongjia Li, Chun Yuan
- Abstract summary: We propose an end-to-end translation architecture from the (graph & sequence) structural inputs to the (graph & sequence) outputs, where the target translation and its corresponding syntactic graph are jointly modeled and generated.
We conduct extensive experiments on five widely acknowledged translation benchmarks, verifying that our proposal achieves consistent improvements over baselines and other syntax-aware variants.
- Score: 32.462919670070654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing wisdom demonstrates the significance of syntactic knowledge for the
improvement of neural machine translation models. However, most previous works
merely focus on leveraging the source syntax in the well-known encoder-decoder
framework. In sharp contrast, this paper proposes an end-to-end translation
architecture from the (graph & sequence) structural inputs to the (graph &
sequence) outputs, where the target translation and its corresponding syntactic
graph are jointly modeled and generated. We propose a customized Dynamic
Spatial-Temporal Graph Convolutional Decoder (Dyn-STGCD), which is designed for
consuming source feature representations and their syntactic graph, and
auto-regressively generating the target syntactic graph and tokens
simultaneously. We conduct extensive experiments on five widely acknowledged
translation benchmarks, verifying that our proposal achieves consistent
improvements over baselines and other syntax-aware variants.
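The joint generation described above can be illustrated with a minimal sketch. This is not the actual Dyn-STGCD architecture: all weights and decoder states below are random stand-ins, and the bilinear arc scorer is an assumed, generic formulation. The point is only the shape of the loop: at each step the decoder picks the next token and, for every token after the first, an attachment to a head in the partial syntactic graph.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions; all weights are random stand-ins for learned parameters.
d_model, vocab_size, max_len = 16, 10, 5

W_vocab = rng.normal(size=(d_model, vocab_size))   # token projection
W_arc = rng.normal(size=(d_model, d_model))        # arc (head) scoring

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_step(h_t, history):
    """One joint decoding step: pick the next token and, if any tokens
    were generated before, attach it to a head in the partial graph."""
    token = int(np.argmax(softmax(h_t @ W_vocab)))
    head = None
    if history:
        H = np.stack(history)                      # (t, d_model)
        arc_scores = H @ W_arc @ h_t               # score each earlier position
        head = int(np.argmax(softmax(arc_scores)))
    return token, head

# Greedy loop over dummy decoder states (a real model would compute h_t
# from the source encoding and the partial target graph).
history, tokens, arcs = [], [], []
for t in range(max_len):
    h_t = rng.normal(size=d_model)
    token, head = decode_step(h_t, history)
    tokens.append(token)
    arcs.append(head)
    history.append(h_t)

print(tokens)  # generated token ids
print(arcs)    # head index of each token in the partial graph (None first)
```

Note the invariant that makes the graph well-formed during auto-regressive decoding: a token can only attach to positions generated before it, so the partial structure is always a valid graph over the prefix.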
Related papers
- Patch-wise Graph Contrastive Learning for Image Translation [69.85040887753729]
We exploit the graph neural network to capture the topology-aware features.
We construct the graph based on the patch-wise similarity from a pretrained encoder.
To capture the hierarchical semantic structure, we propose graph pooling.
arXiv Detail & Related papers (2023-12-13T15:45:19Z)
- Syntax-Aware Complex-Valued Neural Machine Translation [14.772317918560548]
We propose a method to incorporate syntax information into a complex-valued Encoder-Decoder architecture.
The proposed model jointly learns word-level and syntax-level attention scores from the source side to the target side using an attention mechanism.
The experimental results demonstrate that the proposed method can bring significant improvements in BLEU scores on two datasets.
arXiv Detail & Related papers (2023-07-17T15:58:05Z)
- Transforming Visual Scene Graphs to Image Captions [69.13204024990672]
We propose Transforming Scene Graphs (TSG), a method for generating more descriptive captions.
In TSG, we apply multi-head attention (MHA) to design the Graph Neural Network (GNN) for embedding scene graphs.
In TSG, each expert is built on MHA and discriminates among the graph embeddings to generate different kinds of words.
arXiv Detail & Related papers (2023-05-03T15:18:37Z)
- Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder [75.03283861464365]
The seq2seq task aims at generating the target sequence based on the given input source sequence.
Traditionally, most seq2seq tasks are resolved with an encoder that encodes the source sequence and a decoder that generates the target text.
Recently, a number of new approaches have emerged that apply decoder-only language models directly to the seq2seq task.
arXiv Detail & Related papers (2023-04-08T15:44:29Z)
- Multilingual Extraction and Categorization of Lexical Collocations with Graph-aware Transformers [86.64972552583941]
We put forward a sequence tagging BERT-based model enhanced with a graph-aware transformer architecture, which we evaluate on the task of collocation recognition in context.
Our results suggest that explicitly encoding syntactic dependencies in the model architecture is helpful, and provide insights on differences in collocation typification in English, Spanish and French.
arXiv Detail & Related papers (2022-05-23T16:47:37Z)
- GN-Transformer: Fusing Sequence and Graph Representation for Improved Code Summarization [0.0]
We propose a novel method, GN-Transformer, to learn end-to-end on a fused sequence and graph modality.
The proposed method achieves state-of-the-art performance on two code summarization datasets and across three automatic code summarization metrics.
arXiv Detail & Related papers (2021-11-17T02:51:37Z)
- GraphiT: Encoding Graph Structure in Transformers [37.33808493548781]
We show that viewing graphs as sets of node features, enriched with structural and positional information, can outperform representations learned with classical graph neural networks (GNNs).
Our model, GraphiT, encodes such information by (i) leveraging relative positional encoding strategies in self-attention scores based on positive definite kernels on graphs, and (ii) enumerating and encoding local sub-structures such as paths of short length.
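The kernel-based modulation of self-attention can be sketched as follows. This is an illustrative reconstruction rather than GraphiT's implementation: it uses the heat kernel exp(-beta * L), one common positive definite kernel on graphs, on a toy 4-node path, with random features and projection weights standing in for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: a 4-node path 0-1-2-3, given as an adjacency matrix.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Heat (diffusion) kernel on the graph: K = exp(-beta * L), a positive
# definite kernel whose entries decay with graph distance.
L = np.diag(A.sum(axis=1)) - A                 # combinatorial Laplacian
beta = 1.0
eigvals, eigvecs = np.linalg.eigh(L)
K = eigvecs @ np.diag(np.exp(-beta * eigvals)) @ eigvecs.T

# One self-attention head whose scores are modulated by the kernel,
# in the spirit of relative positional encoding on graphs.
d = 8
X = rng.normal(size=(4, d))                    # node features
Wq, Wk = rng.normal(size=(d, d)), rng.normal(size=(d, d))
scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d)
weights = np.exp(scores) * K                   # kernel-weighted attention
weights /= weights.sum(axis=1, keepdims=True)  # renormalize rows

print(weights.round(3))
```

Because the kernel entries shrink with graph distance, distant node pairs are down-weighted in attention even when their content similarity is high, which is what injects the graph structure into the otherwise permutation-invariant attention.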
arXiv Detail & Related papers (2021-06-10T11:36:22Z)
- Learn from Syntax: Improving Pair-wise Aspect and Opinion Terms Extraction with Rich Syntactic Knowledge [17.100366742363803]
We propose to enhance the pair-wise aspect and opinion terms extraction (PAOTE) task by incorporating rich syntactic knowledge.
We first build a syntax fusion encoder for encoding syntactic features, including a label-aware graph convolutional network (LAGCN) for modeling the dependency edges and labels.
During pairing, we then adopt Biaffine and Triaffine scoring for high-order aspect-opinion term pairing, while re-harnessing the syntax-enriched representations from LAGCN for syntax-aware scoring.
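A biaffine scorer of the kind mentioned above can be sketched as follows. This is the generic biaffine form (bilinear term plus a linear term over the concatenation), not the paper's exact parameterization, and all weights are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8  # dimension of the syntax-enriched span representations

# Biaffine parameters (random stand-ins for learned weights).
U = rng.normal(size=(d, d))        # bilinear interaction term
w = rng.normal(size=(2 * d,))      # linear term over the concatenation
b = 0.0                            # bias

def biaffine(aspect, opinion):
    """Biaffine score for one candidate aspect-opinion pair:
    aspect^T U opinion + w^T [aspect; opinion] + b."""
    return aspect @ U @ opinion + w @ np.concatenate([aspect, opinion]) + b

# Score every candidate pair between 3 aspect spans and 2 opinion spans.
aspects = rng.normal(size=(3, d))
opinions = rng.normal(size=(2, d))
scores = np.array([[biaffine(a, o) for o in opinions] for a in aspects])

print(scores.shape)   # (3, 2): one score per aspect-opinion pair
```

The bilinear term lets the score depend on interactions between the two representations rather than on each in isolation, which is why biaffine scoring is a standard choice for pairing and dependency-arc tasks.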
arXiv Detail & Related papers (2021-05-06T08:45:40Z)
- Transition based Graph Decoder for Neural Machine Translation [41.7284715234202]
We propose a general Transformer-based approach for tree and graph decoding based on generating a sequence of transitions.
We show improved performance over the standard Transformer decoder, as well as over ablated versions of the model.
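Decoding a graph as a sequence of transitions can be illustrated with a toy transition system. The two-action inventory below ("NODE" appends a node, "EDGE" connects two existing nodes) is an assumption for illustration; the paper's actual transition set is richer.

```python
# A minimal transition system for building a graph action by action:
# ("NODE", label) appends a node; ("EDGE", i, j) adds an arc between
# two already-generated nodes.

def apply_transitions(actions):
    """Replay a transition sequence into a (nodes, edges) graph."""
    nodes, edges = [], []
    for act in actions:
        if act[0] == "NODE":
            nodes.append(act[1])
        elif act[0] == "EDGE":
            i, j = act[1], act[2]
            # An edge may only reference nodes that already exist.
            assert i < len(nodes) and j < len(nodes), "dangling edge"
            edges.append((i, j))
    return nodes, edges

# A sequence a decoder could emit, interleaving tokens and arcs.
seq = [("NODE", "saw"), ("NODE", "I"), ("EDGE", 0, 1),
       ("NODE", "her"), ("EDGE", 0, 2)]
nodes, edges = apply_transitions(seq)
print(nodes)   # ['saw', 'I', 'her']
print(edges)   # [(0, 1), (0, 2)]
```

The appeal of this framing is that a standard sequence decoder (e.g. a Transformer) can produce tree- or graph-structured output simply by emitting one action token at a time.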
arXiv Detail & Related papers (2021-01-29T15:20:45Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
Recently, Sequence-to-Sequence (Seq2Seq) based generative frameworks have been widely used in the KE task and have obtained competitive performance on various benchmarks.
In this paper, we propose to adopt Dynamic Graph Convolutional Networks (DGCN) to address two remaining problems of this framework simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Bi-Decoder Augmented Network for Neural Machine Translation [108.3931242633331]
We propose a novel Bi-Decoder Augmented Network (BiDAN) for the neural machine translation task.
Since each decoder transforms the representations of the input text into its corresponding language, joint training with two target ends gives the shared encoder the potential to produce a language-independent semantic space.
arXiv Detail & Related papers (2020-01-14T02:05:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.