Levi Graph AMR Parser using Heterogeneous Attention
- URL: http://arxiv.org/abs/2107.04152v1
- Date: Fri, 9 Jul 2021 00:06:17 GMT
- Title: Levi Graph AMR Parser using Heterogeneous Attention
- Authors: Han He, Jinho D. Choi
- Abstract summary: This paper presents a novel approach to AMR parsing by combining heterogeneous data (tokens, concepts, labels) as one input to a transformer to learn attention.
Although our models use significantly fewer parameters than the previous state-of-the-art graph parser, they show similar or better accuracy on AMR 2.0 and 3.0.
- Score: 17.74208462902158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Coupled with biaffine decoders, transformers have been effectively adapted to
text-to-graph transduction and achieved state-of-the-art performance on AMR
parsing. Many prior works, however, rely on the biaffine decoder for either or
both arc and label predictions although most features used by the decoder may
be learned by the transformer already. This paper presents a novel approach to
AMR parsing by combining heterogeneous data (tokens, concepts, labels) as one
input to a transformer to learn attention, and using only attention matrices from
the transformer to predict all elements in AMR graphs (concepts, arcs, labels).
Although our models use significantly fewer parameters than the previous
state-of-the-art graph parser, they show similar or better accuracy on AMR 2.0
and 3.0.
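Below is a minimal, hypothetical sketch of the core idea in the abstract: tokens, concepts, and labels are packed into a single input sequence, and the transformer's own attention matrix over that sequence is read off as arc scores, with no separate biaffine decoder. The module name, dimensions, and the head-averaged single-layer setup are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AttentionArcScorer(nn.Module):
    """Illustrative sketch: score arcs directly from a self-attention matrix
    computed over one heterogeneous sequence of tokens, concepts, and labels
    (a Levi-graph-style input). Names and sizes are assumptions."""

    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, embeddings, node_mask):
        # embeddings: (batch, seq_len, d_model) for tokens + concepts + labels
        # node_mask:  (batch, seq_len) bool, True at concept/label positions
        _, attn_weights = self.attn(embeddings, embeddings, embeddings,
                                    need_weights=True)
        # attn_weights: (batch, seq_len, seq_len); treat the (i, j) weight as
        # the score of an arc i -> j, keeping only graph-node targets.
        return attn_weights.masked_fill(~node_mask.unsqueeze(1), float("-inf"))

# toy usage
x = torch.randn(1, 10, 128)
mask = torch.zeros(1, 10, dtype=torch.bool)
mask[:, 4:] = True                           # pretend positions 4..9 are graph nodes
print(AttentionArcScorer()(x, mask).shape)   # torch.Size([1, 10, 10])
```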
Related papers
- Graph Transformers Dream of Electric Flow [72.06286909236827]
We show that the linear Transformer, when applied to graph data, can implement algorithms that solve canonical graph problems.
We present explicit weight configurations for implementing each such graph algorithm, and we bound the errors of the constructed Transformers by the errors of the underlying algorithms.
arXiv Detail & Related papers (2024-10-22T05:11:45Z) - On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent [51.50999191584981]
Sign Gradient Descent (SignGD) serves as an effective surrogate for Adam.
We study how SignGD optimizes a two-layer transformer on a noisy dataset.
We find that the poor generalization of SignGD is not solely due to data noise, suggesting that both SignGD and Adam require high-quality data for real-world tasks.
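As a point of reference (not from the paper), the SignGD update keeps only the elementwise sign of each gradient and discards its magnitude; a minimal toy sketch:

```python
import torch

def signgd_step(params, lr=1e-3):
    """One SignGD update: move each parameter by lr times the sign of its
    gradient, ignoring the gradient's magnitude (illustrative sketch)."""
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                p -= lr * p.grad.sign()

# toy usage: drive w toward 3 by minimizing (w - 3)^2
w = torch.tensor(0.0, requires_grad=True)
for _ in range(100):
    loss = (w - 3.0) ** 2
    loss.backward()
    signgd_step([w], lr=0.05)
    w.grad.zero_()
print(w.item())   # close to 3.0 (within the 0.05 step size)
```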
arXiv Detail & Related papers (2024-10-07T09:36:43Z) - Comparing Graph Transformers via Positional Encodings [11.5844121984212]
The distinguishing power of graph transformers is closely tied to the choice of positional encoding.
There are two primary types of positional encoding: absolute positional encodings (APEs) and relative positional encodings (RPEs).
We show that graph transformers using APEs and RPEs are equivalent in terms of distinguishing power.
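As an illustrative aside (not from the paper), the two families can be contrasted on a toy graph: an APE attaches a vector to every node, whereas an RPE attaches a value to every node pair. The sketch below assumes two common instantiations, Laplacian eigenvectors for the APE and shortest-path hop counts for the RPE.

```python
import torch

def ape_and_rpe(adj):
    """Toy contrast of positional-encoding families: a per-node APE from
    Laplacian eigenvectors and a per-node-pair RPE from shortest-path hops."""
    lap = torch.diag(adj.sum(dim=1)) - adj          # graph Laplacian
    _, eigvecs = torch.linalg.eigh(lap)
    ape = eigvecs[:, 1:3]                           # first nontrivial eigenvectors

    n = adj.shape[0]
    dist = torch.full((n, n), float("inf"))
    dist[adj.bool()] = 1.0
    dist.fill_diagonal_(0.0)
    for k in range(n):                              # Floyd-Warshall hop counts
        dist = torch.minimum(dist, dist[:, k:k + 1] + dist[k:k + 1, :])
    return ape, dist                                # APE: (n, 2), RPE: (n, n)

# toy 4-node path graph 0-1-2-3
adj = torch.tensor([[0., 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
ape, rpe = ape_and_rpe(adj)
print(ape.shape, rpe)
```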
arXiv Detail & Related papers (2024-02-22T01:07:48Z) - Key-Value Transformer [47.64219291655723]
Key-value formulation (KV) generates symmetric attention maps, along with an asymmetric version that incorporates a 2D positional encoding into the attention matrix.
Experiments encompass three task types -- synthetic tasks (such as reversing or sorting a list), vision (MNIST or CIFAR classification), and NLP.
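One way to see where the symmetry comes from (a minimal sketch, not the paper's exact formulation): if keys also play the role of queries, the logit matrix K K^T is symmetric before the row-wise softmax. Projection sizes below are assumptions, and the 2D positional-encoding variant is omitted.

```python
import torch

def kv_attention(x, w_k, w_v):
    """Key-value attention sketch: queries are replaced by keys, so the
    logit matrix k @ k.T is symmetric (the row-wise softmax then breaks
    exact symmetry of the final weights)."""
    k = x @ w_k                                   # (seq, d_k)
    v = x @ w_v                                   # (seq, d_v)
    logits = k @ k.T / k.shape[-1] ** 0.5
    assert torch.allclose(logits, logits.T)       # symmetric attention map
    return torch.softmax(logits, dim=-1) @ v

x = torch.randn(6, 16)
out = kv_attention(x, torch.randn(16, 16), torch.randn(16, 16))
print(out.shape)   # torch.Size([6, 16])
```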
arXiv Detail & Related papers (2023-05-28T20:26:06Z) - Graph Inductive Biases in Transformers without Message Passing [47.238185813842996]
The new Graph Inductive bias Transformer (GRIT) incorporates graph inductive biases without using message passing.
GRIT achieves state-of-the-art empirical performance across a variety of graph datasets.
arXiv Detail & Related papers (2023-05-27T22:26:27Z) - Transformers Meet Directed Graphs [18.490890946129284]
Transformers for directed graphs are a surprisingly underexplored topic, despite their applicability to ubiquitous domains.
In this work, we propose two direction- and structure-aware positional encodings for directed graphs.
We show that the extra directionality information is useful in various downstream tasks, including correctness testing of sorting networks and source code understanding.
arXiv Detail & Related papers (2023-01-31T19:33:14Z) - SepTr: Separable Transformer for Audio Spectrogram Processing [74.41172054754928]
We propose a new vision transformer architecture called Separable Transformer (SepTr).
SepTr employs two transformer blocks in a sequential manner, the first attending to tokens within the same frequency bin, and the second attending to tokens within the same time interval.
We conduct experiments on three benchmark data sets, showing that our architecture outperforms conventional vision transformers and other state-of-the-art methods.
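A rough sketch of that separable scheme (not the authors' implementation; the stock PyTorch encoder layers and all sizes below are assumptions): reshape the spectrogram tokens so the first block attends over the time axis within each frequency bin and the second attends over the frequency axis within each time step.

```python
import torch
import torch.nn as nn

class SeparableBlock(nn.Module):
    """Sketch of a separable attention stage: attend within each frequency
    bin (over time), then within each time step (over frequency)."""

    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.freq_bin_block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.time_step_block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)

    def forward(self, x):
        # x: (batch, time, freq, d_model) spectrogram tokens
        b, t, f, d = x.shape
        # 1) tokens sharing a frequency bin attend to each other (sequence = time)
        x = self.freq_bin_block(x.permute(0, 2, 1, 3).reshape(b * f, t, d))
        x = x.reshape(b, f, t, d).permute(0, 2, 1, 3)
        # 2) tokens sharing a time step attend to each other (sequence = frequency)
        x = self.time_step_block(x.reshape(b * t, f, d))
        return x.reshape(b, t, f, d)

tokens = torch.randn(2, 20, 16, 64)        # (batch, time, freq, d_model)
print(SeparableBlock()(tokens).shape)      # torch.Size([2, 20, 16, 64])
```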
arXiv Detail & Related papers (2022-03-17T19:48:43Z) - Graph Masked Autoencoder [19.080326439575916]
We propose Graph Masked Autoencoders (GMAE), a self-supervised model for learning graph representations.
GMAE takes partially masked graphs as input, and reconstructs the features of the masked nodes.
We show that, compared with training from scratch, the graph transformer pre-trained using GMAE can achieve much better performance after fine-tuning.
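A hypothetical sketch of that masked-reconstruction objective: mask a random subset of node features, encode the corrupted graph, and regress the original features at the masked positions only. The encoder and decoder below are plain MLP placeholders standing in for GMAE's graph transformer, and every size is an assumption.

```python
import torch
import torch.nn as nn

def gmae_style_loss(node_feats, encoder, decoder, mask_ratio=0.5):
    """Masked-autoencoding sketch: zero out a random subset of node features
    (a stand-in for a learned [MASK] token), encode all nodes, and reconstruct
    the original features only where they were masked."""
    n, _ = node_feats.shape
    masked = torch.rand(n) < mask_ratio
    masked[0] = True                         # keep the toy loss well defined
    corrupted = node_feats.clone()
    corrupted[masked] = 0.0
    recon = decoder(encoder(corrupted))
    return nn.functional.mse_loss(recon[masked], node_feats[masked])

# toy usage with placeholder networks (a real GMAE encodes graph structure too)
d = 32
encoder = nn.Sequential(nn.Linear(d, 64), nn.ReLU())
decoder = nn.Linear(64, d)
print(gmae_style_loss(torch.randn(10, d), encoder, decoder).item())
```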
arXiv Detail & Related papers (2022-02-17T01:04:32Z) - Transformer-Based Deep Image Matching for Generalizable Person Re-identification [114.56752624945142]
We investigate the possibility of applying Transformers for image matching and metric learning given pairs of images.
We find that the Vision Transformer (ViT) and the vanilla Transformer with decoders are not adequate for image matching due to their lack of image-to-image attention.
We propose a new simplified decoder, which drops the full attention implementation with the softmax weighting, keeping only the query-key similarity.
arXiv Detail & Related papers (2021-05-30T05:38:33Z) - AMR Parsing with Action-Pointer Transformer [18.382148821100152]
We propose a transition-based system that combines hard-attention over sentences with a target-side action pointer mechanism.
We show that our action-pointer approach leads to increased expressiveness and attains large gains over the best transition-based AMR parser.
arXiv Detail & Related papers (2021-04-29T22:01:41Z) - Online Back-Parsing for AMR-to-Text Generation [29.12944601513491]
AMR-to-text generation aims to recover a text containing the same meaning as an input AMR graph.
We propose a decoder that back-predicts projected AMR graphs on the target sentence during text generation.
arXiv Detail & Related papers (2020-10-09T12:08:14Z)