Levi Graph AMR Parser using Heterogeneous Attention
- URL: http://arxiv.org/abs/2107.04152v1
- Date: Fri, 9 Jul 2021 00:06:17 GMT
- Title: Levi Graph AMR Parser using Heterogeneous Attention
- Authors: Han He, Jinho D. Choi
- Abstract summary: This paper presents a novel approach to AMR parsing by combining heterogeneous data (tokens, concepts, labels) as one input to a transformer to learn attention.
Although our models use significantly fewer parameters than the previous state-of-the-art graph parser, they show similar or better accuracy on AMR 2.0 and 3.0.
- Score: 17.74208462902158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Coupled with biaffine decoders, transformers have been effectively adapted to
text-to-graph transduction and achieved state-of-the-art performance on AMR
parsing. Many prior works, however, rely on the biaffine decoder for either or
both arc and label predictions although most features used by the decoder may
be learned by the transformer already. This paper presents a novel approach to
AMR parsing by combining heterogeneous data (tokens, concepts, labels) as one
input to a transformer to learn attention, and using only attention matrices from
the transformer to predict all elements in AMR graphs (concepts, arcs, labels).
Although our models use significantly fewer parameters than the previous
state-of-the-art graph parser, they show similar or better accuracy on AMR 2.0
and 3.0.
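Below is a minimal, hypothetical sketch of the core idea in the abstract: tokens, concepts, and labels are packed into a single input sequence, and the transformer's own attention matrix over that sequence is read off as arc scores, with no separate biaffine decoder. The module name, dimensions, and the head-averaged single-layer setup are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AttentionArcScorer(nn.Module):
    """Illustrative sketch: score arcs directly from a self-attention matrix
    computed over one heterogeneous sequence of tokens, concepts, and labels
    (a Levi-graph-style input). Names and sizes are assumptions."""

    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, embeddings, node_mask):
        # embeddings: (batch, seq_len, d_model) for tokens + concepts + labels
        # node_mask:  (batch, seq_len) bool, True at concept/label positions
        _, attn_weights = self.attn(embeddings, embeddings, embeddings,
                                    need_weights=True)
        # attn_weights: (batch, seq_len, seq_len); treat the (i, j) weight as
        # the score of an arc i -> j, keeping only graph-node targets.
        return attn_weights.masked_fill(~node_mask.unsqueeze(1), float("-inf"))

# toy usage
x = torch.randn(1, 10, 128)
mask = torch.zeros(1, 10, dtype=torch.bool)
mask[:, 4:] = True                           # pretend positions 4..9 are graph nodes
print(AttentionArcScorer()(x, mask).shape)   # torch.Size([1, 10, 10])
```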
Related papers
- Graph Transformers Dream of Electric Flow [72.06286909236827]
We show that the linear Transformer, when applied to graph data, can implement algorithms that solve canonical graph problems.
We present explicit weight configurations for implementing each such graph algorithm, and we bound the errors of the constructed Transformers by the errors of the underlying algorithms.
arXiv Detail & Related papers (2024-10-22T05:11:45Z) - On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent [51.50999191584981]
Sign Gradient Descent (SignGD) serves as an effective surrogate for Adam.
We study how SignGD optimizes a two-layer transformer on a noisy dataset.
We find that the poor generalization of SignGD is not solely due to data noise, suggesting that both SignGD and Adam require high-quality data for real-world tasks.
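As a point of reference (not from the paper), the SignGD update keeps only the elementwise sign of each gradient and discards its magnitude; a minimal toy sketch:

```python
import torch

def signgd_step(params, lr=1e-3):
    """One SignGD update: move each parameter by lr times the sign of its
    gradient, ignoring the gradient's magnitude (illustrative sketch)."""
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                p -= lr * p.grad.sign()

# toy usage: drive w toward 3 by minimizing (w - 3)^2
w = torch.tensor(0.0, requires_grad=True)
for _ in range(100):
    loss = (w - 3.0) ** 2
    loss.backward()
    signgd_step([w], lr=0.05)
    w.grad.zero_()
print(w.item())   # close to 3.0 (within the 0.05 step size)
```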
arXiv Detail & Related papers (2024-10-07T09:36:43Z) - Comparing Graph Transformers via Positional Encodings [11.5844121984212]
The distinguishing power of graph transformers is closely tied to the choice of positional encoding.
There are two primary types of positional encoding: absolute positional encodings (APEs) and relative positional encodings (RPEs).
We show that graph transformers using APEs and RPEs are equivalent in terms of distinguishing power.
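As an illustrative aside (not from the paper), the two families can be contrasted on a toy graph: an APE attaches a vector to every node, whereas an RPE attaches a value to every node pair. The sketch below assumes two common instantiations, Laplacian eigenvectors for the APE and shortest-path hop counts for the RPE.

```python
import torch

def ape_and_rpe(adj):
    """Toy contrast of positional-encoding families: a per-node APE from
    Laplacian eigenvectors and a per-node-pair RPE from shortest-path hops."""
    lap = torch.diag(adj.sum(dim=1)) - adj          # graph Laplacian
    _, eigvecs = torch.linalg.eigh(lap)
    ape = eigvecs[:, 1:3]                           # first nontrivial eigenvectors

    n = adj.shape[0]
    dist = torch.full((n, n), float("inf"))
    dist[adj.bool()] = 1.0
    dist.fill_diagonal_(0.0)
    for k in range(n):                              # Floyd-Warshall hop counts
        dist = torch.minimum(dist, dist[:, k:k + 1] + dist[k:k + 1, :])
    return ape, dist                                # APE: (n, 2), RPE: (n, n)

# toy 4-node path graph 0-1-2-3
adj = torch.tensor([[0., 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
ape, rpe = ape_and_rpe(adj)
print(ape.shape, rpe)
```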
arXiv Detail & Related papers (2024-02-22T01:07:48Z) - Key-Value Transformer [47.64219291655723]
Key-value formulation (KV) generates symmetric attention maps, along with an asymmetric version that incorporates a 2D positional encoding into the attention matrix.
Experiments encompass three task types -- synthetic tasks (such as reversing or sorting a list), vision (MNIST or CIFAR classification), and NLP.
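One way to see where the symmetry comes from (a minimal sketch, not the paper's exact formulation): if keys also play the role of queries, the logit matrix K K^T is symmetric before the row-wise softmax. Projection sizes below are assumptions, and the 2D positional-encoding variant is omitted.

```python
import torch

def kv_attention(x, w_k, w_v):
    """Key-value attention sketch: queries are replaced by keys, so the
    logit matrix k @ k.T is symmetric (the row-wise softmax then breaks
    exact symmetry of the final weights)."""
    k = x @ w_k                                   # (seq, d_k)
    v = x @ w_v                                   # (seq, d_v)
    logits = k @ k.T / k.shape[-1] ** 0.5
    assert torch.allclose(logits, logits.T)       # symmetric attention map
    return torch.softmax(logits, dim=-1) @ v

x = torch.randn(6, 16)
out = kv_attention(x, torch.randn(16, 16), torch.randn(16, 16))
print(out.shape)   # torch.Size([6, 16])
```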
arXiv Detail & Related papers (2023-05-28T20:26:06Z) - Graph Inductive Biases in Transformers without Message Passing [47.238185813842996]
The new Graph Inductive bias Transformer (GRIT) incorporates graph inductive biases without using message passing.
GRIT achieves state-of-the-art empirical performance across a variety of graph datasets.
arXiv Detail & Related papers (2023-05-27T22:26:27Z) - Transformers Meet Directed Graphs [18.490890946129284]
Transformers for directed graphs are a surprisingly underexplored topic, despite their applicability to ubiquitous domains.
In this work, we propose two direction- and structure-aware positional encodings for directed graphs.
We show that the extra directionality information is useful in various downstream tasks, including correctness testing of sorting networks and source code understanding.
arXiv Detail & Related papers (2023-01-31T19:33:14Z) - SepTr: Separable Transformer for Audio Spectrogram Processing [74.41172054754928]
We propose a new vision transformer architecture called Separable Transformer (SepTr).
SepTr employs two transformer blocks in a sequential manner, the first attending to tokens within the same frequency bin, and the second attending to tokens within the same time interval.
We conduct experiments on three benchmark data sets, showing that our architecture outperforms conventional vision transformers and other state-of-the-art methods.
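A rough sketch of that separable scheme (not the authors' implementation; the stock PyTorch encoder layers and all sizes below are assumptions): reshape the spectrogram tokens so the first block attends over the time axis within each frequency bin and the second attends over the frequency axis within each time step.

```python
import torch
import torch.nn as nn

class SeparableBlock(nn.Module):
    """Sketch of a separable attention stage: attend within each frequency
    bin (over time), then within each time step (over frequency)."""

    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.freq_bin_block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.time_step_block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)

    def forward(self, x):
        # x: (batch, time, freq, d_model) spectrogram tokens
        b, t, f, d = x.shape
        # 1) tokens sharing a frequency bin attend to each other (sequence = time)
        x = self.freq_bin_block(x.permute(0, 2, 1, 3).reshape(b * f, t, d))
        x = x.reshape(b, f, t, d).permute(0, 2, 1, 3)
        # 2) tokens sharing a time step attend to each other (sequence = frequency)
        x = self.time_step_block(x.reshape(b * t, f, d))
        return x.reshape(b, t, f, d)

tokens = torch.randn(2, 20, 16, 64)        # (batch, time, freq, d_model)
print(SeparableBlock()(tokens).shape)      # torch.Size([2, 20, 16, 64])
```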
arXiv Detail & Related papers (2022-03-17T19:48:43Z) - Graph Masked Autoencoder [19.080326439575916]
We propose Graph Masked Autoencoders (GMAE), a self-supervised model for learning graph representations.
GMAE takes partially masked graphs as input, and reconstructs the features of the masked nodes.
We show that, compared with training from scratch, the graph transformer pre-trained using GMAE can achieve much better performance after fine-tuning.
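A hypothetical sketch of that masked-reconstruction objective: mask a random subset of node features, encode the corrupted graph, and regress the original features at the masked positions only. The encoder and decoder below are plain MLP placeholders standing in for GMAE's graph transformer, and every size is an assumption.

```python
import torch
import torch.nn as nn

def gmae_style_loss(node_feats, encoder, decoder, mask_ratio=0.5):
    """Masked-autoencoding sketch: zero out a random subset of node features
    (a stand-in for a learned [MASK] token), encode all nodes, and reconstruct
    the original features only where they were masked."""
    n, _ = node_feats.shape
    masked = torch.rand(n) < mask_ratio
    masked[0] = True                         # keep the toy loss well defined
    corrupted = node_feats.clone()
    corrupted[masked] = 0.0
    recon = decoder(encoder(corrupted))
    return nn.functional.mse_loss(recon[masked], node_feats[masked])

# toy usage with placeholder networks (a real GMAE encodes graph structure too)
d = 32
encoder = nn.Sequential(nn.Linear(d, 64), nn.ReLU())
decoder = nn.Linear(64, d)
print(gmae_style_loss(torch.randn(10, d), encoder, decoder).item())
```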
arXiv Detail & Related papers (2022-02-17T01:04:32Z) - Transformer-Based Deep Image Matching for Generalizable Person Re-identification [114.56752624945142]
We investigate the possibility of applying Transformers for image matching and metric learning given pairs of images.
We find that the Vision Transformer (ViT) and the vanilla Transformer with decoders are not adequate for image matching due to their lack of image-to-image attention.
We propose a new simplified decoder, which drops the full attention implementation with the softmax weighting, keeping only the query-key similarity.
arXiv Detail & Related papers (2021-05-30T05:38:33Z) - AMR Parsing with Action-Pointer Transformer [18.382148821100152]
We propose a transition-based system that combines hard-attention over sentences with a target-side action pointer mechanism.
We show that our action-pointer approach leads to increased expressiveness and attains large gains over the best transition-based AMR parser.
arXiv Detail & Related papers (2021-04-29T22:01:41Z) - Online Back-Parsing for AMR-to-Text Generation [29.12944601513491]
AMR-to-text generation aims to recover a text containing the same meaning as an input AMR graph.
We propose a decoder that back-predicts projected AMR graphs on the target sentence during text generation.
arXiv Detail & Related papers (2020-10-09T12:08:14Z)