AMR Parsing with Causal Hierarchical Attention and Pointers
- URL: http://arxiv.org/abs/2310.11964v1
- Date: Wed, 18 Oct 2023 13:44:26 GMT
- Title: AMR Parsing with Causal Hierarchical Attention and Pointers
- Authors: Chao Lou, Kewei Tu
- Abstract summary: We introduce new target forms of AMR parsing and a novel model, CHAP, which is equipped with causal hierarchical attention and the pointer mechanism.
Experiments show that our model outperforms baseline models on four out of five benchmarks when no additional training data is used.
- Score: 54.382865897298046
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Translation-based AMR parsers have recently gained popularity due to their
simplicity and effectiveness. They predict linearized graphs as free texts,
avoiding explicit structure modeling. However, this simplicity neglects
structural locality in AMR graphs and introduces unnecessary tokens to
represent coreferences. In this paper, we introduce new target forms of AMR
parsing and a novel model, CHAP, which is equipped with causal hierarchical
attention and the pointer mechanism, enabling the integration of structures
into the Transformer decoder. We empirically explore various alternative
modeling options. Experiments show that our model outperforms baseline models
on four out of five benchmarks when no additional training data is used.
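The pointer mechanism mentioned above lets the decoder refer back to an earlier position instead of repeating tokens for a co-referring AMR variable. A minimal sketch of that idea, assuming a scaled dot-product scorer over past decoder states (illustrative only, not CHAP's actual architecture):

```python
import torch
import torch.nn.functional as F

def pointer_distribution(decoder_states: torch.Tensor, step: int) -> torch.Tensor:
    """Score earlier decoder positions as coreference targets for position `step`.

    decoder_states: (seq_len, d_model) hidden states produced so far.
    Returns a probability distribution over positions 0..step-1, so a re-mentioned
    AMR variable can be generated as "point back to position i" rather than as
    an extra variable token.
    """
    query = decoder_states[step]                     # (d_model,)
    keys = decoder_states[:step]                     # (step, d_model)
    scores = keys @ query / keys.shape[-1] ** 0.5    # scaled dot-product scores
    return F.softmax(scores, dim=-1)

states = torch.randn(6, 8)                           # 6 decoding steps, width 8
print(pointer_distribution(states, step=5))          # probs over positions 0..4
```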
Related papers
- Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation [54.50526986788175]
Recent advances in efficient sequence modeling have led to attention-free layers, such as Mamba, RWKV, and various gated RNNs.
We present a unified view of these models, formulating such layers as implicit causal self-attention layers.
Our framework compares the underlying mechanisms on similar grounds for different layers and provides a direct means for applying explainability methods.
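As a rough illustration of the implicit-attention view (a simplified sketch assuming a scalar gated linear recurrence, not the paper's general formulation), unrolling h_t = a_t * h_{t-1} + b_t * x_t yields a lower-triangular mixing matrix that plays the role of a causal attention map:

```python
import torch

def implicit_attention_matrix(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Unroll h_t = a_t * h_{t-1} + b_t * x_t into an explicit matrix A with
    A[t, s] = b_s * prod(a[s+1..t]) for s <= t, so that h = A @ x.
    a, b: (T,) per-step gate values. Returns a lower-triangular (T, T) matrix."""
    T = a.shape[0]
    A = torch.zeros(T, T)
    for t in range(T):
        for s in range(t + 1):
            A[t, s] = b[s] * torch.prod(a[s + 1:t + 1])  # empty product = 1
    return A

# sanity check: the unrolled matrix reproduces the recurrence
a, b, x = torch.rand(5), torch.rand(5), torch.randn(5)
h, hs = torch.tensor(0.0), []
for t in range(5):
    h = a[t] * h + b[t] * x[t]
    hs.append(h)
assert torch.allclose(torch.stack(hs), implicit_attention_matrix(a, b) @ x)
```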
arXiv Detail & Related papers (2024-05-26T09:57:45Z)
- AMRs Assemble! Learning to Ensemble with Autoregressive Models for AMR Parsing [38.731641198934646]
We show how ensemble models can exploit weaknesses of the SMATCH metric to obtain higher scores, but sometimes produce corrupted graphs.
We propose two novel ensemble strategies based on Transformer models, improving robustness to structural constraints, while also reducing computational time.
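As a generic point of reference (not the paper's Transformer-based strategies), the simplest graph ensembling is majority voting over the triples of the candidate parses, which also exposes the corruption risk described above:

```python
from collections import Counter

def triple_vote(candidate_graphs, min_votes=2):
    """Keep every (source, relation, target) triple predicted by at least
    `min_votes` of the candidate AMR parses. Naive voting like this can raise
    triple-overlap scores such as SMATCH while producing disconnected or
    otherwise ill-formed graphs, the failure mode the paper targets."""
    counts = Counter(t for g in candidate_graphs for t in g)
    return {t for t, c in counts.items() if c >= min_votes}

g1 = {("want-01", ":ARG0", "boy"), ("want-01", ":ARG1", "go-02")}
g2 = {("want-01", ":ARG0", "boy"), ("want-01", ":ARG1", "leave-11")}
g3 = {("want-01", ":ARG0", "boy"), ("want-01", ":ARG1", "go-02")}
print(triple_vote([g1, g2, g3]))   # only the majority triples survive
```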
arXiv Detail & Related papers (2023-06-19T08:58:47Z)
- An AMR-based Link Prediction Approach for Document-level Event Argument Extraction [51.77733454436013]
Recent works have introduced Abstract Meaning Representation (AMR) for Document-level Event Argument Extraction (Doc-level EAE).
This work reformulates EAE as a link prediction problem on AMR graphs.
We propose a novel graph structure, Tailored AMR Graph (TAG), which compresses less informative subgraphs and edge types, integrates span information, and highlights surrounding events in the same document.
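Framing EAE as link prediction means scoring candidate edges between an event trigger node and argument-span nodes of the AMR graph. A toy bilinear scorer (node encodings, dimensions, and role inventory are illustrative assumptions, not the paper's TAG encoder):

```python
import torch
import torch.nn as nn

class EdgeScorer(nn.Module):
    """Bilinear link predictor: score(trigger, node, role) = trigger^T W_role node."""
    def __init__(self, dim: int, num_roles: int):
        super().__init__()
        self.W = nn.Parameter(torch.randn(num_roles, dim, dim) * 0.02)

    def forward(self, trigger: torch.Tensor, nodes: torch.Tensor) -> torch.Tensor:
        # trigger: (dim,), nodes: (num_nodes, dim) -> (num_nodes, num_roles) scores;
        # argmax over roles (plus a no-edge threshold) gives the predicted links.
        return torch.einsum("d,rde,ne->nr", trigger, self.W, nodes)

scorer = EdgeScorer(dim=16, num_roles=4)
print(scorer(torch.randn(16), torch.randn(10, 16)).shape)   # torch.Size([10, 4])
```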
arXiv Detail & Related papers (2023-05-30T16:07:48Z)
- DiffusER: Discrete Diffusion via Edit-based Reconstruction [88.62707047517914]
DiffusER is an edit-based generative model for text based on denoising diffusion models.
It can rival autoregressive models on several tasks spanning machine translation, summarization, and style transfer.
It can also perform other varieties of generation that standard autoregressive models are not well-suited for.
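A toy picture of edit-based reconstruction, under the simplifying assumption that corruption is random token deletion and the reverse process is an oracle that reads edits off the target (DiffusER instead learns to predict such edits over multiple denoising steps):

```python
import random

def corrupt(tokens, drop_prob=0.3, seed=0):
    """Forward (corruption) step of the toy setup: randomly delete tokens."""
    rng = random.Random(seed)
    return [t for t in tokens if rng.random() > drop_prob]

def oracle_edits(corrupted, target):
    """Reverse step of the toy setup: the KEEP/INSERT edits that rebuild `target`
    from `corrupted`. A trained model would predict such edits instead."""
    edits, i = [], 0
    for tok in target:
        if i < len(corrupted) and corrupted[i] == tok:
            edits.append(("KEEP", tok))
            i += 1
        else:
            edits.append(("INSERT", tok))
    return edits

sentence = "the boy wants to go".split()
noisy = corrupt(sentence)
print(noisy, oracle_edits(noisy, sentence))
```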
arXiv Detail & Related papers (2022-10-30T16:55:23Z)
- The m-connecting imset and factorization for ADMG models [10.839217026568784]
Acyclic directed mixed graph (ADMG) models characterize margins of DAG models.
ADMG models have not seen widespread use due to their complexity and a shortage of statistical tools for their analysis.
We introduce the m-connecting imset which provides an alternative representation for the independence models induced by ADMGs.
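For readers unfamiliar with imsets, the standard background from Studený's framework (this is not the paper's m-connecting construction itself): an imset over a variable set N is an integer-valued function on the power set of N, and the elementary imset encoding the conditional independence of a and b given C is

```latex
% \delta_S denotes the indicator function of the subset S \subseteq N
\[
  u_{\langle a, b \mid C \rangle}
    \;=\; \delta_{\{a,b\} \cup C} + \delta_{C}
    \;-\; \delta_{\{a\} \cup C} - \delta_{\{b\} \cup C}.
\]
```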
arXiv Detail & Related papers (2022-07-18T22:29:15Z)
- Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing [20.67024416678313]
We explore the integration of general pre-trained sequence-to-sequence language models and a structure-aware transition-based approach.
We propose a simplified transition set, designed to better exploit pre-trained language models for structured fine-tuning.
We show that the proposed parsing architecture retains the desirable properties of previous transition-based approaches, while being simpler and reaching the new state of the art for AMR 2.0, without the need for graph re-categorization.
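To make the transition-based setup concrete, here is a toy transition system (the action names and inventory are illustrative, not the paper's simplified transition set): actions consume source tokens and incrementally add graph nodes and edges.

```python
def apply_actions(tokens, actions):
    """Toy transition-based graph builder: SHIFT consumes the next source token,
    ("NODE", concept) adds a concept node, ("EDGE", rel, head, dep) links two
    existing nodes by index. Real systems add alignment and pointer actions."""
    nodes, edges, buffer = [], [], list(tokens)
    for act in actions:
        if act == "SHIFT":
            buffer.pop(0)
        elif act[0] == "NODE":
            nodes.append(act[1])
        elif act[0] == "EDGE":
            edges.append((nodes[act[2]], act[1], nodes[act[3]]))
    return nodes, edges

tokens = "the boy wants to go".split()
actions = ["SHIFT", "SHIFT", ("NODE", "boy"), "SHIFT", ("NODE", "want-01"),
           "SHIFT", "SHIFT", ("NODE", "go-02"),
           ("EDGE", ":ARG0", 1, 0), ("EDGE", ":ARG1", 1, 2)]
print(apply_actions(tokens, actions))
```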
arXiv Detail & Related papers (2021-10-29T04:36:31Z)
- Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept [56.46135010588918]
We prove that the widely used class of RNN-Transducer models and segmental models (direct HMM) are equivalent.
It is shown that blank probabilities translate into segment length probabilities and vice versa.
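A simplified reading of that correspondence (assuming, unlike the general proof, that blank probabilities factor per frame): emitting blanks for k-1 frames and then a label corresponds to a segment of length k, so

```latex
% b_t = probability of emitting blank at frame t since the last label emission
\[
  P(\text{segment length} = k) \;=\; \Bigl(\prod_{t=1}^{k-1} b_t\Bigr)\,(1 - b_k),
\]
% and, conversely, segment-length probabilities determine the blank probabilities.
```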
arXiv Detail & Related papers (2021-04-13T11:20:48Z)
- Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation [56.73834525802723]
Lightweight Dynamic Graph Convolutional Networks (LDGCNs) are proposed.
LDGCNs capture richer non-local interactions by synthesizing higher order information from the input graphs.
We develop two novel parameter-saving strategies based on group graph convolutions and weight-tied convolutions to reduce memory usage and model complexity.
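A rough sketch of those two ideas (dimensions and API are assumptions for illustration, not the LDGCN implementation): a grouped graph convolution splits channels into groups with separate small weight matrices, and weight tying reuses the same module across layers.

```python
import torch
import torch.nn as nn

class GroupGraphConv(nn.Module):
    """Toy grouped graph convolution: each channel group gets its own small
    weight matrix, then features are aggregated over a row-normalized adjacency.
    Reusing this one module across layers corresponds to weight tying."""
    def __init__(self, dim: int, groups: int):
        super().__init__()
        assert dim % groups == 0
        self.groups = groups
        self.weights = nn.ParameterList(
            [nn.Parameter(torch.randn(dim // groups, dim // groups) * 0.02)
             for _ in range(groups)]
        )

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (num_nodes, dim), adj: (num_nodes, num_nodes), row-normalized
        chunks = x.chunk(self.groups, dim=-1)
        mixed = torch.cat([c @ w for c, w in zip(chunks, self.weights)], dim=-1)
        return torch.relu(adj @ mixed)

layer = GroupGraphConv(dim=16, groups=4)
x, adj = torch.randn(5, 16), torch.full((5, 5), 0.2)
h = layer(x, adj)        # first layer
h = layer(h, adj)        # weight-tied second layer reuses the same parameters
```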
arXiv Detail & Related papers (2020-10-09T06:03:46Z)
- GPT-too: A language-model-first approach for AMR-to-text generation [22.65728041544785]
We propose an approach that combines a strong pre-trained language model with cycle consistency-based re-scoring.
Despite the simplicity of the approach, our experimental results show these models outperform all previous techniques.
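A schematic of cycle-consistency re-scoring (the generator, parser, and similarity function below are stand-ins, not the paper's components): generate several candidate sentences from the input AMR, re-parse each, and keep the one whose parse best matches the input.

```python
def cycle_rescore(amr, generate_candidates, parse, graph_similarity):
    """Pick the generated sentence whose re-parsed AMR is most similar to the
    input graph. The three callables stand in for a pretrained generator
    (e.g. beam search outputs), an AMR parser, and a Smatch-style metric."""
    scored = [(graph_similarity(amr, parse(sent)), sent)
              for sent in generate_candidates(amr)]
    return max(scored)[1]

# toy stand-ins so the sketch runs end to end
gen = lambda amr: ["the boy wants to go", "a boy wanted going"]
parse = lambda s: {("want-01", ":ARG0", "boy")} if "wants" in s else set()
sim = lambda g1, g2: len(g1 & g2) / max(len(g1 | g2), 1)
print(cycle_rescore({("want-01", ":ARG0", "boy")}, gen, parse, sim))
```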
arXiv Detail & Related papers (2020-05-18T22:50:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.