Integrating Dependency Tree Into Self-attention for Sentence
Representation
- URL: http://arxiv.org/abs/2203.05918v1
- Date: Fri, 11 Mar 2022 13:44:41 GMT
- Title: Integrating Dependency Tree Into Self-attention for Sentence
Representation
- Authors: Junhua Ma, Jiajun Li, Yuxuan Liu, Shangbo Zhou, Xue Li
- Abstract summary: We propose Dependency-Transformer, which applies a relation-attention mechanism that works in concert with the self-attention mechanism.
Using a score-based method, we inject the syntactic information without affecting the Transformer's parallelizability.
Our model outperforms or is comparable to the state-of-the-art methods on four tasks for sentence representation.
- Score: 9.676884071651205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent progress on parse tree encoders for sentence representation
learning is notable. However, these works mainly encode tree structures
recursively, which is not conducive to parallelization. Moreover, they rarely
take into account the labels of arcs in dependency trees. To address both
issues, we propose Dependency-Transformer, which applies a relation-attention
mechanism that works in concert with the self-attention mechanism. This
mechanism aims to encode the dependency and spatial positional relations
between nodes in the dependency tree of a sentence. Using a score-based
method, we inject the syntactic information without affecting the
Transformer's parallelizability. Our model outperforms or is comparable to
state-of-the-art methods on four sentence-representation tasks and has clear
advantages in computational efficiency.
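As one way to picture the score-based injection described in the abstract, the sketch below adds a syntax-derived bias to ordinary scaled dot-product attention: pairwise dependency-relation labels and tree distances are embedded as scalar scores and summed with the content scores before the softmax, so the whole computation stays parallel over the sentence. The module name, the inputs rel_ids and tree_dist, and the scalar-embedding design are assumptions of this sketch; it is not the authors' exact Dependency-Transformer formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationBiasedSelfAttention(nn.Module):
    """Single-head self-attention with an additive, syntax-derived score bias.

    Pairwise features precomputed from a dependency parse (arc-label ids and
    clipped tree distances) are embedded as scalar scores and added to the
    usual dot-product scores, so syntax is injected without breaking the
    parallel matrix computation. Illustrative sketch only.
    """

    def __init__(self, d_model, n_relations, max_tree_dist):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.rel_score = nn.Embedding(n_relations, 1)          # one score per arc label
        self.dist_score = nn.Embedding(max_tree_dist + 1, 1)   # one score per tree distance
        self.scale = d_model ** 0.5

    def forward(self, x, rel_ids, tree_dist):
        # x:         (batch, seq, d_model) token representations
        # rel_ids:   (batch, seq, seq) arc-label id per token pair (0 = no arc)
        # tree_dist: (batch, seq, seq) tree path length, clipped to max_tree_dist
        q, k, v = self.q(x), self.k(x), self.v(x)
        content = torch.matmul(q, k.transpose(-2, -1)) / self.scale
        syntax = self.rel_score(rel_ids).squeeze(-1) + self.dist_score(tree_dist).squeeze(-1)
        attn = F.softmax(content + syntax, dim=-1)
        return torch.matmul(attn, v)
```

In a full encoder, heads like this would replace or accompany the standard ones, with rel_ids and tree_dist precomputed once per sentence from an external dependency parser.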
Related papers
- Dependency Graph Parsing as Sequence Labeling [18.079016557290338]
We define a range of unbounded and bounded linearizations that can be used to cast graph parsing as a tagging task.
Experimental results on semantic dependency and enhanced UD parsing show that, with a good choice of encoding, sequence-labeling dependency graph parsers combine high efficiency with accuracies close to the state of the art (a toy linearization in this spirit is sketched after this list).
arXiv Detail & Related papers (2024-10-23T15:37:02Z)
- Entity-Aware Self-Attention and Contextualized GCN for Enhanced Relation Extraction in Long Sentences [5.453850739960517]
We propose a novel model, Entity-aware Self-attention Contextualized GCN (ESC-GCN), which efficiently incorporates syntactic structure of input sentences and semantic context of sequences.
Our model achieves encouraging performance as compared to existing dependency-based and sequence-based models.
arXiv Detail & Related papers (2024-09-15T10:50:51Z)
- Structured Dialogue Discourse Parsing [79.37200787463917]
Discourse parsing aims to uncover the internal structure of a multi-participant conversation.
We propose a principled method that improves upon previous work from two perspectives: encoding and decoding.
Experiments show that our method achieves new state-of-the-art, surpassing the previous model by 2.3 on STAC and 1.5 on Molweni.
arXiv Detail & Related papers (2023-06-26T22:51:01Z)
- Hexatagging: Projective Dependency Parsing as Tagging [63.5392760743851]
We introduce a novel dependency parser, the hexatagger, which constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags.
Our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other.
We achieve state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test set.
arXiv Detail & Related papers (2023-06-08T18:02:07Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore utilising higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- Combining Improvements for Exploiting Dependency Trees in Neural Semantic Parsing [1.0437764544103274]
In this paper, we examine three methods to incorporate such dependency information in a Transformer based semantic parsing system.
We first replace the standard self-attention heads in the encoder with parent-scaled self-attention (PASCAL) heads (a minimal sketch of such a head appears after this list).
We then add constituent attention (CA) to the encoder, which places an extra constraint on the attention heads so that they better capture the inherent dependency structure of input sentences.
arXiv Detail & Related papers (2021-12-25T03:41:42Z)
- Learning compositional structures for semantic graph parsing [81.41592892863979]
We show how AM dependency parsing can be trained directly with a neural latent-variable model.
Our model picks up on several linguistic phenomena on its own and achieves comparable accuracy to supervised training.
arXiv Detail & Related papers (2021-06-08T14:20:07Z)
- Coordinate Constructions in English Enhanced Universal Dependencies: Analysis and Computational Modeling [1.9950682531209154]
We address the representation of coordinate constructions in Enhanced Universal Dependencies (UD).
We create a large-scale dataset of manually edited syntax graphs.
We identify several systematic errors in the original data, and propose to also propagate adjuncts.
arXiv Detail & Related papers (2021-03-16T10:24:27Z)
- Please Mind the Root: Decoding Arborescences for Dependency Parsing [67.71280539312536]
We analyze the output of state-of-the-art parsers on many languages from the Universal Dependency Treebank.
The worst constraint-violation rate we observe is 24%.
arXiv Detail & Related papers (2020-10-06T08:31:14Z)
- Transformer-Based Neural Text Generation with Syntactic Guidance [0.0]
We study the problem of using (partial) constituency parse trees as syntactic guidance for controlled text generation.
Our method first expands a partial template parse tree to a full-fledged parse tree tailored for the input source text.
Our experiments in the controlled paraphrasing task show that our method outperforms SOTA models both semantically and syntactically.
arXiv Detail & Related papers (2020-10-05T01:33:58Z)
- Tree-structured Attention with Hierarchical Accumulation [103.47584968330325]
"Hierarchical Accumulation" encodes parse tree structures into self-attention at constant time complexity.
Our approach outperforms SOTA methods in four IWSLT translation tasks and the WMT'14 English-German translation task.
arXiv Detail & Related papers (2020-02-19T08:17:00Z)
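To make the "Dependency Graph Parsing as Sequence Labeling" entry above concrete, the toy encoding below turns a dependency graph into one tag per token by listing each token's incoming arcs as (relative head offset, relation) pairs. It is a deliberately simplified stand-in for that paper's bounded and unbounded linearizations; the tag format, the example sentence, and the arcs are all illustrative assumptions.

```python
# Toy linearization: one tag per token, listing incoming arcs as
# "relative_head_offset:relation" pairs joined by "|".
def encode(tokens, arcs):
    """arcs: (head_index, dependent_index, relation) triples;
    head_index -1 marks the root attachment."""
    tags = []
    for i, _ in enumerate(tokens):
        incoming = sorted((h - i, rel) for h, d, rel in arcs if d == i)
        tags.append("|".join(f"{off:+d}:{rel}" for off, rel in incoming) or "NONE")
    return tags

def decode(tags):
    arcs = []
    for i, tag in enumerate(tags):
        if tag == "NONE":
            continue
        for part in tag.split("|"):
            off, rel = part.split(":", 1)
            arcs.append((i + int(off), i, rel))
    return arcs

# Enhanced-UD-style graph for "She reads and writes": "She" is nsubj of both verbs.
tokens = ["She", "reads", "and", "writes"]
arcs = [(-1, 1, "root"), (1, 0, "nsubj"), (3, 0, "nsubj"), (1, 3, "conj"), (3, 2, "cc")]
tags = encode(tokens, arcs)   # ['+1:nsubj|+3:nsubj', '-2:root', '+1:cc', '-2:conj']
assert set(decode(tags)) == set(arcs)
```

Once the graph is flattened into tags like these, any off-the-shelf sequence labeller can predict them, which is where the efficiency of the tagging view comes from.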
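The parent-scaled self-attention (PASCAL) heads mentioned in the "Combining Improvements for Exploiting Dependency Trees in Neural Semantic Parsing" entry can be pictured roughly as below: each token's attention distribution is reweighted by a Gaussian centred on the position of that token's dependency parent. The single-head layout, the Gaussian form, and the fixed sigma are assumptions for this sketch and need not match the paper's exact variant.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParentScaledSelfAttention(nn.Module):
    """Illustrative single-head sketch of a parent-scaled (PASCAL-style) head.

    Each token's attention distribution is reweighted by a Gaussian centred on
    the position of that token's dependency parent, nudging the head toward
    syntactically related positions. Sketch only, not the paper's exact setup.
    """

    def __init__(self, d_model, sigma=1.0):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.sigma = sigma
        self.scale = d_model ** 0.5

    def forward(self, x, parent_pos):
        # x:          (batch, seq, d_model) token representations
        # parent_pos: (batch, seq) index of each token's dependency head
        n = x.size(1)
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = torch.matmul(q, k.transpose(-2, -1)) / self.scale   # (b, n, n)
        positions = torch.arange(n, device=x.device).view(1, 1, n)
        dist = (positions - parent_pos.unsqueeze(-1)).float()        # (b, n, n)
        gauss = torch.exp(-dist ** 2 / (2 * self.sigma ** 2))
        attn = F.softmax(scores, dim=-1) * gauss
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-9)
        return torch.matmul(attn, v)
```

Because the reweighting is only an elementwise multiplication followed by renormalisation, the batched, parallel computation of standard self-attention is preserved.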