Improving BERT with Syntax-aware Local Attention
- URL: http://arxiv.org/abs/2012.15150v1
- Date: Wed, 30 Dec 2020 13:29:58 GMT
- Title: Improving BERT with Syntax-aware Local Attention
- Authors: Zhongli Li, Qingyu Zhou, Chao Li, Ke Xu, Yunbo Cao
- Abstract summary: We propose a syntax-aware local attention, where the attention scopes are based on the distances in the syntactic structure.
We conduct experiments on various single-sentence benchmarks, including sentence classification and sequence labeling tasks.
Our model achieves better performance owing to more focused attention over syntactically relevant words.
- Score: 14.70545694771721
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained Transformer-based neural language models, such as BERT, have
achieved remarkable results on a variety of NLP tasks. Recent works have shown
that attention-based models can benefit from more focused attention over local
regions. Most of them restrict the attention scope to a linear span, or are
confined to certain tasks such as machine translation and question answering. In
this paper, we propose a syntax-aware local attention, where the attention
scopes are restrained based on distances in the syntactic structure. The
proposed syntax-aware local attention can be integrated with pre-trained
language models, such as BERT, to make the model focus on syntactically
relevant words. We conduct experiments on various single-sentence benchmarks,
including sentence classification and sequence labeling tasks. Experimental
results show consistent gains over BERT on all benchmark datasets. Extensive
studies verify that our model achieves better performance owing to more
focused attention over syntactically relevant words.
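
Below is a minimal sketch of how such syntax-aware local attention could be realized, assuming the attention scope is a threshold on dependency-tree distance between tokens. The names (`tree_distances`, `syntax_local_attention`, `max_dist`) and the head-array tree encoding are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: restrict self-attention to tokens within a dependency-tree
# distance threshold. Assumed convention: heads[i] is the head index of
# token i, and the root token points to itself.
from collections import deque

import torch
import torch.nn.functional as F


def tree_distances(heads):
    """Pairwise distances on the (undirected) dependency tree via BFS."""
    n = len(heads)
    adj = [[] for _ in range(n)]
    for i, h in enumerate(heads):
        if h != i:  # skip the root's self-loop
            adj[i].append(h)
            adj[h].append(i)
    dist = [[n] * n for _ in range(n)]  # n acts as "unreached"
    for src in range(n):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[src][v] == n:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return torch.tensor(dist)


def syntax_local_attention(q, k, v, heads, max_dist=2):
    """Scaled dot-product attention masked by syntactic distance."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5      # (n, n) attention scores
    mask = tree_distances(heads) > max_dist           # True = outside the scope
    scores = scores.masked_fill(mask, float("-inf"))  # block distant tokens
    return F.softmax(scores, dim=-1) @ v


# Toy usage: 5 tokens, token 2 is the dependency-tree root.
heads = [2, 2, 2, 2, 3]
q = k = v = torch.randn(5, 16)
out = syntax_local_attention(q, k, v, heads, max_dist=1)
```

Each token always attends at least to itself (distance 0), so the softmax is well defined even with a small `max_dist`; enlarging `max_dist` widens the syntactic neighborhood each token can attend to.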