SG-Net: Syntax Guided Transformer for Language Representation
- URL: http://arxiv.org/abs/2012.13915v2
- Date: Thu, 7 Jan 2021 05:48:45 GMT
- Title: SG-Net: Syntax Guided Transformer for Language Representation
- Authors: Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao, Rui Wang
- Abstract summary: We propose using syntax to guide the text modeling by incorporating explicit syntactic constraints into attention mechanisms for better linguistically motivated word representations.
In detail, for the self-attention network (SAN)-based Transformer encoder, we introduce a syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention.
Experiments on popular benchmark tasks, including machine reading comprehension, natural language inference, and neural machine translation, show the effectiveness of the proposed SG-Net design.
- Score: 58.35672033887343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding human language is one of the key themes of artificial
intelligence. For language representation, the capacity to effectively model
linguistic knowledge from detail-riddled, lengthy texts while filtering out
noise is essential for strong performance. Traditional attentive models attend
to all words without explicit constraint, which results in misplaced
concentration on dispensable words. In this work, we propose using syntax to
guide text modeling by incorporating explicit syntactic constraints into
attention mechanisms for better linguistically motivated word representations.
In detail, for the self-attention network (SAN)-based Transformer encoder, we
introduce a syntactic dependency of interest (SDOI) design into the SAN to form
an SDOI-SAN with syntax-guided self-attention. The syntax-guided network
(SG-Net) is then composed of this extra SDOI-SAN and the SAN of the original
Transformer encoder through a dual contextual architecture for better
linguistically inspired representations. The proposed SG-Net is applied to
typical Transformer encoders. Extensive experiments on popular benchmark tasks,
including machine reading comprehension, natural language inference, and
neural machine translation, show the effectiveness of the proposed SG-Net
design.
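To make the SDOI-SAN idea more concrete, the sketch below implements one plausible reading of syntax-guided self-attention: each token is restricted to attend to its syntactic dependencies of interest, taken here to be the token itself and its ancestor head words in a dependency parse. This is a minimal, single-head PyTorch sketch under those assumptions; the function names and the exact masking rule are illustrative, not the authors' released implementation.

```python
# Minimal sketch of syntax-guided self-attention (SDOI-SAN style).
# Assumption: the "dependency of interest" for token i is token i itself plus
# its ancestor head words in a dependency parse. Names are illustrative.
import math
import torch
import torch.nn.functional as F

def sdoi_mask(heads: list[int], seq_len: int) -> torch.Tensor:
    """Return a (seq_len, seq_len) 0/1 mask where entry (i, j) is 1 iff
    token j is token i itself or one of its ancestors in the dependency
    tree. `heads[i]` is the index of token i's head word, -1 for the root."""
    mask = torch.zeros(seq_len, seq_len)
    for i in range(seq_len):
        j = i
        while j != -1:          # walk up the (assumed well-formed) tree
            mask[i, j] = 1.0
            j = heads[j]
    return mask

def syntax_guided_attention(q, k, v, heads):
    """Single-head scaled dot-product attention restricted by the SDOI mask.
    q, k, v: (seq_len, d) tensors; heads: dependency head index per token."""
    seq_len, d = q.shape
    scores = (q @ k.transpose(0, 1)) / math.sqrt(d)
    mask = sdoi_mask(heads, seq_len)
    scores = scores.masked_fill(mask == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

# Toy usage: "The cat sat", with "The" -> "cat" -> "sat" -> ROOT.
heads = [1, 2, -1]
q = k = v = torch.randn(3, 8)
out = syntax_guided_attention(q, k, v, heads)   # (3, 8) syntax-guided output
```

In the full SG-Net, the output of such a syntax-guided branch would be combined with the vanilla SAN output through the dual contextual architecture described in the abstract (e.g., via a learned aggregation); that step is omitted from this sketch.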
Related papers
- SignAttention: On the Interpretability of Transformer Models for Sign Language Translation [2.079808290618441]
This paper presents the first comprehensive interpretability analysis of a Transformer-based Sign Language Translation model.
We examine the attention mechanisms within the model to understand how it processes and aligns visual input with sequential glosses.
This work contributes to a deeper understanding of SLT models, paving the way for the development of more transparent and reliable translation systems.
arXiv Detail & Related papers (2024-10-18T14:38:37Z)
- Language-Oriented Communication with Semantic Coding and Knowledge Distillation for Text-to-Image Generation [53.97155730116369]
We put forward a novel framework of language-oriented semantic communication (LSC).
In LSC, machines communicate using human language messages that can be interpreted and manipulated via natural language processing (NLP) techniques for SC efficiency.
We introduce three innovative algorithms: 1) semantic source coding (SSC), which compresses a text prompt into its key head words capturing the prompt's syntactic essence; 2) semantic channel coding (SCC), which improves robustness against errors by substituting head words with their lengthier synonyms; and 3) semantic knowledge distillation (SKD), which produces listener-customized prompts via in-context learning the listener's
arXiv Detail & Related papers (2023-09-20T08:19:05Z)
- Bridging the Modality Gap for Speech-to-Text Translation [57.47099674461832]
End-to-end speech translation aims to translate speech in one language into text in another language via an end-to-end way.
Most existing methods employ an encoder-decoder structure with a single encoder to learn acoustic representation and semantic information simultaneously.
We propose a Speech-to-Text Adaptation for Speech Translation model which aims to improve the end-to-end model performance by bridging the modality gap between speech and text.
arXiv Detail & Related papers (2020-10-28T12:33:04Z)
- How Does Selective Mechanism Improve Self-Attention Networks? [57.75314746470783]
Self-attention networks (SANs) with a selective mechanism have produced substantial improvements in various NLP tasks.
In this paper, we assess the strengths of selective SANs (SSANs), which are implemented with a flexible and universal Gumbel-Softmax.
We empirically validate that the improvement of SSANs can be attributed in part to mitigating two commonly-cited weaknesses of SANs: word order encoding and structure modeling.
arXiv Detail & Related papers (2020-05-03T04:18:44Z)
- Semantics-Aware Inferential Network for Natural Language Understanding [79.70497178043368]
We propose a Semantics-Aware Inferential Network (SAIN) to meet such a motivation.
Taking explicit contextualized semantics as a complementary input, the inferential module of SAIN enables a series of reasoning steps over semantic clues.
Our model achieves significant improvement on 11 tasks including machine reading comprehension and natural language inference.
arXiv Detail & Related papers (2020-04-28T07:24:43Z)
- Bi-Decoder Augmented Network for Neural Machine Translation [108.3931242633331]
We propose a novel Bi-Decoder Augmented Network (BiDAN) for the neural machine translation task.
Since each decoder transforms the representations of the input text into its corresponding language, jointly training with two target ends gives the shared encoder the potential to produce a language-independent semantic space.
arXiv Detail & Related papers (2020-01-14T02:05:14Z)