Accurate Word Alignment Induction from Neural Machine Translation
- URL: http://arxiv.org/abs/2004.14837v2
- Date: Thu, 3 Dec 2020 01:57:01 GMT
- Title: Accurate Word Alignment Induction from Neural Machine Translation
- Authors: Yun Chen, Yang Liu, Guanhua Chen, Xin Jiang, Qun Liu
- Abstract summary: We propose two novel word alignment induction methods Shift-Att and Shift-AET.
The main idea is to induce alignments at the step when the to-be-aligned target token is the decoder input.
Experiments on three publicly available datasets demonstrate that both methods perform better than their corresponding neural baselines.
- Score: 33.21196289328584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite its original goal to jointly learn to align and translate, prior
research suggests that Transformer captures poor word alignments through its
attention mechanism. In this paper, we show that attention weights DO capture
accurate word alignments and propose two novel word alignment induction methods
Shift-Att and Shift-AET. The main idea is to induce alignments at the step when
the to-be-aligned target token is the decoder input rather than the decoder
output as in previous work. Shift-Att is an interpretation method that induces
alignments from the attention weights of Transformer and does not require
parameter update or architecture change. Shift-AET extracts alignments from an
additional alignment module which is tightly integrated into Transformer and
trained in isolation with supervision from symmetrized Shift-Att alignments.
Experiments on three publicly available datasets demonstrate that both methods
perform better than their corresponding neural baselines and Shift-AET
significantly outperforms GIZA++ by 1.4-4.8 AER points.
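To make the induction step concrete, here is a minimal sketch of Shift-Att-style alignment extraction: read the cross-attention weights at the decoding step whose input is the to-be-aligned target token, and take the most-attended source position as the alignment link. This is an illustrative assumption-laden sketch, not the authors' released code; the layer choice, the function name shift_att_alignments, and the plain argmax heuristic are assumptions, and the paper additionally symmetrizes alignments induced from both translation directions.

```python
import numpy as np

def shift_att_alignments(cross_attn):
    """Induce word alignments in the Shift-Att style (illustrative sketch).

    cross_attn: array of shape (tgt_len + 1, src_len) with cross-attention
        weights from one chosen decoder layer under forced decoding, where
        row 0 is the step whose input is BOS and row j (j >= 1) is the step
        whose *input* is target token j.

    Returns a set of (source_position, target_position) links, both 0-indexed
    over the source and target sentences (BOS excluded).
    """
    tgt_len = cross_attn.shape[0] - 1
    links = set()
    for j in range(1, tgt_len + 1):
        # Read attention at the step where target token j is the decoder
        # *input* (the "shift"), not the step where it is the output.
        source_pos = int(np.argmax(cross_attn[j]))
        links.add((source_pos, j - 1))
    return links
```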
Related papers
- Are Transformers in Pre-trained LM A Good ASR Encoder? An Empirical Study [52.91899050612153]
This work studies transformers within pre-trained language models (PLMs) when repurposed as encoders for Automatic Speech Recognition (ASR).
Our findings reveal a notable improvement in Character Error Rate (CER) and Word Error Rate (WER) across diverse ASR tasks when transformers from pre-trained LMs are incorporated.
This underscores the potential of leveraging the semantic prowess embedded within pre-trained transformers to advance ASR systems' capabilities.
arXiv Detail & Related papers (2024-09-26T11:31:18Z) - A CTC Alignment-based Non-autoregressive Transformer for End-to-end
Automatic Speech Recognition [26.79184118279807]
We present a CTC Alignment-based Single-Step Non-Autoregressive Transformer (CASS-NAT) for end-to-end ASR.
Word embeddings in the autoregressive transformer (AT) are substituted with token-level acoustic embeddings (TAE) that are extracted from encoder outputs.
We find that CASS-NAT has a WER that is close to AT on various ASR tasks, while providing a 24x inference speedup.
arXiv Detail & Related papers (2023-04-15T18:34:29Z) - Inducing and Using Alignments for Transition-based AMR Parsing [51.35194383275297]
We propose a neural aligner for AMR that learns node-to-word alignments without relying on complex pipelines.
We attain a new state of the art for gold-only trained models, matching silver-trained performance without the need for beam search on AMR3.0.
arXiv Detail & Related papers (2022-05-03T12:58:36Z) - AMR Parsing with Action-Pointer Transformer [18.382148821100152]
We propose a transition-based system that combines hard-attention over sentences with a target-side action pointer mechanism.
We show that our action-pointer approach leads to increased expressiveness and attains large gains against the best transition-based AMR parsers.
arXiv Detail & Related papers (2021-04-29T22:01:41Z) - Label-Synchronous Speech-to-Text Alignment for ASR Using Forward and
Backward Transformers [49.403414751667135]
This paper proposes a novel label-synchronous speech-to-text alignment technique for automatic speech recognition (ASR).
The proposed method re-defines the speech-to-text alignment as a label-synchronous text mapping problem.
Experiments using the corpus of spontaneous Japanese (CSJ) demonstrate that the proposed method provides an accurate utterance-wise alignment.
arXiv Detail & Related papers (2021-04-21T03:05:12Z) - Demystifying the Better Performance of Position Encoding Variants for
Transformer [12.503079503907989]
We show how to encode position and segment into Transformer models.
The proposed method performs on par with SOTA on GLUE, XTREME and WMT benchmarks while saving costs.
arXiv Detail & Related papers (2021-04-18T03:44:57Z) - Leveraging Neural Machine Translation for Word Alignment [0.0]
A machine translation (MT) system is able to produce word alignments using its trained attention heads.
This is convenient because word alignment is theoretically a viable byproduct of any attention-based NMT system.
We summarize different approaches on how word alignments can be extracted from alignment scores and then explore ways in which scores can be extracted from NMT; a minimal symmetrization sketch appears after this list.
arXiv Detail & Related papers (2021-03-31T17:51:35Z) - Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model this reordering information for the Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z) - Fixed Encoder Self-Attention Patterns in Transformer-Based Machine
Translation [73.11214377092121]
We propose to replace all but one attention head of each encoder layer with simple fixed -- non-learnable -- attentive patterns.
Our experiments with different data sizes and multiple language pairs show that fixing the attention heads on the encoder side of the Transformer at training time does not impact the translation quality.
arXiv Detail & Related papers (2020-02-24T13:53:06Z)
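For the symmetrization step referenced above (the Shift-AET abstract's "symmetrized Shift-Att alignments" and the alignment-extraction survey entry), a standard heuristic is grow-diag over the two directional alignment sets. The sketch below is the generic grow-diag recipe, assumed here purely for illustration; the papers above do not state in these summaries which symmetrization variant they use.

```python
def grow_diag(forward, backward, src_len, tgt_len):
    """Symmetrize two directional alignments with the grow-diag heuristic.

    forward:  set of (src_idx, tgt_idx) links from the source-to-target model
    backward: set of (src_idx, tgt_idx) links from the target-to-source model,
              mapped into the same (source index, target index) convention
    """
    neighbors = [(-1, 0), (0, -1), (1, 0), (0, 1),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
    alignment = forward & backward          # start from the intersection
    union = forward | backward
    added = True
    while added:                            # grow towards the union
        added = False
        for i, j in sorted(alignment):
            for di, dj in neighbors:
                ni, nj = i + di, j + dj
                if not (0 <= ni < src_len and 0 <= nj < tgt_len):
                    continue
                if (ni, nj) not in union or (ni, nj) in alignment:
                    continue
                # Add a neighboring union link only if it covers a source or
                # target word that is still unaligned.
                src_unaligned = all(a != ni for a, _ in alignment)
                tgt_unaligned = all(b != nj for _, b in alignment)
                if src_unaligned or tgt_unaligned:
                    alignment.add((ni, nj))
                    added = True
    return alignment
```

In the Shift-Att setting, forward would hold the links induced from the source-to-target model and backward the links from the target-to-source model, re-indexed into the same (source, target) convention before symmetrization.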