Accurate Word Alignment Induction from Neural Machine Translation
- URL: http://arxiv.org/abs/2004.14837v2
- Date: Thu, 3 Dec 2020 01:57:01 GMT
- Title: Accurate Word Alignment Induction from Neural Machine Translation
- Authors: Yun Chen, Yang Liu, Guanhua Chen, Xin Jiang, Qun Liu
- Abstract summary: We propose two novel word alignment induction methods Shift-Att and Shift-AET.
The main idea is to induce alignments at the step when the to-be-aligned target token is the decoder input.
Experiments on three publicly available datasets demonstrate that both methods perform better than their corresponding neural baselines.
- Score: 33.21196289328584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite its original goal to jointly learn to align and translate, prior
research suggests that Transformer captures poor word alignments through its
attention mechanism. In this paper, we show that attention weights DO capture
accurate word alignments and propose two novel word alignment induction methods
Shift-Att and Shift-AET. The main idea is to induce alignments at the step when
the to-be-aligned target token is the decoder input rather than the decoder
output as in previous work. Shift-Att is an interpretation method that induces
alignments from the attention weights of Transformer and does not require
parameter update or architecture change. Shift-AET extracts alignments from an
additional alignment module which is tightly integrated into Transformer and
trained in isolation with supervision from symmetrized Shift-Att alignments.
Experiments on three publicly available datasets demonstrate that both methods
perform better than their corresponding neural baselines and Shift-AET
significantly outperforms GIZA++ by 1.4-4.8 AER points.
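To make the induction step concrete, here is a minimal sketch of Shift-Att-style alignment extraction: read the cross-attention weights at the decoding step whose input is the to-be-aligned target token, and take the most-attended source position as the alignment link. This is an illustrative assumption-laden sketch, not the authors' released code; the layer choice, the function name shift_att_alignments, and the plain argmax heuristic are assumptions, and the paper additionally symmetrizes alignments induced from both translation directions.

```python
import numpy as np

def shift_att_alignments(cross_attn):
    """Induce word alignments in the Shift-Att style (illustrative sketch).

    cross_attn: array of shape (tgt_len + 1, src_len) with cross-attention
        weights from one chosen decoder layer under forced decoding, where
        row 0 is the step whose input is BOS and row j (j >= 1) is the step
        whose *input* is target token j.

    Returns a set of (source_position, target_position) links, both 0-indexed
    over the source and target sentences (BOS excluded).
    """
    tgt_len = cross_attn.shape[0] - 1
    links = set()
    for j in range(1, tgt_len + 1):
        # Read attention at the step where target token j is the decoder
        # *input* (the "shift"), not the step where it is the output.
        source_pos = int(np.argmax(cross_attn[j]))
        links.add((source_pos, j - 1))
    return links
```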
Related papers
- Are Transformers in Pre-trained LM A Good ASR Encoder? An Empirical Study [52.91899050612153]
This work studies transformers within pre-trained language models (PLMs) when repurposed as encoders for Automatic Speech Recognition (ASR).
Our findings reveal a notable improvement in Character Error Rate (CER) and Word Error Rate (WER) across diverse ASR tasks when transformers from pre-trained LMs are incorporated.
This underscores the potential of leveraging the semantic prowess embedded within pre-trained transformers to advance ASR systems' capabilities.
arXiv Detail & Related papers (2024-09-26T11:31:18Z) - A CTC Alignment-based Non-autoregressive Transformer for End-to-end
Automatic Speech Recognition [26.79184118279807]
We present a CTC Alignment-based Single-Step Non-Autoregressive Transformer (CASS-NAT) for end-to-end ASR.
Word embeddings in the autoregressive transformer (AT) are substituted with token-level acoustic embeddings (TAE) that are extracted from encoder outputs.
We find that CASS-NAT has a WER that is close to AT on various ASR tasks, while providing a 24x inference speedup.
arXiv Detail & Related papers (2023-04-15T18:34:29Z) - Inducing and Using Alignments for Transition-based AMR Parsing [51.35194383275297]
We propose a neural aligner for AMR that learns node-to-word alignments without relying on complex pipelines.
We attain a new state of the art for gold-only trained models, matching silver-trained performance without the need for beam search on AMR3.0.
arXiv Detail & Related papers (2022-05-03T12:58:36Z) - AMR Parsing with Action-Pointer Transformer [18.382148821100152]
We propose a transition-based system that combines hard-attention over sentences with a target-side action pointer mechanism.
We show that our action-pointer approach leads to increased expressiveness and attains large gains against the best transition-based AMR parsers.
arXiv Detail & Related papers (2021-04-29T22:01:41Z) - Label-Synchronous Speech-to-Text Alignment for ASR Using Forward and
Backward Transformers [49.403414751667135]
This paper proposes a novel label-synchronous speech-to-text alignment technique for automatic speech recognition (ASR).
The proposed method re-defines the speech-to-text alignment as a label-synchronous text mapping problem.
Experiments using the corpus of spontaneous Japanese (CSJ) demonstrate that the proposed method provides an accurate utterance-wise alignment.
arXiv Detail & Related papers (2021-04-21T03:05:12Z) - Demystifying the Better Performance of Position Encoding Variants for
Transformer [12.503079503907989]
We show how to encode position and segment into Transformer models.
The proposed method performs on par with SOTA on GLUE, XTREME and WMT benchmarks while saving costs.
arXiv Detail & Related papers (2021-04-18T03:44:57Z) - Leveraging Neural Machine Translation for Word Alignment [0.0]
A machine translation (MT) system is able to produce word alignments using its trained attention heads.
This is convenient because word alignment is theoretically a viable byproduct of any attention-based NMT system.
We summarize different approaches on how word alignments can be extracted from alignment scores and then explore ways in which scores can be extracted from NMT; a minimal symmetrization sketch appears after this list.
arXiv Detail & Related papers (2021-03-31T17:51:35Z) - Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model this reordering information for the Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z) - Fixed Encoder Self-Attention Patterns in Transformer-Based Machine
Translation [73.11214377092121]
We propose to replace all but one attention head of each encoder layer with simple fixed -- non-learnable -- attentive patterns.
Our experiments with different data sizes and multiple language pairs show that fixing the attention heads on the encoder side of the Transformer at training time does not impact the translation quality.
arXiv Detail & Related papers (2020-02-24T13:53:06Z)
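For the symmetrization step referenced above (the Shift-AET abstract's "symmetrized Shift-Att alignments" and the alignment-extraction survey entry), a standard heuristic is grow-diag over the two directional alignment sets. The sketch below is the generic grow-diag recipe, assumed here purely for illustration; the papers above do not state in these summaries which symmetrization variant they use.

```python
def grow_diag(forward, backward, src_len, tgt_len):
    """Symmetrize two directional alignments with the grow-diag heuristic.

    forward:  set of (src_idx, tgt_idx) links from the source-to-target model
    backward: set of (src_idx, tgt_idx) links from the target-to-source model,
              mapped into the same (source index, target index) convention
    """
    neighbors = [(-1, 0), (0, -1), (1, 0), (0, 1),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
    alignment = forward & backward          # start from the intersection
    union = forward | backward
    added = True
    while added:                            # grow towards the union
        added = False
        for i, j in sorted(alignment):
            for di, dj in neighbors:
                ni, nj = i + di, j + dj
                if not (0 <= ni < src_len and 0 <= nj < tgt_len):
                    continue
                if (ni, nj) not in union or (ni, nj) in alignment:
                    continue
                # Add a neighboring union link only if it covers a source or
                # target word that is still unaligned.
                src_unaligned = all(a != ni for a, _ in alignment)
                tgt_unaligned = all(b != nj for _, b in alignment)
                if src_unaligned or tgt_unaligned:
                    alignment.add((ni, nj))
                    added = True
    return alignment
```

In the Shift-Att setting, forward would hold the links induced from the source-to-target model and backward the links from the target-to-source model, re-indexed into the same (source, target) convention before symmetrization.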