A Supervised Word Alignment Method based on Cross-Language Span
Prediction using Multilingual BERT
- URL: http://arxiv.org/abs/2004.14516v1
- Date: Wed, 29 Apr 2020 23:40:08 GMT
- Title: A Supervised Word Alignment Method based on Cross-Language Span
Prediction using Multilingual BERT
- Authors: Masaaki Nagata, Katsuki Chousa, Masaaki Nishino
- Abstract summary: We first formalize a word alignment problem as a collection of independent predictions from a token in the source sentence to a span in the target sentence.
We then solve this problem by using multilingual BERT, which is fine-tuned on manually created gold word alignment data.
We show that the proposed method significantly outperformed previous supervised and unsupervised word alignment methods without using any bitexts for pretraining.
- Score: 22.701728185474195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel supervised word alignment method based on cross-language
span prediction. We first formalize a word alignment problem as a collection of
independent predictions from a token in the source sentence to a span in the
target sentence. As this is equivalent to a SQuAD v2.0 style question answering
task, we then solve this problem by using multilingual BERT, which is
fine-tuned on manually created gold word alignment data. We greatly improved
the word alignment accuracy by adding the context of the token to the question.
In experiments using five word alignment datasets among Chinese, Japanese,
German, Romanian, French, and English, we show that the proposed method
significantly outperformed previous supervised and unsupervised word alignment
methods without using any bitexts for pretraining. For example, we achieved an
F1 score of 86.7 for the Chinese-English data, which is 13.3 points higher than
the previous state-of-the-art supervised methods.
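The abstract's framing maps naturally onto a standard extractive question-answering pipeline: the "question" is a source token presented in its sentence context, the "context" is the target sentence, and the "answer" is the aligned target span. The following is a minimal sketch of that framing, assuming a Hugging Face transformers setup; the boundary markers, model name, and example sentences are illustrative assumptions, and the fine-tuning on gold word alignment data that the paper relies on is deliberately omitted.

```python
# Minimal sketch: word alignment as SQuAD-style cross-language span prediction.
# Assumes Hugging Face transformers; "bert-base-multilingual-cased" is a stand-in
# and would need fine-tuning on gold word alignments before its outputs mean anything.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

MODEL = "bert-base-multilingual-cased"  # assumption: any multilingual BERT QA checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL)
model.eval()

def predict_target_span(src_sentence, src_token, tgt_sentence):
    """Predict the target-language span aligned to one source token.

    The 'question' is the source sentence with the query token wrapped in
    boundary markers (an assumption standing in for adding the token's
    context to the question); the 'context' is the target sentence, and
    the predicted answer span is the alignment.
    """
    marked = src_sentence.replace(src_token, f"¶ {src_token} ¶", 1)
    inputs = tokenizer(marked, tgt_sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    start = int(outputs.start_logits.argmax())
    end = int(outputs.end_logits.argmax())
    if end < start:  # simplified stand-in for SQuAD v2.0-style "no answer" handling
        return None
    span_ids = inputs["input_ids"][0][start : end + 1]
    return tokenizer.decode(span_ids)

# Illustrative example (German -> English); outputs are meaningless
# until the model has been fine-tuned on gold alignment data.
print(predict_target_span("Das Haus ist klein .", "Haus", "The house is small ."))
```

Fine-tuning on the gold alignments and aggregating the per-token span predictions into symmetrized word alignments are the parts this sketch leaves out.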
Related papers
- How Transliterations Improve Crosslingual Alignment [48.929677368744606]
Recent studies have shown that post-aligning multilingual pretrained language models (mPLMs) using alignment objectives can improve crosslingual alignment.
This paper attempts to explicitly evaluate the crosslingual alignment and identify the key elements in transliteration-based approaches that contribute to better performance.
arXiv Detail & Related papers (2024-09-25T20:05:45Z) - WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised
Span Prediction [31.96433679860807]
Most existing word alignment methods rely on manual alignment datasets or parallel corpora.
We relax the requirement for correct, fully-aligned, and parallel sentences.
We then use such a large-scale weakly-supervised dataset for word alignment pre-training via span prediction.
arXiv Detail & Related papers (2023-06-09T03:11:42Z) - Third-Party Aligner for Neural Word Alignments [18.745852103348845]
We propose to use word alignments generated by a third-party word aligner to supervise the neural word alignment training.
Experiments show that our approach can surprisingly do self-correction over the third-party supervision.
We achieve state-of-the-art word alignment performance, with alignment error rates that are on average more than two points lower than those of the best third-party aligner.
arXiv Detail & Related papers (2022-11-08T12:30:08Z) - Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word
Alignment [49.45399359826453]
Cross-lingual language models are typically pretrained with language modeling on multilingual text or parallel sentences.
We introduce denoising word alignment as a new cross-lingual pre-training task.
Experimental results show that our method improves cross-lingual transferability on various datasets.
arXiv Detail & Related papers (2021-06-11T13:36:01Z) - Fake it Till You Make it: Self-Supervised Semantic Shifts for
Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z) - Word Alignment by Fine-tuning Embeddings on Parallel Corpora [96.28608163701055]
Word alignment over parallel corpora has a wide variety of applications, including learning translation lexicons, cross-lingual transfer of language processing tools, and automatic evaluation or analysis of translation outputs.
Recently, other work has demonstrated that pre-trained contextualized word embeddings derived from multilingually trained language models (LMs) prove an attractive alternative, achieving competitive results on the word alignment task even in the absence of explicit training on parallel data.
In this paper, we examine methods to marry the two approaches: leveraging pre-trained LMs and fine-tuning them on parallel text with objectives designed to improve alignment quality; a minimal embedding-similarity sketch of the unsupervised starting point appears after this list.
arXiv Detail & Related papers (2021-01-20T17:54:47Z) - Subword Sampling for Low Resource Word Alignment [4.663577299263155]
We propose subword sampling-based alignment of text units.
We show that the subword sampling method consistently outperforms word-level alignment on six language pairs.
arXiv Detail & Related papers (2020-12-21T19:47:04Z) - Cross-lingual Alignment Methods for Multilingual BERT: A Comparative
Study [2.101267270902429]
We analyse how different forms of cross-lingual supervision and various alignment methods influence the transfer capability of mBERT in zero-shot setting.
We find that supervision from parallel corpus is generally superior to dictionary alignments.
arXiv Detail & Related papers (2020-09-29T20:56:57Z) - On the Language Neutrality of Pre-trained Multilingual Representations [70.93503607755055]
We investigate the language-neutrality of multilingual contextual embeddings directly and with respect to lexical semantics.
Our results show that contextual embeddings are more language-neutral and, in general, more informative than aligned static word-type embeddings.
We show how to reach state-of-the-art accuracy on language identification and match the performance of statistical methods for word alignment of parallel sentences.
arXiv Detail & Related papers (2020-04-09T19:50:32Z) - Multilingual Alignment of Contextual Word Representations [49.42244463346612]
BERT exhibits significantly improved zero-shot performance on XNLI compared to the base model.
We introduce a contextual version of word retrieval and show that it correlates well with downstream zero-shot transfer.
These results support contextual alignment as a useful concept for understanding large multilingual pre-trained models.
arXiv Detail & Related papers (2020-02-10T03:27:21Z)