Neural Baselines for Word Alignment
- URL: http://arxiv.org/abs/2009.13116v1
- Date: Mon, 28 Sep 2020 07:51:03 GMT
- Title: Neural Baselines for Word Alignment
- Authors: Anh Khoa Ngo Ho (LIMSI), François Yvon
- Abstract summary: We study and evaluate neural models for unsupervised word alignment for four language pairs.
We show that neural versions of the IBM-1 and hidden Markov models vastly outperform their discrete counterparts.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Word alignments identify translational correspondences between words in a
parallel sentence pair and are used, for instance, to learn bilingual
dictionaries, to train statistical machine translation systems, or to perform
quality estimation. In most areas of natural language processing, neural
network models nowadays constitute the preferred approach, a situation that
might also apply to word alignment models. In this work, we study and
comprehensively evaluate neural models for unsupervised word alignment for four
language pairs, contrasting several variants of neural models. We show that in
most settings, neural versions of the IBM-1 and hidden Markov models vastly
outperform their discrete counterparts. We also analyze typical alignment
errors of the baselines that our models overcome to illustrate both the benefits
and the limitations of these new models for morphologically rich languages.
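As a rough illustration of the kind of model evaluated here, the sketch below parameterises an IBM-1-style lexical translation table with word embeddings and a small network, and reads hard alignments off the per-position scores. This is a minimal sketch, assuming PyTorch; the class name, dimensions, and training setup are illustrative assumptions, not the authors' exact architecture (which also includes HMM-style variants).

```python
# Minimal sketch of a neural IBM-1 style word alignment model.
# Assumption: PyTorch; vocabulary sizes and dimensions are illustrative only.
import torch
import torch.nn as nn


class NeuralIBM1(nn.Module):
    """p(target word | source word) parameterised by embeddings + MLP
    instead of a discrete translation table."""

    def __init__(self, src_vocab, tgt_vocab, dim=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)       # index 0 reserved for NULL
        self.proj = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                  nn.Linear(dim, tgt_vocab))

    def translation_probs(self, src):
        # src: (I,) source token ids -> (I, |V_tgt|), row i is p(. | e_i)
        return torch.softmax(self.proj(self.src_emb(src)), dim=-1)

    def forward(self, src, tgt):
        # IBM-1 likelihood with uniform alignment prior: p(f_j | e) = mean_i p(f_j | e_i)
        t = self.translation_probs(src)                   # (I, |V_tgt|)
        lex = t[:, tgt]                                   # (I, J): p(f_j | e_i)
        nll = -torch.log(lex.mean(dim=0) + 1e-9).sum()    # sentence NLL to minimise
        align = lex.argmax(dim=0)                         # hard alignment a_j = argmax_i
        return nll, align


# Toy usage: one sentence pair given as token ids (0 = NULL on the source side).
model = NeuralIBM1(src_vocab=1000, tgt_vocab=1200)
src = torch.tensor([0, 5, 17, 42])   # NULL + 3 source words
tgt = torch.tensor([7, 99, 3])       # 3 target words
loss, alignment = model(src, tgt)
loss.backward()                      # trained by maximum likelihood, no gold alignments
print(alignment)                     # source position aligned to each target word
```

As in the classical discrete models, training maximises sentence-level likelihood only, so no gold alignments are required; alignment links are recovered afterwards from the learned translation probabilities.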
Related papers
- Pairing Orthographically Variant Literary Words to Standard Equivalents Using Neural Edit Distance Models [0.0]
We present a novel corpus consisting of orthographically variant words found in works of 19th century U.S. literature annotated with their corresponding "standard" word pair.
We train a set of neural edit distance models to pair these variants with their standard forms, and compare the performance of these models to the performance of a set of neural edit distance models trained on a corpus of orthographic errors made by L2 English learners.
arXiv Detail & Related papers (2024-01-26T18:49:34Z) - Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies [72.56158036639707]
Morphologically rich languages pose difficulties to machine translation.
A large number of differently inflected word surface forms entails a larger vocabulary.
Some inflected forms of infrequent terms typically do not appear in the training corpus.
Linguistic agreement requires the system to correctly match the grammatical categories between inflected word forms in the output sentence.
arXiv Detail & Related papers (2022-03-25T10:13:20Z) - Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z) - Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this inductive bias, in the form of a prior distribution, from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z) - Reranking Machine Translation Hypotheses with Structured and Web-based Language Models [11.363601836199331]
Two structured language models are applied for N-best rescoring.
We find that the combination of these language models increases the BLEU score by up to 1.6% absolute on blind test sets.
arXiv Detail & Related papers (2021-04-25T22:09:03Z) - Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work compares a neural model and character language models trained on varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z) - Generative latent neural models for automatic word alignment [0.0]
Variational autoencoders have recently been used in various areas of natural language processing to learn, in an unsupervised way, latent representations that are useful for language generation tasks.
In this paper, we study these models for the task of word alignment, and propose and assess several extensions of a vanilla variational autoencoder.
We demonstrate that these techniques can yield results competitive with Giza++ and with a strong neural network alignment system for two language pairs.
arXiv Detail & Related papers (2020-09-28T07:54:09Z) - Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z) - Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z) - Overestimation of Syntactic Representation in Neural Language Models [16.765097098482286]
One popular method for determining a model's ability to induce syntactic structure trains a model on strings generated according to a template, then tests the model's ability to distinguish such strings from superficially similar ones with different syntax.
We illustrate a fundamental problem with this approach by reproducing positive results from a recent paper with two non-syntactic baseline language models.
arXiv Detail & Related papers (2020-04-10T15:13:03Z)