Falling Through the Gaps: Neural Architectures as Models of
Morphological Rule Learning
- URL: http://arxiv.org/abs/2105.03710v1
- Date: Sat, 8 May 2021 14:48:29 GMT
- Title: Falling Through the Gaps: Neural Architectures as Models of
Morphological Rule Learning
- Authors: Deniz Beser
- Abstract summary: We evaluate the Transformer as a model of morphological rule learning.
We compare it with Recurrent Neural Networks (RNN) on English, German, and Russian.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in neural architectures have revived the problem of
morphological rule learning. We evaluate the Transformer as a model of
morphological rule learning and compare it with Recurrent Neural Networks (RNN)
on English, German, and Russian. We bring to the fore a hitherto overlooked
problem, the morphological gaps, where the expected inflection of a word is
missing. For example, 63 Russian verbs lack a first-person-singular present
form such that one cannot comfortably say "*oščušču" ("I
feel"). Even English has gaps, such as the past participle of "stride": the
function of morphological inflection can be partial. Both neural architectures
produce inflections that ought to be missing. Analyses reveal that Transformers
recapitulate the statistical distribution of inflections in the training data,
similar to RNNs. The models' success on English and German is driven by the fact
that the rules in these languages can be identified with the majority forms, a
property that is not universal across languages.
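To make the evaluation concrete, the sketch below scores a generic inflection model on both attested paradigm cells and defective (gap) cells, counting any form produced for a gap as over-generation. The `inflect` stub, the feature-tag strings, and the toy lemmas are illustrative assumptions, not the paper's code or data.

```python
# Minimal sketch (not the paper's implementation): measuring over-generation
# on morphological gaps. `inflect` stands in for any trained seq2seq model
# (Transformer or RNN); here it is a stub that always produces a form.

def inflect(lemma, tags):
    """Placeholder for a trained model's prediction; returns a surface form,
    or None if the model declines to inflect."""
    return lemma + "u"  # toy behaviour; a real model would decode characters

# Attested cells: (lemma, feature bundle) -> expected form (toy data).
attested = {("chitat'", "V;PRS;1;SG"): "chitayu"}

# Defective cells: the language has no form here, so any confident
# prediction counts against the model.
gaps = [("oshchutit'", "V;PRS;1;SG")]  # cf. the missing *oščušču ("I feel")

correct = sum(inflect(lemma, tags) == form
              for (lemma, tags), form in attested.items())
overgenerated = sum(inflect(lemma, tags) is not None for lemma, tags in gaps)

print(f"accuracy on attested cells: {correct}/{len(attested)}")
print(f"forms produced for gap cells: {overgenerated}/{len(gaps)}")
```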
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
We train and evaluate neural networks directly as binary classifiers of strings.
We provide results on a variety of languages across the Chomsky hierarchy for three neural architectures.
Our contributions will facilitate theoretically sound empirical testing of language recognition claims in future work.
arXiv Detail & Related papers (2024-11-11T16:33:25Z)
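As a loose illustration of the recognizer setup in the entry above, a network can be trained directly as a binary classifier over strings. The sketch below does this for the toy language of even-parity bit strings using a character-level GRU in PyTorch; the language choice, hyperparameters, and training loop are assumptions for illustration, not the paper's experimental setup.

```python
# Rough sketch (assumed setup, not the paper's): training a character-level
# GRU as a binary recognizer of a formal language, here even-parity bit
# strings of fixed length.
import torch
import torch.nn as nn

LEN, VOCAB = 10, 2  # fixed string length; alphabet {0, 1}

class Recognizer(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, 16)
        self.rnn = nn.GRU(16, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                   # x: (batch, LEN) symbol ids
        _, h = self.rnn(self.emb(x))         # h: (1, batch, hidden)
        return self.out(h[-1]).squeeze(-1)   # one accept/reject logit per string

def batch(n=64):
    xs = torch.randint(0, VOCAB, (n, LEN))
    ys = (xs.sum(dim=1) % 2 == 0).float()    # label 1 iff an even number of 1s
    return xs, ys

model, loss_fn = Recognizer(), nn.BCEWithLogitsLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(500):
    xs, ys = batch()
    opt.zero_grad()
    loss_fn(model(xs), ys).backward()
    opt.step()

with torch.no_grad():
    xs, ys = batch(1000)
    acc = ((model(xs) > 0).float() == ys).float().mean().item()
print(f"held-out accuracy on parity strings: {acc:.2f}")
```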
- What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages [78.1866280652834]
Large language models (LM) are distributions over strings.
We investigate the learnability of regular LMs (RLMs) by RNN and Transformer LMs.
We find that the rank of the RLM, a measure of its complexity, is a strong and significant predictor of learnability for both RNNs and Transformers.
arXiv Detail & Related papers (2024-06-06T17:34:24Z)
- Why can neural language models solve next-word prediction? A mathematical perspective [53.807657273043446]
We study a class of formal languages that can be used to model real-world examples of English sentences.
Our proof highlights the different roles of the embedding layer and the fully connected component within the neural language model.
arXiv Detail & Related papers (2023-06-20T10:41:23Z)
- How do we get there? Evaluating transformer neural networks as cognitive models for English past tense inflection [0.0]
We train a set of transformer models with different settings to examine their behavior on this task.
The models' performance on regular verbs is heavily affected by type frequency and type ratio but not by token frequency and token ratio, and vice versa for irregular verbs.
Although the transformer model exhibits some level of learning on the abstract category of verb regularity, its performance does not fit human data well.
arXiv Detail & Related papers (2022-10-17T15:13:35Z)
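To make the type/token distinction in the entry above concrete, the sketch below computes type and token frequencies and ratios for regular versus irregular past-tense forms from a toy tagged word list; the data and the regular/irregular labels are invented purely for illustration.

```python
# Illustrative sketch (toy data, not the paper's corpus): type frequency
# counts distinct word forms in a class; token frequency counts occurrences.
from collections import Counter

# A toy corpus of past-tense tokens, each tagged regular/irregular.
corpus = [("walked", "reg"), ("walked", "reg"), ("played", "reg"),
          ("jumped", "reg"), ("went", "irr"), ("went", "irr"),
          ("went", "irr"), ("took", "irr")]

tokens = Counter(cls for _, cls in corpus)      # token frequency per class
types = Counter(cls for _, cls in set(corpus))  # type frequency per class

for cls in ("reg", "irr"):
    print(f"{cls}: {types[cls]} types, {tokens[cls]} tokens, "
          f"type ratio {types[cls] / sum(types.values()):.2f}, "
          f"token ratio {tokens[cls] / sum(tokens.values()):.2f}")
```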
- Is neural language acquisition similar to natural? A chronological probing study [0.0515648410037406]
We present the chronological probing study of transformer English models such as MultiBERT and T5.
We compare the information about the language learned by the models in the process of training on corpora.
The results show that 1) linguistic information is acquired in the early stages of training, and 2) both language models demonstrate the ability to capture features from various levels of language.
arXiv Detail & Related papers (2022-07-01T17:24:11Z)
- Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies [72.56158036639707]
Morphologically rich languages pose difficulties to machine translation.
A large number of differently inflected word surface forms entails a larger vocabulary.
Some inflected forms of infrequent terms typically do not appear in the training corpus.
Linguistic agreement requires the system to correctly match the grammatical categories between inflected word forms in the output sentence.
arXiv Detail & Related papers (2022-03-25T10:13:20Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
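As a rough picture of the mixing step described in the entry above, the sketch below interpolates a dependency-based next-token distribution with a self-attention LM distribution using a scalar mixture weight; the distributions and the fixed weight are illustrative stand-ins, not the paper's actual formulation.

```python
# Toy sketch (assumed formulation): mixing a dependency-based next-token
# distribution with a self-attention LM distribution.
import numpy as np

vocab = ["the", "dog", "barked", "meowed"]

# Stand-ins for the two component distributions over the next token.
p_dependency = np.array([0.05, 0.10, 0.70, 0.15])  # from dependency modeling
p_attention  = np.array([0.10, 0.20, 0.40, 0.30])  # from the self-attention LM

lam = 0.6  # mixture weight (would normally be predicted, not fixed)
p_mix = lam * p_dependency + (1 - lam) * p_attention

assert np.isclose(p_mix.sum(), 1.0)  # a convex mixture stays normalized
print(dict(zip(vocab, np.round(p_mix, 3))))
```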
- Linguistically inspired morphological inflection with a sequence to sequence model [19.892441884896893]
Our research question is whether a neural network would be capable of learning inflectional morphemes for inflection production.
We use an inflectional corpus and a single-layer seq2seq model to test this hypothesis.
Our character-morpheme-based model creates inflection by predicting the stem character-to-character and the inflectional affixes as character blocks.
arXiv Detail & Related papers (2020-09-04T08:58:42Z)
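The entry above describes target sequences in which the stem is spelled out character by character while each inflectional affix is emitted as a single block; the sketch below shows one plausible way to encode such targets, with the segmentation and block symbols being illustrative guesses rather than the paper's actual representation.

```python
# Illustrative sketch (assumed encoding, not the paper's): target sequences
# that spell the stem character-by-character but keep affixes as blocks.

def encode_target(stem, affixes):
    """Stem characters as individual symbols; each affix as one block symbol."""
    return list(stem) + [f"<{a}>" for a in affixes]

# Toy German-style example: "kind" + plural affix "er" -> "kinder"
print(encode_target("kind", ["er"]))  # ['k', 'i', 'n', 'd', '<er>']

# A purely character-level baseline would instead predict:
print(list("kinder"))                 # ['k', 'i', 'n', 'd', 'e', 'r']
```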
- Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)
- Inflecting when there's no majority: Limitations of encoder-decoder neural networks as cognitive models for German plurals [27.002788405625484]
Can artificial neural networks learn to represent inflectional morphology and generalize to new words as human speakers do?
We collect a new dataset from German speakers (production and ratings of plural forms for novel nouns) that is designed to avoid sources of information unavailable to the encoder-decoder (ED) model.
We conclude that modern neural models may still struggle with minority-class generalization.
arXiv Detail & Related papers (2020-05-18T15:58:28Z)
- A Simple Joint Model for Improved Contextual Neural Lemmatization [60.802451210656805]
We present a simple joint neural model for lemmatization and morphological tagging that achieves state-of-the-art results on 20 languages.
Our paper describes the model in addition to training and decoding procedures.
arXiv Detail & Related papers (2019-04-04T02:03:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.