Linguistically inspired morphological inflection with a sequence to
sequence model
- URL: http://arxiv.org/abs/2009.02073v1
- Date: Fri, 4 Sep 2020 08:58:42 GMT
- Title: Linguistically inspired morphological inflection with a sequence to
sequence model
- Authors: Eleni Metheniti, Guenter Neumann, Josef van Genabith
- Abstract summary: Our research question is whether a neural network would be capable of learning inflectional morphemes for inflection production.
We are using an inflectional corpus and a single layer seq2seq model to test this hypothesis.
Our character-morpheme-based model creates inflections by predicting the stem character by character and the inflectional affixes as character blocks.
- Score: 19.892441884896893
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inflection is an essential part of every human language's morphology, yet
little effort has been made to unify linguistic theory and computational
methods in recent years. Methods of string manipulation are used to infer
inflectional changes; our research question is whether a neural network would
be capable of learning inflectional morphemes for inflection production in a
similar way to a human in early stages of language acquisition. We are using an
inflectional corpus (Metheniti and Neumann, 2020) and a single layer seq2seq
model to test this hypothesis, in which the inflectional affixes are learned
and predicted as a block and the word stem is modelled as a character sequence
to account for infixation. Our character-morpheme-based model creates
inflections by predicting the stem character by character and the inflectional
affixes as character blocks. We conducted three experiments on creating an
inflected form of a word given the lemma and a set of input and target
features, comparing our architecture to a mainstream character-based model with
the same hyperparameters, training sets, and test sets. Overall, across 17
languages, we observed small improvements in inflecting known lemmas (+0.68%),
consistently better performance in predicting inflected forms of unknown words
(+3.7%), and small improvements in a low-resource scenario (+1.09%).
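The target representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code; the affix inventory, token markup (`<...>` blocks), and the German example below are illustrative assumptions. It shows how an inflected form is encoded as a sequence in which the stem is emitted character by character (so the model can interleave infixes) while each inflectional affix is a single block token.

```python
def encode_target(stem: str, prefix: str = "", suffix: str = "") -> list[str]:
    """Build the output token sequence: affixes as single block tokens,
    the stem as individual characters."""
    tokens = []
    if prefix:
        tokens.append(f"<{prefix}>")   # whole prefix as one block token
    tokens.extend(stem)                # stem character by character
    if suffix:
        tokens.append(f"<{suffix}>")   # whole suffix as one block token
    return tokens

def decode(tokens: list[str]) -> str:
    """Collapse block tokens and characters back into a surface form."""
    return "".join(t.strip("<>") for t in tokens)

# Hypothetical example: German past participle "gespielt" = ge- + spiel + -t
tokens = encode_target("spiel", prefix="ge", suffix="t")
# tokens == ['<ge>', 's', 'p', 'i', 'e', 'l', '<t>']
print(decode(tokens))  # gespielt
```

Under this scheme a seq2seq decoder predicts one stem character or one affix block per step, which is what lets the model treat affixes as units while still editing inside the stem.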
Related papers
- Reverse-Engineering the Reader [43.26660964074272]
We introduce a novel alignment technique in which we fine-tune a language model to implicitly optimize the parameters of a linear regressor.
Using words as a test case, we evaluate our technique across multiple model sizes and datasets.
We find an inverse relationship between psychometric power and a model's performance on downstream NLP tasks as well as its perplexity on held-out test data.
arXiv Detail & Related papers (2024-10-16T23:05:01Z)
- On the Proper Treatment of Tokenization in Psycholinguistics [53.960910019072436]
The paper argues that token-level language models should be marginalized into character-level language models before they are used in psycholinguistic studies.
We find various focal areas whose surprisal is a better psychometric predictor than the surprisal of the region of interest itself.
arXiv Detail & Related papers (2024-10-03T17:18:03Z)
- In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z)
- Transparency at the Source: Evaluating and Interpreting Language Models With Access to the True Distribution [4.01799362940916]
We present a setup for training, evaluating and interpreting neural language models, that uses artificial, language-like data.
The data is generated using a massive probabilistic grammar, that is itself derived from a large natural language corpus.
With access to the underlying true source, our results show striking differences in learning dynamics and outcomes between different classes of words.
arXiv Detail & Related papers (2023-10-23T12:03:01Z)
- Morphological Inflection with Phonological Features [7.245355976804435]
This work explores effects on performance obtained through various ways in which morphological models get access to subcharacter phonological features.
We elicit phonemic data from standard graphemic data using language-specific grammars for languages with shallow grapheme-to-phoneme mapping.
arXiv Detail & Related papers (2023-06-21T21:34:39Z)
- Probing for Incremental Parse States in Autoregressive Language Models [9.166953511173903]
Next-word predictions from autoregressive neural language models show remarkable sensitivity to syntax.
This work evaluates the extent to which this behavior arises as a result of a learned ability to maintain implicit representations of incremental syntactic structures.
arXiv Detail & Related papers (2022-11-17T18:15:31Z)
- Quark: Controllable Text Generation with Reinforced Unlearning [68.07749519374089]
Large-scale language models often learn behaviors that are misaligned with user expectations.
We introduce Quantized Reward Konditioning (Quark), an algorithm for optimizing a reward function that quantifies an (un)wanted property.
For unlearning toxicity, negative sentiment, and repetition, our experiments show that Quark outperforms both strong baselines and state-of-the-art reinforcement learning methods.
arXiv Detail & Related papers (2022-05-26T21:11:51Z)
- Unnatural Language Inference [48.45003475966808]
We find that state-of-the-art NLI models, such as RoBERTa and BART, are invariant to, and sometimes even perform better on, examples with randomly reordered words.
Our findings call into question the idea that our natural language understanding models, and the tasks used for measuring their progress, genuinely require a human-like understanding of syntax.
arXiv Detail & Related papers (2020-12-30T20:40:48Z)
- Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed afterwards, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
- Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.