Analyzing Finnish Inflectional Classes through Discriminative Lexicon and Deep Learning Models
- URL: http://arxiv.org/abs/2509.04813v1
- Date: Fri, 05 Sep 2025 05:24:56 GMT
- Title: Analyzing Finnish Inflectional Classes through Discriminative Lexicon and Deep Learning Models
- Authors: Alexandre Nikolaev, Yu-Ying Chuang, R. Harald Baayen
- Abstract summary: Inflectional classes bring together nouns which have similar stem changes and use similar exponents in their paradigms. It is unclear whether inflectional classes are cognitively real. This study uses a dataset with 55,271 inflected nouns of 2000 high-frequency Finnish nouns from 49 inflectional classes.
- Score: 42.045109659898465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Descriptions of complex nominal or verbal systems make use of inflectional classes. Inflectional classes bring together nouns which have similar stem changes and use similar exponents in their paradigms. Although inflectional classes can be very useful for language teaching as well as for setting up finite state morphological systems, it is unclear whether inflectional classes are cognitively real, in the sense that native speakers would need to discover these classes in order to learn how to properly inflect the nouns of their language. This study investigates whether the Discriminative Lexicon Model (DLM) can understand and produce Finnish inflected nouns without setting up inflectional classes, using a dataset with 55,271 inflected nouns of 2000 high-frequency Finnish nouns from 49 inflectional classes. Several DLM comprehension and production models were set up. Some models were not informed about frequency of use, and provide insight into learnability with infinite exposure (endstate learning). Other models were set up from a usage based perspective, and were trained with token frequencies being taken into consideration (frequency-informed learning). On training data, models performed with very high accuracies. For held-out test data, accuracies decreased, as expected, but remained acceptable. Across most models, performance increased for inflectional classes with more types, more lower-frequency words, and more hapax legomena, mirroring the productivity of the inflectional classes. The model struggles more with novel forms of unproductive and less productive classes, and performs far better for unseen forms belonging to productive classes. However, for usage-based production models, frequency was the dominant predictor of model performance, and correlations with measures of productivity were tenuous or absent.
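At its core, the DLM evaluated above relies on two linear mappings learned without any reference to inflectional classes: one from form vectors to meaning vectors (comprehension) and one from meaning vectors to form vectors (production). The sketch below illustrates that setup on toy data; the matrix sizes, random vectors, and the weighted-least-squares shortcut used to mimic frequency-informed learning are illustrative assumptions, not the authors' actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_words, n_cues, n_sem = 200, 50, 30                            # toy dimensions
C = rng.integers(0, 2, size=(n_words, n_cues)).astype(float)    # form cues (e.g. trigram indicators)
S = rng.normal(size=(n_words, n_sem))                           # semantic vectors (e.g. embeddings)
freq = rng.integers(1, 1000, size=n_words).astype(float)        # token frequencies

# "Endstate" learning: ordinary least squares, i.e. the mapping reached
# with unlimited exposure to each word type.
F_end, *_ = np.linalg.lstsq(C, S, rcond=None)    # comprehension: S_hat = C @ F
G_end, *_ = np.linalg.lstsq(S, C, rcond=None)    # production:    C_hat = S @ G

# Frequency-informed learning, approximated here by weighted least squares:
# each word contributes in proportion to its token frequency.
w = np.sqrt(freq)[:, None]
F_freq, *_ = np.linalg.lstsq(C * w, S * w, rcond=None)

# Comprehension accuracy on training data: a word counts as understood if its
# predicted semantic vector correlates most highly with its own gold vector.
S_hat = C @ F_end
corr = np.corrcoef(S_hat, S)[:n_words, n_words:]
accuracy = np.mean(corr.argmax(axis=1) == np.arange(n_words))
print(f"endstate comprehension accuracy on toy data: {accuracy:.2f}")
```

In this toy setup, endstate learning corresponds to ordinary least squares over word types, while the frequency-weighted fit gives high-frequency words proportionally more influence on the mapping, which is the intuition behind the paper's usage-based models.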
Related papers
- Learning inflection classes using Adaptive Resonance Theory [0.2676349883103403]
We study the learnability of a system of verbal inflection classes by the individual language user. We use Adaptive Resonance Theory, a neural network with a parameter that determines the degree of generalisation (vigilance). The similarity of clustering to attested inflection classes varies depending on the complexity of the inflectional system.
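The sketch below illustrates the clustering mechanism this paper builds on: an ART-style learner assigns each input to an existing category only if its overlap with that category's prototype passes the vigilance threshold, and otherwise opens a new category, so higher vigilance produces more and finer-grained clusters. The binary features and simplified match rule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def art1_cluster(patterns, vigilance=0.7):
    """Assign each binary pattern to a category, opening new categories as needed."""
    prototypes, labels = [], []
    for p in patterns:
        p = np.asarray(p, dtype=bool)
        # try existing categories in order of raw overlap with the input
        order = sorted(range(len(prototypes)),
                       key=lambda j: -(p & prototypes[j]).sum())
        for j in order:
            overlap = (p & prototypes[j]).sum()
            # vigilance test: the shared features must cover enough of the input
            if overlap / max(p.sum(), 1) >= vigilance:
                prototypes[j] = p & prototypes[j]    # fast learning: keep only shared features
                labels.append(j)
                break
        else:
            prototypes.append(p)                     # no category is close enough: open a new one
            labels.append(len(prototypes) - 1)
    return labels, prototypes

# toy binary "paradigm" features: higher vigilance yields more clusters
rng = np.random.default_rng(1)
data = rng.integers(0, 2, size=(100, 20)).astype(bool)
for v in (0.5, 0.9):
    labels, protos = art1_cluster(data, vigilance=v)
    print(f"vigilance={v}: {len(protos)} clusters for {len(data)} items")
```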
arXiv Detail & Related papers (2025-12-17T15:58:20Z)
- On the Proper Treatment of Tokenization in Psycholinguistics [53.960910019072436]
The paper argues that token-level language models should be marginalized into character-level language models before they are used in psycholinguistic studies. We find various focal areas whose surprisal is a better psychometric predictor than the surprisal of the region of interest itself.
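The sketch below illustrates the marginalisation step described above: the probability of a character string is the sum, over all tokenizations of that string, of the probabilities of the corresponding token sequences, and surprisal is the negative log of that sum. The toy unigram token model and its vocabulary are illustrative assumptions; real token-level language models are autoregressive.

```python
import math

# toy token vocabulary with unigram probabilities (illustrative only)
TOKEN_PROBS = {"un": 0.2, "believ": 0.1, "able": 0.2, "unbeliev": 0.05,
               "a": 0.15, "ble": 0.1, "b": 0.05, "le": 0.15}

def char_level_log_prob(s):
    """Dynamic program: p(s) = sum over token segmentations of the product of token probs."""
    p = [0.0] * (len(s) + 1)
    p[0] = 1.0
    for end in range(1, len(s) + 1):
        for start in range(end):
            tok = s[start:end]
            if tok in TOKEN_PROBS:
                p[end] += p[start] * TOKEN_PROBS[tok]
    return math.log2(p[-1]) if p[-1] > 0 else float("-inf")

surprisal = -char_level_log_prob("unbelievable")   # surprisal in bits
print(f"character-level surprisal: {surprisal:.2f} bits")
```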
arXiv Detail & Related papers (2024-10-03T17:18:03Z)
- FairPIVARA: Reducing and Assessing Biases in CLIP-Based Multimodal Models [5.748694060126043]
We evaluate four different types of discriminatory practices within visual-language models.
We introduce FairPIVARA, a method to reduce them by removing the most affected dimensions of feature embeddings.
The application of FairPIVARA has led to a significant reduction of up to 98% in observed biases.
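A minimal sketch of the debiasing idea described above, under the assumption that it amounts to scoring embedding dimensions by how strongly they separate two contrasting concept groups and zeroing out the most affected ones; the grouping, scoring rule, and number of removed dimensions are illustrative choices rather than FairPIVARA's exact procedure.

```python
import numpy as np

def remove_most_affected_dims(embeddings, group_a, group_b, n_remove=16):
    """Zero out the dimensions with the largest mean gap between two concept groups."""
    gap = np.abs(embeddings[group_a].mean(axis=0) - embeddings[group_b].mean(axis=0))
    worst = np.argsort(gap)[-n_remove:]            # most biased dimensions
    debiased = embeddings.copy()
    debiased[:, worst] = 0.0
    return debiased, worst

# toy CLIP-like embeddings: 1000 items, 512 dimensions
rng = np.random.default_rng(2)
emb = rng.normal(size=(1000, 512))
idx_a, idx_b = np.arange(0, 100), np.arange(100, 200)   # two contrasting concept sets
debiased, removed = remove_most_affected_dims(emb, idx_a, idx_b)
print(f"removed {len(removed)} of {emb.shape[1]} dimensions")
```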
arXiv Detail & Related papers (2024-09-28T22:49:22Z)
- Modeling Orthographic Variation in Occitan's Dialects [3.038642416291856]
Our findings suggest that large multilingual models minimize the need for spelling normalization during pre-processing.
arXiv Detail & Related papers (2024-04-30T07:33:51Z)
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
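A minimal sketch of prompt-based (in-context) classification as described above: the task instruction, optionally a handful of labelled demonstrations, and the input are written as natural language, and the model's completion is read off as the label. The template, label set, and `call_language_model` stub are hypothetical placeholders for whatever model and inference API are actually used.

```python
LABELS = ["positive", "negative", "neutral"]

def call_language_model(prompt):
    """Stub so the sketch runs; a real system would query an instruction-tuned LLM here."""
    return "positive"

def build_prompt(text, demonstrations=()):
    lines = ["Classify the sentiment of the review as positive, negative, or neutral.", ""]
    for example, label in demonstrations:            # optional few-shot demonstrations
        lines += [f"Review: {example}", f"Sentiment: {label}", ""]
    lines += [f"Review: {text}", "Sentiment:"]
    return "\n".join(lines)

def classify(text, demonstrations=()):
    completion = call_language_model(build_prompt(text, demonstrations)).strip().lower()
    return next((label for label in LABELS if completion.startswith(label)), "unknown")

print(classify("The battery died after two days.",
               demonstrations=[("Great value for the price.", "positive")]))
```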
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- Counteracts: Testing Stereotypical Representation in Pre-trained Language Models [4.211128681972148]
We use counterexamples to examine the internal stereotypical knowledge in pre-trained language models (PLMs).
We evaluate 7 PLMs on 9 types of cloze-style prompt with different information and base knowledge.
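A minimal sketch of cloze-style probing in the spirit of the setup above: a masked slot in a template sentence is filled by a pretrained masked language model, and the scores assigned to contrasting candidates are compared. The template, candidate words, and choice of bert-base-uncased are illustrative assumptions, not the paper's prompt set or model list.

```python
from transformers import pipeline

# masked-language-model pipeline used to score candidate fillers
fill = pipeline("fill-mask", model="bert-base-uncased")

def cloze_scores(template, candidates):
    """Return the model's score for each candidate filler of the [MASK] slot."""
    predictions = fill(template, targets=candidates)
    return {p["token_str"]: p["score"] for p in predictions}

print(cloze_scores("The nurse said that [MASK] would arrive soon.", ["he", "she"]))
```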
arXiv Detail & Related papers (2023-01-11T07:52:59Z)
- Training Trajectories of Language Models Across Scales [99.38721327771208]
Scaling up language models has led to unprecedented performance gains.
How do language models of different sizes learn during pre-training?
Why do larger language models demonstrate more desirable behaviors?
arXiv Detail & Related papers (2022-12-19T19:16:29Z)
- Lexical Generalization Improves with Larger Models and Longer Training [42.024050065980845]
We analyze the use of lexical overlaps in natural language inference, paraphrase detection, and reading comprehension.
We find that larger models are much less susceptible to adopting lexical overlaps.
arXiv Detail & Related papers (2022-10-23T09:20:11Z)
- Quark: Controllable Text Generation with Reinforced Unlearning [68.07749519374089]
Large-scale language models often learn behaviors that are misaligned with user expectations.
We introduce Quantized Reward Konditioning (Quark), an algorithm for optimizing a reward function that quantifies an (un)wanted property.
For unlearning toxicity, negative sentiment, and repetition, our experiments show that Quark outperforms both strong baselines and state-of-the-art reinforcement learning methods.
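A minimal sketch of the "quantized reward" step that gives Quark its name, as summarised above: sampled generations are scored with a reward function, sorted into quantiles, and tagged with a quantile token before fine-tuning (the language-modelling and KL-penalty terms are omitted here). The reward function, token names, and data below are illustrative assumptions.

```python
import numpy as np

def tag_with_reward_quantiles(generations, rewards, n_quantiles=5):
    """Prepend a quantile token (<rk_0> = worst ... <rk_{n-1}> = best) to each text."""
    rewards = np.asarray(rewards, dtype=float)
    edges = np.quantile(rewards, np.linspace(0, 1, n_quantiles + 1)[1:-1])
    bins = np.digitize(rewards, edges)              # 0 .. n_quantiles-1
    return [f"<rk_{b}> {text}" for text, b in zip(generations, bins)]

gens = ["a mildly rude reply", "a polite reply", "a neutral reply", "a very rude reply"]
toxicity = [0.6, 0.05, 0.2, 0.9]
rewards = [1.0 - t for t in toxicity]               # e.g. reward = non-toxicity
print(tag_with_reward_quantiles(gens, rewards, n_quantiles=2))
```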
arXiv Detail & Related papers (2022-05-26T21:11:51Z)
- Interpreting Language Models with Contrastive Explanations [99.7035899290924]
Language models must consider various features to predict a token, such as its part of speech, number, tense, or semantics.
Existing explanation methods conflate evidence for all these features into a single explanation, which is less interpretable for human understanding.
We show that contrastive explanations are quantifiably better than non-contrastive explanations in verifying major grammatical phenomena.
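A minimal sketch of the contrastive idea summarised above: instead of attributing the predicted token's score alone, one attributes the difference between the target token's score and a foil token's score, so the explanation answers "why A rather than B?". The tiny linear output layer stands in for a real language model, and gradient-times-input is just one of several possible attribution methods.

```python
import numpy as np

rng = np.random.default_rng(3)
n_features, vocab = 8, 5
W = rng.normal(size=(vocab, n_features))     # toy output layer: logits = W @ x
x = rng.normal(size=n_features)              # toy representation of the context

def saliency(token_id):
    # gradient-times-input attribution for one token's logit (exact for a linear layer)
    return W[token_id] * x

target, foil = 2, 4
non_contrastive = saliency(target)                    # "why token 2?"
contrastive = saliency(target) - saliency(foil)       # "why token 2 rather than token 4?"
print("most important feature (non-contrastive):", int(np.abs(non_contrastive).argmax()))
print("most important feature (contrastive):   ", int(np.abs(contrastive).argmax()))
```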
arXiv Detail & Related papers (2022-02-21T18:32:24Z)
- Modeling morphology with Linear Discriminative Learning: considerations and design choices [1.3535770763481905]
This study addresses a series of methodological questions that arise when modeling inflectional morphology with Linear Discriminative Learning.
We illustrate how decisions made about the representation of form and meaning influence model performance.
We discuss how the model can be set up to approximate the learning of inflected words in context.
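One such design choice, as described above, is how word forms are represented. A common option in Linear Discriminative Learning is to code each word as a binary vector over letter trigrams with "#" marking word boundaries; the sketch below builds such a cue matrix for a few toy Finnish forms. The word list is illustrative, and real studies also vary the grain size and the semantic representation.

```python
import numpy as np

def trigram_cue_matrix(words):
    """Return (C, cue_index): a binary word-by-trigram matrix and the cue inventory."""
    padded = [f"#{word}#" for word in words]                       # word-boundary markers
    cues_per_word = [{w[i:i + 3] for i in range(len(w) - 2)} for w in padded]
    cue_index = {c: j for j, c in enumerate(sorted(set().union(*cues_per_word)))}
    C = np.zeros((len(words), len(cue_index)))
    for i, cues in enumerate(cues_per_word):
        for c in cues:
            C[i, cue_index[c]] = 1.0
    return C, cue_index

# toy Finnish noun forms (a nominative and a few inflected forms)
C, cue_index = trigram_cue_matrix(["talo", "talossa", "taloja", "käsi", "kädessä"])
print(C.shape, "example cues:", sorted(cue_index)[:6], "...")
```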
arXiv Detail & Related papers (2021-06-15T07:37:52Z)