Modeling morphology with Linear Discriminative Learning: considerations and design choices
- URL: http://arxiv.org/abs/2106.07936v1
- Date: Tue, 15 Jun 2021 07:37:52 GMT
- Title: Modeling morphology with Linear Discriminative Learning: considerations and design choices
- Authors: Maria Heitmeier, Yu-Ying Chuang, R. Harald Baayen
- Abstract summary: This study addresses a series of methodological questions that arise when modeling inflectional morphology with Linear Discriminative Learning.
We illustrate how decisions made about the representation of form and meaning influence model performance.
We discuss how the model can be set up to approximate the learning of inflected words in context.
- Score: 1.3535770763481905
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study addresses a series of methodological questions that arise when
modeling inflectional morphology with Linear Discriminative Learning. Taking
the semi-productive German noun system as an example, we illustrate how decisions
made about the representation of form and meaning influence model performance.
We clarify that for modeling frequency effects in learning, it is essential to
make use of incremental learning rather than the endstate of learning. We also
discuss how the model can be set up to approximate the learning of inflected
words in context. In addition, we illustrate how in this approach the wug task
can be modeled in considerable detail. In general, the model provides an
excellent memory for known words, but appropriately shows more limited
performance for unseen data, in line with the semi-productivity of German noun
inflection and generalization performance of native German speakers.
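The abstract's central methodological point is that frequency effects only emerge when the form-to-meaning mapping is learned incrementally rather than taken from the endstate of learning. As a rough illustration of that contrast (not the authors' implementation; the cue matrix, semantic vectors, and token frequencies below are invented toy values), the endstate can be computed in closed form with the pseudoinverse, while incremental learning applies Widrow-Hoff (delta-rule) updates over frequency-weighted learning events:

```python
# Toy sketch of Linear Discriminative Learning (LDL): endstate vs. incremental
# learning of a linear mapping F from a form (cue) matrix C to a semantic
# matrix S. All matrices and frequencies are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_words, n_cues, n_dims = 6, 10, 4
C = rng.integers(0, 2, size=(n_words, n_cues)).astype(float)  # binary cue matrix
S = rng.normal(size=(n_words, n_dims))                        # semantic vectors
freq = np.array([100.0, 50, 20, 5, 2, 1])                     # token frequencies

# Endstate of learning: closed-form solution of C F = S via the pseudoinverse.
# Every word type counts equally, so token frequency leaves no trace.
F_end = np.linalg.pinv(C) @ S

# Incremental learning: Widrow-Hoff (delta-rule) updates over learning events
# sampled in proportion to token frequency.
F_inc = np.zeros((n_cues, n_dims))
eta = 0.01
for i in rng.choice(n_words, size=2000, p=freq / freq.sum()):
    c = C[i:i + 1]                                 # 1 x n_cues row for this token
    F_inc += eta * c.T @ (S[i:i + 1] - c @ F_inc)  # error-driven update

def accuracy(F):
    """Correlation between each word's predicted and target semantic vector."""
    pred = C @ F
    return [round(float(np.corrcoef(pred[i], S[i])[0, 1]), 2) for i in range(n_words)]

print("endstate   :", accuracy(F_end))
print("incremental:", accuracy(F_inc))
```

Because learning events are sampled by token frequency, the incremental mapping ends up approximating high-frequency words more closely than rare ones, which is the kind of frequency effect the closed-form endstate solution cannot capture.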
Related papers
- Causal Estimation of Memorisation Profiles [58.20086589761273]
Understanding memorisation in language models has practical and societal implications.
Memorisation is the causal effect of training with an instance on the model's ability to predict that instance.
This paper proposes a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics.
arXiv Detail & Related papers (2024-06-06T17:59:09Z)
- On the Tip of the Tongue: Analyzing Conceptual Representation in Large Language Models with Reverse-Dictionary Probe [36.65834065044746]
We use in-context learning to guide the models to generate the term for an object concept implied in a linguistic description.
Experiments suggest that conceptual inference ability as probed by the reverse-dictionary task predicts the model's general reasoning performance.
arXiv Detail & Related papers (2024-02-22T09:45:26Z)
- Revisiting Demonstration Selection Strategies in In-Context Learning [66.11652803887284]
Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL), but performance varies widely with the choice of demonstrations.
In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent.
We propose a data- and model-dependent demonstration selection method, TopK + ConE, based on the assumption that the performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples.
arXiv Detail & Related papers (2024-01-22T16:25:27Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Feature Interactions Reveal Linguistic Structure in Language Models [2.0178765779788495]
We study feature interactions in the context of feature attribution methods for post-hoc interpretability.
We work out a grey box methodology, in which we train models to perfection on a formal language classification task.
We show that under specific configurations, some methods are indeed able to uncover the grammatical rules acquired by a model.
arXiv Detail & Related papers (2023-06-21T11:24:41Z)
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z)
- Interpreting Language Models with Contrastive Explanations [99.7035899290924]
Language models must consider various features to predict a token, such as its part of speech, number, tense, or semantics.
Existing explanation methods conflate evidence for all these features into a single explanation, which is less interpretable for human understanding.
We show that contrastive explanations are quantifiably better than non-contrastive explanations in verifying major grammatical phenomena.
arXiv Detail & Related papers (2022-02-21T18:32:24Z)
- Under the Microscope: Interpreting Readability Assessment Models for Filipino [0.0]
We dissect machine learning-based readability assessment models in Filipino by performing global and local model interpretation.
Results show that using a model trained with top features from global interpretation obtained higher performance than the ones using features selected by Spearman correlation.
arXiv Detail & Related papers (2021-10-01T01:27:10Z)
- Dissecting Generation Modes for Abstractive Summarization Models via Ablation and Attribution [34.2658286826597]
We propose a two-step method to interpret summarization model decisions.
We first analyze the model's behavior by ablating the full model to categorize each decoder decision into one of several generation modes.
After isolating decisions that do depend on the input, we explore interpreting these decisions using several different attribution methods.
arXiv Detail & Related papers (2021-06-03T00:54:16Z)
- CausaLM: Causal Model Explanation Through Counterfactual Language Models [33.29636213961804]
CausaLM is a framework for producing causal model explanations using counterfactual language representation models.
We show that language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest.
A byproduct of our method is a language representation model that is unaffected by the tested concept.
arXiv Detail & Related papers (2020-05-27T15:06:35Z)
- Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions [55.660255727031725]
Influence functions explain the decisions of a model by identifying influential training examples.
We conduct a comparison between influence functions and common word-saliency methods on representative tasks.
We develop a new measure based on influence functions that can reveal artifacts in training data.
arXiv Detail & Related papers (2020-05-14T00:45:23Z)