Reanalyzing L2 Preposition Learning with Bayesian Mixed Effects and a
Pretrained Language Model
- URL: http://arxiv.org/abs/2302.08150v2
- Date: Tue, 23 May 2023 08:00:08 GMT
- Title: Reanalyzing L2 Preposition Learning with Bayesian Mixed Effects and a
Pretrained Language Model
- Authors: Jakob Prange and Man Ho Ivy Wong
- Abstract summary: We use both Bayesian and neural models to dissect a data set of Chinese learners' responses to two tests measuring their understanding of English prepositions.
The results mostly replicate previous findings from frequentist analyses and newly reveal crucial interactions between student ability, task type, and stimulus sentence.
- Score: 0.9374453871700481
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We use both Bayesian and neural models to dissect a data set of Chinese
learners' pre- and post-interventional responses to two tests measuring their
understanding of English prepositions. The results mostly replicate previous
findings from frequentist analyses and newly reveal crucial interactions
between student ability, task type, and stimulus sentence. Given the sparsity
of the data as well as high diversity among learners, the Bayesian method
proves most useful; but we also see potential in using language model
probabilities as predictors of grammaticality and learnability.
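The two modeling ingredients named in the abstract, a Bayesian mixed-effects analysis of learner responses and pretrained-language-model probabilities as a predictor of grammaticality, can be combined in a single regression. The Python sketch below is only a generic illustration of that recipe, not the authors' pipeline: the data file, the column names (`correct`, `task`, `time`, `student`, `item`, `sentence`), and the choice of GPT-2 with the `bambi`/PyMC stack are assumptions made for the example.

```python
# Illustrative sketch only: file path, column names, and library choices
# (GPT-2 via Hugging Face; bambi/PyMC for the Bayesian mixed-effects model)
# are assumptions, not the paper's actual setup.
import pandas as pd
import torch
import arviz as az
import bambi as bmb
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# 1) Score each stimulus sentence with a pretrained LM.
tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2")
lm.eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability of a sentence under GPT-2."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = lm(ids, labels=ids)  # out.loss = mean NLL over predicted tokens
    return -out.loss.item() * (ids.size(1) - 1)

df = pd.read_csv("learner_responses.csv")  # hypothetical long-format data
df["lm_logprob"] = df["sentence"].map(sentence_logprob)

# 2) Bayesian mixed-effects logistic regression: fixed effects for task type,
#    test time, and LM log-probability; random intercepts for students and items.
model = bmb.Model(
    "correct ~ task * time + lm_logprob + (1|student) + (1|item)",
    df,
    family="bernoulli",
)
idata = model.fit(draws=2000, chains=4)
print(az.summary(idata))  # posterior means and credible intervals
```

In a setup like this, the interactions the abstract highlights (student ability, task type, stimulus sentence) appear directly as posterior distributions over interaction terms and random effects, and the `lm_logprob` coefficient indicates how much grammaticality or learnability signal the pretrained model contributes.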
Related papers
- Personality Prediction from Life Stories using Language Models [12.851871085845499]
In this study, we address the challenge of modeling long narrative interviews, each exceeding 2,000 tokens, to predict Five-Factor Model (FFM) personality traits.
We propose a two-step approach: first, we extract contextual embeddings using sliding-window fine-tuning of pretrained language models; then, we apply Recurrent Neural Networks (RNNs) with attention mechanisms to integrate long-range dependencies and enhance interpretability.
arXiv Detail & Related papers (2025-06-24T02:39:06Z) - Evaluating Discourse Cohesion in Pre-trained Language Models [42.63411207004852]
We propose a test suite to evaluate the cohesive ability of pre-trained language models.
The test suite contains multiple cohesion phenomena between adjacent and non-adjacent sentences.
arXiv Detail & Related papers (2025-03-08T09:19:53Z) - The Emergence of Grammar through Reinforcement Learning [5.599852485003601]
The evolution of grammatical systems of syntactic and semantic composition is modeled here with a novel application of reinforcement learning theory.
We include within the model a probability distribution over different messages that could be expressed in a given context.
The proposed learning and production algorithm then breaks language learning down into a sequence of simple steps, such that each step benefits from the message probabilities.
arXiv Detail & Related papers (2025-03-03T15:10:46Z) - Developmental Predictive Coding Model for Early Infancy Mono and Bilingual Vocal Continual Learning [69.8008228833895]
We propose a small-sized generative neural network equipped with a continual learning mechanism.
Our model prioritizes interpretability and demonstrates the advantages of online learning.
arXiv Detail & Related papers (2024-12-23T10:23:47Z) - Can a Neural Model Guide Fieldwork? A Case Study on Morphological Inflection [3.48094693551887]
Linguistic fieldwork is an important component in language documentation and preservation.
This paper presents a novel model that guides a linguist during the fieldwork and accounts for the dynamics of linguist-speaker interactions.
arXiv Detail & Related papers (2024-09-22T23:40:03Z) - KnowledgeVIS: Interpreting Language Models by Comparing
Fill-in-the-Blank Prompts [12.131691892960502]
We present KnowledgeVis, a human-in-the-loop visual analytics system for interpreting language models.
By comparing predictions between sentences, KnowledgeVis reveals learned associations that intuitively connect what language models learn during training to natural language tasks.
arXiv Detail & Related papers (2024-03-07T18:56:31Z) - Lifelong Learning Natural Language Processing Approach for Multilingual
Data Classification [1.3999481573773074]
We propose a lifelong learning-inspired approach, which allows for fake news detection in multiple languages.
We also observed that the models could generalize knowledge acquired across the analyzed languages.
arXiv Detail & Related papers (2022-05-25T10:34:04Z) - Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - From Good to Best: Two-Stage Training for Cross-lingual Machine Reading
Comprehension [51.953428342923885]
We develop a two-stage approach to enhance the model performance.
The first stage targets recall: we design a hard-learning (HL) algorithm to maximize the likelihood that the top-k predictions contain the accurate answer.
The second stage focuses on precision: an answer-aware contrastive learning mechanism is developed to learn the fine difference between the accurate answer and other candidates.
arXiv Detail & Related papers (2021-12-09T07:31:15Z) - On the Lack of Robust Interpretability of Neural Text Classifiers [14.685352584216757]
We assess the robustness of interpretations of neural text classifiers based on pretrained Transformer encoders.
Both tests show surprising deviations from expected behavior, raising questions about the extent of insights that practitioners may draw from interpretations.
arXiv Detail & Related papers (2021-06-08T18:31:02Z) - Active Learning for Sequence Tagging with Deep Pre-trained Models and
Bayesian Uncertainty Estimates [52.164757178369804]
Recent advances in transfer learning for natural language processing, in conjunction with active learning, open the possibility of significantly reducing the necessary annotation budget.
We conduct an empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework.
We also demonstrate that to acquire instances during active learning, a full-size Transformer can be substituted with a distilled version, which yields better computational performance.
arXiv Detail & Related papers (2021-01-20T13:59:25Z) - Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining.
We distill the approximate marginal distribution over words in context from the syntactic LM.
Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
arXiv Detail & Related papers (2020-05-27T16:44:01Z) - Data Augmentation for Spoken Language Understanding via Pretrained
Language Models [113.56329266325902]
Training of spoken language understanding (SLU) models often faces the problem of data scarcity.
We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
arXiv Detail & Related papers (2020-04-29T04:07:12Z)
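The last entry above, on data augmentation for spoken language understanding with pretrained language models, describes generating extra training utterances with an LM. The sketch below is a generic, hedged illustration of that idea rather than the cited paper's specific method: the model choice (off-the-shelf GPT-2), the prompt format, and the intent labels and seed utterances are all invented for the example.

```python
# Generic LM-based utterance augmentation sketch; the model choice (GPT-2),
# prompt format, intent labels, and seed utterances are assumptions, not the
# cited paper's method.
import random
from transformers import pipeline, set_seed

set_seed(0)
generator = pipeline("text-generation", model="gpt2")

seed_utterances = {
    "book_flight": ["book me a flight to Boston tomorrow morning"],
    "play_music": ["play some jazz in the living room"],
}

augmented = {}
for intent, examples in seed_utterances.items():
    prompt = "Paraphrase: " + random.choice(examples) + "\nParaphrase:"
    outputs = generator(
        prompt, max_new_tokens=20, num_return_sequences=3, do_sample=True
    )
    # Keep only the generated continuation as a candidate new utterance;
    # in practice these would be filtered before joining the training data.
    augmented[intent] = [o["generated_text"][len(prompt):].strip() for o in outputs]

print(augmented)
```

A real pipeline would filter the generated utterances (for example with an intent classifier or a slot-consistency check) before adding them to the SLU training set, which is where the abstract's emphasis on both variability and accuracy comes in.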
This list is automatically generated from the titles and abstracts of the papers on this site.