Testing learning hypotheses using neural networks by manipulating learning data
- URL: http://arxiv.org/abs/2407.04593v1
- Date: Fri, 5 Jul 2024 15:41:30 GMT
- Title: Testing learning hypotheses using neural networks by manipulating learning data
- Authors: Cara Su-Yi Leong, Tal Linzen
- Abstract summary: We show that a neural network language model can learn restrictions to the passive that are similar to those displayed by humans.
We find that while the frequency with which a verb appears in the passive significantly affects its passivizability, the semantics of the verb does not.
- Score: 20.525923251193472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although passivization is productive in English, it is not completely general -- some exceptions exist (e.g. *One hour was lasted by the meeting). How do English speakers learn these exceptions to an otherwise general pattern? Using neural network language models as theories of acquisition, we explore the sources of indirect evidence that a learner can leverage to learn whether a verb can passivize. We first characterize English speakers' judgments of exceptions to the passive, confirming that speakers find some verbs more passivizable than others. We then show that a neural network language model can learn restrictions to the passive that are similar to those displayed by humans, suggesting that evidence for these exceptions is available in the linguistic input. We test the causal role of two hypotheses for how the language model learns these restrictions by training models on modified training corpora, which we create by altering the existing training corpora to remove features of the input implicated by each hypothesis. We find that while the frequency with which a verb appears in the passive significantly affects its passivizability, the semantics of the verb does not. This study highlights the utility of altering a language model's training data for answering questions where complete control over a learner's input is vital.
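As a rough illustration of the corpus-manipulation method the abstract describes, the Python sketch below removes sentences in which a set of target verbs occurs in the passive, yielding a modified training corpus in which those verbs are never passivized. This is a minimal sketch, not the authors' pipeline: the verb list, file names, and the regex-based passive detector are illustrative assumptions (real preprocessing would need proper parsing).

```python
# Minimal sketch of the frequency-manipulation idea: filter a training
# corpus so that chosen verbs never appear in the passive. All names and
# the crude passive heuristic here are illustrative assumptions, not the
# paper's actual preprocessing.
import re

# Hypothetical verbs whose passive uses we want to withhold from training.
TARGET_VERBS = {"lasted", "resembled", "cost"}

# Crude passive detector: a form of "be" followed by a target verb's past
# participle. Real work would use a dependency parser instead of a regex.
PASSIVE_RE = re.compile(
    r"\b(am|is|are|was|were|be|been|being)\s+("
    + "|".join(sorted(TARGET_VERBS))
    + r")\b",
    re.IGNORECASE,
)

def filter_corpus(in_path: str, out_path: str) -> int:
    """Copy the corpus, dropping sentences containing a target passive.

    Returns the number of sentences removed.
    """
    removed = 0
    with open(in_path, encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for sentence in src:  # assumes one sentence per line
            if PASSIVE_RE.search(sentence):
                removed += 1  # withhold this evidence from the learner
            else:
                dst.write(sentence)
    return removed

if __name__ == "__main__":
    # File names are placeholders for whatever corpus is being ablated.
    n = filter_corpus("train.txt", "train_no_target_passives.txt")
    print(f"Removed {n} sentences with target verbs in the passive")
```

Training one model on the filtered corpus and another on the unmodified corpus, then comparing their acceptability judgments for the withheld passives, is the kind of causal test the abstract outlines.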
Related papers
- Language Models Can Learn Exceptions to Syntactic Rules [22.810889064523167]
We show that artificial neural networks can generalize productively to novel contexts.
We also show that the relative acceptability of a verb in the active vs. passive voice is positively correlated with the relative frequency of its occurrence in those voices.
arXiv Detail & Related papers (2023-06-09T15:35:11Z)
- Injecting structural hints: Using language models to study inductive biases in language learning [40.8902073270634]
We inject inductive bias into language models by pretraining on formally-structured data.
We then evaluate the biased learners' ability to learn typologically-diverse natural languages.
We show that non-context-free relationships form the best inductive biases.
arXiv Detail & Related papers (2023-04-25T18:00:08Z)
- Learning Cross-lingual Visual Speech Representations [108.68531445641769]
Cross-lingual self-supervised visual representation learning has been a growing research topic in the last few years.
We use the recently proposed RAVEn (Raw Audio-Visual Speech Encoders) framework to pre-train an audio-visual model with unlabelled data.
Our experiments show that multilingual models with more data outperform monolingual ones but, when the amount of data is held fixed, monolingual models tend to reach better performance.
arXiv Detail & Related papers (2023-03-14T17:05:08Z)
- Discovering Latent Knowledge in Language Models Without Supervision [72.95136739040676]
Existing techniques for training language models can be misaligned with the truth.
We propose directly finding latent knowledge inside the internal activations of a language model in a purely unsupervised way.
We show that despite using no supervision and no model outputs, our method can recover diverse knowledge represented in large language models.
arXiv Detail & Related papers (2022-12-07T18:17:56Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- What Artificial Neural Networks Can Tell Us About Human Language Acquisition [47.761188531404066]
Rapid progress in machine learning for natural language processing has the potential to transform debates about how humans learn language.
To increase the relevance of learnability results from computational models, we need to train model learners without significant advantages over humans.
arXiv Detail & Related papers (2022-08-17T00:12:37Z)
- Is neural language acquisition similar to natural? A chronological probing study [0.0515648410037406]
We present a chronological probing study of English transformer models such as MultiBERT and T5.
We compare the linguistic information the models acquire over the course of training on their corpora.
The results show that (1) linguistic information is acquired in the early stages of training, and (2) both language models demonstrate the ability to capture features from various levels of language.
arXiv Detail & Related papers (2022-07-01T17:24:11Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages; the bias takes the form of a prior distribution, which we infer from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? [87.20342701232869]
We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
arXiv Detail & Related papers (2021-04-22T01:00:17Z)
- Universal linguistic inductive biases via meta-learning [36.43388942327124]
It is unclear which inductive biases can explain observed patterns in language acquisition.
We introduce a framework for giving linguistic inductive biases to a neural network model.
We demonstrate this framework with a case study based on syllable structure.
arXiv Detail & Related papers (2020-06-29T19:15:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.