What Artificial Neural Networks Can Tell Us About Human Language Acquisition
- URL: http://arxiv.org/abs/2208.07998v2
- Date: Sun, 11 Feb 2024 21:24:26 GMT
- Title: What Artificial Neural Networks Can Tell Us About Human Language Acquisition
- Authors: Alex Warstadt and Samuel R. Bowman
- Abstract summary: Rapid progress in machine learning for natural language processing has the potential to transform debates about how humans learn language.
To increase the relevance of learnability results from computational models, we need to train model learners without significant advantages over humans.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Rapid progress in machine learning for natural language processing has the
potential to transform debates about how humans learn language. However, the
learning environments and biases of current artificial learners and humans
diverge in ways that weaken the impact of the evidence obtained from learning
simulations. For example, today's most effective neural language models are
trained on roughly one thousand times the amount of linguistic data available
to a typical child. To increase the relevance of learnability results from
computational models, we need to train model learners without significant
advantages over humans. If an appropriate model successfully acquires some
target linguistic knowledge, it can provide a proof of concept that the target
is learnable in a hypothesized human learning scenario. Plausible model
learners will enable us to carry out experimental manipulations to make causal
inferences about variables in the learning environment, and to rigorously test
poverty-of-the-stimulus-style claims arguing for innate linguistic knowledge in
humans on the basis of speculations about learnability. Comparable experiments
will never be possible with human subjects due to practical and ethical
considerations, making model learners an indispensable resource. So far,
attempts to deprive current models of unfair advantages obtain sub-human
results for key grammatical behaviors such as acceptability judgments. But
before we can justifiably conclude that language learning requires more prior
domain-specific knowledge than current models possess, we must first explore
non-linguistic inputs in the form of multimodal stimuli and multi-agent
interaction as ways to make our learners more efficient at learning from
limited linguistic input.
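The abstract cites acceptability judgments as a key grammatical behavior on which deprived model learners still fall short. As a minimal sketch of how such judgments are typically elicited from a model learner (everything below, the toy corpus, the bigram model, and the sentence pair, is an illustrative assumption, not the paper's method), one can compare the probability a trained model assigns to the grammatical versus the ungrammatical member of a minimal pair, as in benchmarks like BLiMP:

```python
# Hypothetical illustration: probing an "acceptability judgment" by
# comparing the probabilities a toy add-one-smoothed bigram language
# model assigns to a grammatical vs. an ungrammatical sentence.
import math
from collections import Counter

# Tiny illustrative training corpus (an assumption, not real child input).
corpus = [
    "the dog barks",
    "the dog runs",
    "a dog barks",
    "the cat barks",
]

# Count bigrams and their left-context unigrams, with boundary markers.
bigrams, unigrams = Counter(), Counter()
for sent in corpus:
    toks = ["<s>"] + sent.split() + ["</s>"]
    unigrams.update(toks[:-1])
    bigrams.update(zip(toks, toks[1:]))
vocab_size = len(set(unigrams) | {"</s>"})

def log_prob(sentence: str) -> float:
    """Add-one-smoothed bigram log-probability of a sentence."""
    toks = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(toks, toks[1:])
    )

# The model "judges" the grammatical member of the minimal pair as more
# acceptable by assigning it a higher probability.
assert log_prob("the dog barks") > log_prob("dog the barks")
```

In practice the same comparison is run with a neural language model over thousands of curated minimal pairs; the toy bigram model here only makes the scoring logic concrete.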
Related papers
- Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations [15.394018604836774]
We introduce a trial-and-demonstration (TnD) learning framework that incorporates three components: student trials, teacher demonstrations, and a reward conditioned on language competence.
Our experiments reveal that the TnD approach accelerates word acquisition for student models with an equal or smaller number of parameters.
Our findings suggest that interactive language learning, with teacher demonstrations and student trials, can facilitate efficient word learning in language models.
arXiv Detail & Related papers (2024-05-22T16:57:02Z)
- Unveiling the pressures underlying language learning and use in neural networks, large language models, and humans: Lessons from emergent machine-to-machine communication [5.371337604556311]
We review three cases where mismatches between the emergent linguistic behavior of neural agents and humans were resolved.
We identify key pressures at play for language learning and emergence: communicative success, production effort, learnability, and other psycho-/sociolinguistic factors.
arXiv Detail & Related papers (2024-03-21T14:33:34Z)
- Visual Grounding Helps Learn Word Meanings in Low-Data Regimes [47.7950860342515]
Modern neural language models (LMs) are powerful tools for modeling human sentence production and comprehension.
But to achieve these results, LMs must be trained in distinctly un-human-like ways.
Do models trained more naturalistically -- with grounded supervision -- exhibit more humanlike language learning?
We investigate this question in the context of word learning, a key sub-task in language acquisition.
arXiv Detail & Related papers (2023-10-20T03:33:36Z)
- Modeling rapid language learning by distilling Bayesian priors into artificial neural networks [18.752638142258668]
We show that learning from limited naturalistic data is possible with an approach that combines the strong inductive biases of a Bayesian model with the flexible representations of a neural network.
The resulting system can learn formal linguistic patterns from a small number of examples.
It can also learn aspects of English syntax from a corpus of natural language.
arXiv Detail & Related papers (2023-05-24T04:11:59Z)
- Chain of Hindsight Aligns Language Models with Feedback [62.68665658130472]
We propose a novel technique, Chain of Hindsight, that is easy to optimize and can learn from any form of feedback, regardless of its polarity.
We convert all types of feedback into sequences of sentences, which are then used to fine-tune the model.
By doing so, the model is trained to generate outputs based on feedback, while learning to identify and correct negative attributes or errors.
arXiv Detail & Related papers (2023-02-06T10:28:16Z)
- Communication Drives the Emergence of Language Universals in Neural Agents: Evidence from the Word-order/Case-marking Trade-off [3.631024220680066]
We propose a new Neural-agent Language Learning and Communication framework (NeLLCom) where pairs of speaking and listening agents first learn a miniature language.
We succeed in replicating the trade-off with the new framework without hard-coding specific biases in the agents.
arXiv Detail & Related papers (2023-01-30T17:22:33Z)
- Do As I Can, Not As I Say: Grounding Language in Robotic Affordances [119.29555551279155]
Large language models can encode a wealth of semantic knowledge about the world.
Such knowledge could be extremely useful to robots aiming to act upon high-level, temporally extended instructions expressed in natural language.
We show how low-level skills can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally-extended instructions.
arXiv Detail & Related papers (2022-04-04T17:57:11Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
- What Matters in Learning from Offline Human Demonstrations for Robot Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation.
Our study analyzes the most critical challenges when learning from offline human data.
We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.