Opening the black box of language acquisition
- URL: http://arxiv.org/abs/2402.11681v1
- Date: Sun, 18 Feb 2024 19:11:58 GMT
- Title: Opening the black box of language acquisition
- Authors: Jérôme Michaud and Anna Jon-and
- Abstract summary: We propose an alternative, more transparent and cognitively plausible architecture for learning language.
Instead of using deep learning, our approach uses a minimal cognitive architecture based on sequence memory and chunking.
Results show that the model can learn these artificial languages from scratch and extract grammatical information that supports learning.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent advances in large language models using deep learning techniques have
renewed interest in how languages can be learned from data. However, it is
unclear whether or how these models represent grammatical information from the
learned languages. In addition, the models must be pre-trained on large corpora
before they can be used. In this work, we propose an alternative, more
transparent and cognitively plausible architecture for learning language.
Instead of using deep learning, our approach uses a minimal cognitive
architecture based on sequence memory and chunking. The learning mechanism is
based on the principles of reinforcement learning. We test our architecture on
a number of natural-like toy languages. Results show that the model can learn
these artificial languages from scratch and extract grammatical information
that supports learning. Our study demonstrates the power of this simple
architecture and stresses the importance of sequence memory as a key component
of the language learning process. Since other animals do not seem to have a
faithful sequence memory, this may explain why only humans have developed
complex languages.
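The abstract does not spell out the mechanics, but the ingredients it names (a faithful sequence memory, chunking of recurring token sequences, and reinforcement-style value updates) can be illustrated with a small sketch. The class below is an assumption-laden illustration of how those pieces might fit together, not the authors' model; the class name, the reward scheme, the chunk-growth rule, and the toy corpus are all invented for the example.
```python
# Minimal sketch, assuming a bounded sequence memory, a chunk inventory grown
# from recurring contexts, and reward-driven value updates. Illustrative only;
# not the paper's actual architecture or learning rule.
from collections import defaultdict, deque


class ChunkingLearner:
    def __init__(self, memory_size=8, lr=0.1):
        self.memory = deque(maxlen=memory_size)   # faithful sequence memory
        self.lr = lr                              # reinforcement learning rate
        self.chunk_values = {}                    # chunk (tuple of tokens) -> value
        self.transitions = defaultdict(dict)      # chunk -> {next token: value}

    def _matching_chunks(self):
        # Suffixes of the current memory that are already known chunks.
        seq = tuple(self.memory)
        return [seq[i:] for i in range(len(seq)) if seq[i:] in self.chunk_values]

    def predict(self):
        # Predict the next token from the longest matching chunk, if any.
        for chunk in sorted(self._matching_chunks(), key=len, reverse=True):
            options = self.transitions.get(chunk, {})
            if options:
                return max(options, key=options.get)
        return None

    def observe(self, token):
        # Reinforce chunks whose prediction matched the incoming token.
        prediction = self.predict()
        reward = 1.0 if prediction == token else -0.1
        for chunk in self._matching_chunks():
            self.chunk_values[chunk] += self.lr * reward
            nxt = self.transitions[chunk]
            nxt[token] = nxt.get(token, 0.0) + self.lr * reward
        # Chunking: extend the longest known context, or start a new bigram.
        matched = self._matching_chunks()
        if matched:
            new_chunk = max(matched, key=len) + (token,)
        elif self.memory:
            new_chunk = (self.memory[-1], token)
        else:
            new_chunk = None
        if new_chunk is not None and len(new_chunk) <= self.memory.maxlen:
            self.chunk_values.setdefault(new_chunk, 0.0)
        self.memory.append(token)


# Toy usage on an artificial, natural-like language.
learner = ChunkingLearner()
corpus = "the dog sees the cat . the cat sees the dog .".split() * 20
for word in corpus:
    learner.observe(word)
print(learner.predict())  # e.g. a plausible continuation such as "the"
```
In this toy version the sequence memory is just a bounded deque, chunks are suffixes of that memory that recur in the input, and correct next-token predictions reinforce the chunks that supported them; the original model's actual memory, chunking, and reward mechanisms may differ.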
Related papers
- Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback [14.938401898546553]
We propose to use a semi-structured form to represent reasoning steps of large language models.
Specifically, we use relation tuples, which are not only human-friendly but also machine-friendly and easier to verify than natural language.
arXiv Detail & Related papers (2024-06-25T18:21:00Z) - Large Language Models for Scientific Synthesis, Inference and Explanation [56.41963802804953]
We show how large language models can perform scientific synthesis, inference, and explanation.
We show that the large language model can augment its "knowledge" by synthesizing from the scientific literature.
This approach has the further advantage that the large language model can explain the machine learning system's predictions.
arXiv Detail & Related papers (2023-10-12T02:17:59Z) - Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge [57.38948190611797]
This paper proposes a novel lip reading framework designed especially for low-resource languages.
Because low-resource languages lack enough video-text paired data to train on, developing lip reading models for them is regarded as challenging.
arXiv Detail & Related papers (2023-08-18T05:19:03Z) - Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models [49.39276272693035]
Large-scale pre-trained language models have shown remarkable memorizing ability.
Vanilla neural networks without pre-training have long been observed to suffer from the catastrophic forgetting problem.
We find that 1) Vanilla language models are forgetful; 2) Pre-training leads to retentive language models; 3) Knowledge relevance and diversification significantly influence memory formation.
arXiv Detail & Related papers (2023-05-16T03:50:38Z) - Do As I Can, Not As I Say: Grounding Language in Robotic Affordances [119.29555551279155]
Large language models can encode a wealth of semantic knowledge about the world.
Such knowledge could be extremely useful to robots aiming to act upon high-level, temporally extended instructions expressed in natural language.
We show how low-level skills can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally-extended instructions.
arXiv Detail & Related papers (2022-04-04T17:57:11Z) - Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings.
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z) - Language Models are not Models of Language [0.0]
Transfer learning has enabled large deep learning neural networks trained on the language modeling task to vastly improve performance on downstream tasks.
We argue that the term language model is misleading because deep learning models are not theoretical models of language.
arXiv Detail & Related papers (2021-12-13T22:39:46Z) - CoLLIE: Continual Learning of Language Grounding from Language-Image Embeddings [2.8478710949588284]
CoLLIE is a model for continual learning of how language is grounded in vision.
It learns a transformation function that adjusts the language embeddings when needed to accommodate new language use.
We show that CoLLIE can efficiently learn and generalize from only a few examples.
arXiv Detail & Related papers (2021-11-15T18:54:58Z) - Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this inductive bias, in the form of a prior distribution, from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z) - Automated Source Code Generation and Auto-completion Using Deep Learning: Comparing and Discussing Current Language-Model-Related Approaches [0.0]
This paper compares different deep learning architectures to create and use language models based on programming code.
We discuss each approach's strengths and weaknesses, as well as the gaps we find in evaluating these language models or applying them in a real programming context.
arXiv Detail & Related papers (2020-09-16T15:17:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.