Crossmodal Language Grounding in an Embodied Neurocognitive Model
- URL: http://arxiv.org/abs/2006.13546v2
- Date: Fri, 16 Oct 2020 08:27:34 GMT
- Title: Crossmodal Language Grounding in an Embodied Neurocognitive Model
- Authors: Stefan Heinrich, Yuan Yao, Tobias Hinz, Zhiyuan Liu, Thomas Hummel,
Matthias Kerzel, Cornelius Weber, and Stefan Wermter
- Abstract summary: Human infants are able to acquire natural language seemingly easily at an early age.
From a neuroscientific perspective, natural language is embodied, grounded in most, if not all, sensory and sensorimotor modalities.
We present a neurocognitive model for language grounding which reflects bio-inspired mechanisms.
- Score: 28.461246169379685
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Human infants are able to acquire natural language seemingly easily at an
early age. Their language learning seems to occur simultaneously with learning
other cognitive functions as well as with playful interactions with the
environment and caregivers. From a neuroscientific perspective, natural
language is embodied, grounded in most, if not all, sensory and sensorimotor
modalities, and acquired by means of crossmodal integration. However,
characterising the underlying mechanisms in the brain is difficult and
explaining the grounding of language in crossmodal perception and action
remains challenging. In this paper, we present a neurocognitive model for
language grounding which reflects bio-inspired mechanisms such as an implicit
adaptation of timescales as well as end-to-end multimodal abstraction. It
addresses developmental robotic interaction and extends its learning
capabilities using larger-scale knowledge-based data. In our scenario, we
utilise the humanoid robot NICO in obtaining the EMIL data collection, in which
the cognitive robot interacts with objects in a children's playground
environment while receiving linguistic labels from a caregiver. The model
analysis shows that crossmodally integrated representations are sufficient for
acquiring language merely from sensory input through interaction with objects
in an environment. The representations self-organise hierarchically and embed
temporal and spatial information through composition and decomposition. This
model can also provide the basis for further crossmodal integration of
perceptually grounded cognitive representations.
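The abstract does not spell out the architecture, but "implicit adaptation of timescales" is commonly realised with leaky-integrator (continuous-time) recurrent units whose time constants differ per layer, so that fast units track sensory detail while slow units form abstract context. The following is a minimal NumPy sketch of that idea, combined with crossmodal fusion by concatenating modality-specific states into a shared slow layer; the class name, layer sizes, timescale values, and the random stand-in features are assumptions for illustration, not the authors' implementation.
```python
# Minimal sketch of a multiple-timescale recurrent layer and crossmodal fusion.
# Illustrative only; sizes, tau values, and inputs are assumed, not taken from the paper.
import numpy as np

class MultipleTimescaleLayer:
    def __init__(self, input_size, hidden_size, tau, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.normal(0, 0.1, (hidden_size, input_size))
        self.w_rec = rng.normal(0, 0.1, (hidden_size, hidden_size))
        self.b = np.zeros(hidden_size)
        # tau controls the unit's timescale: large tau -> slowly changing,
        # abstract context; tau near 1 -> fast, detail-tracking dynamics.
        self.tau = np.asarray(tau, dtype=float)
        self.u = np.zeros(hidden_size)  # internal (pre-activation) state

    def step(self, x):
        # Leaky integration: mix the previous state with new drive in proportion to 1/tau.
        z = self.w_in @ x + self.w_rec @ np.tanh(self.u) + self.b
        self.u = (1.0 - 1.0 / self.tau) * self.u + (1.0 / self.tau) * z
        return np.tanh(self.u)

# Toy crossmodal fusion: fast modality-specific layers feed one slow shared layer
# whose state could serve as the crossmodally integrated representation.
vision = MultipleTimescaleLayer(input_size=64, hidden_size=32, tau=2.0)
soma   = MultipleTimescaleLayer(input_size=16, hidden_size=16, tau=2.0)
audio  = MultipleTimescaleLayer(input_size=40, hidden_size=24, tau=2.0)
shared = MultipleTimescaleLayer(input_size=32 + 16 + 24, hidden_size=48, tau=16.0)

rng = np.random.default_rng(1)
for t in range(10):                        # one short interaction sequence
    hv = vision.step(rng.normal(size=64))  # stand-in visual features
    hs = soma.step(rng.normal(size=16))    # stand-in proprioceptive features
    ha = audio.step(rng.normal(size=40))   # stand-in auditory/label features
    h_shared = shared.step(np.concatenate([hv, hs, ha]))
print(h_shared.shape)                      # (48,) shared crossmodal state
```
The design choice of giving the shared layer a much larger tau than the modality layers is what lets it abstract over fast sensory fluctuations, mirroring the hierarchical, temporally organised representations the abstract describes.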
Related papers
- Lost in Translation: The Algorithmic Gap Between LMs and the Brain [8.799971499357499]
Language Models (LMs) have achieved impressive performance on various linguistic tasks, but their relationship to human language processing in the brain remains unclear.
This paper examines the gaps and overlaps between LMs and the brain at different levels of analysis.
We discuss how insights from neuroscience, such as sparsity, modularity, internal states, and interactive learning, can inform the development of more biologically plausible language models.
arXiv Detail & Related papers (2024-07-05T17:43:16Z)
- Language, Environment, and Robotic Navigation [0.0]
We propose a unified framework where language functions as an abstract communicative system and as a grounded representation of perceptual experiences.
Our review of cognitive models of distributional semantics and their application to autonomous agents underscores the transformative potential of language-integrated systems.
arXiv Detail & Related papers (2024-04-03T20:30:38Z)
- Development of Compositionality and Generalization through Interactive Learning of Language and Action of Robots [1.7624347338410742]
We propose a brain-inspired neural network model that integrates vision, proprioception, and language into a framework of predictive coding and active inference.
Our results show that generalization to unlearned verb-noun compositions is significantly enhanced when the variation of task compositions during training is increased.
arXiv Detail & Related papers (2024-03-29T06:22:37Z)
- Exploring Spatial Schema Intuitions in Large Language and Vision Models [8.944921398608063]
We investigate whether large language models (LLMs) effectively capture implicit human intuitions about the building blocks of language.
Surprisingly, correlations between model outputs and human responses emerge, revealing adaptability without a tangible connection to embodied experiences.
This research contributes to a nuanced understanding of the interplay between language, spatial experiences, and computations made by large language models.
arXiv Detail & Related papers (2024-02-01T19:25:50Z)
- MIMo: A Multi-Modal Infant Model for Studying Cognitive Development [3.5009119465343033]
We present MIMo, an open-source infant model for studying early cognitive development through computer simulations.
MIMo perceives its surroundings via binocular vision, a vestibular system, proprioception, and touch perception through a full-body virtual skin.
arXiv Detail & Related papers (2023-12-07T14:21:31Z)
- Enabling High-Level Machine Reasoning with Cognitive Neuro-Symbolic Systems [67.01132165581667]
We propose to enable high-level reasoning in AI systems by integrating cognitive architectures with external neuro-symbolic components.
We illustrate a hybrid framework centered on ACT-R and we discuss the role of generative models in recent and future applications.
arXiv Detail & Related papers (2023-11-13T21:20:17Z)
- Computational Language Acquisition with Theory of Mind [84.2267302901888]
We build language-learning agents equipped with Theory of Mind (ToM) and measure its effects on the learning process.
We find that training speakers with a highly weighted ToM listener component leads to performance gains in our image referential game setting.
arXiv Detail & Related papers (2023-03-02T18:59:46Z)
- Data-driven emotional body language generation for social robotics [58.88028813371423]
In social robotics, endowing humanoid robots with the ability to generate bodily expressions of affect can improve human-robot interaction and collaboration.
We implement a deep learning data-driven framework that learns from a few hand-designed robotic bodily expressions.
The evaluation study found that the anthropomorphism and animacy of the generated expressions are not perceived differently from the hand-designed ones.
arXiv Detail & Related papers (2022-05-02T09:21:39Z)
- Emergence of Machine Language: Towards Symbolic Intelligence with Neural Networks [73.94290462239061]
We propose to combine symbolism and connectionism principles by using neural networks to derive a discrete representation.
By designing an interactive environment and task, we demonstrated that machines could generate a spontaneous, flexible, and semantic language.
arXiv Detail & Related papers (2022-01-14T14:54:58Z)
- Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings.
We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI.
This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z)
- Cognitive architecture aided by working-memory for self-supervised multi-modal humans recognition [54.749127627191655]
The ability to recognize human partners is an important social skill to build personalized and long-term human-robot interactions.
Deep learning networks have achieved state-of-the-art results and have proven to be suitable tools for addressing such a task.
One solution is to make robots learn from their first-hand sensory data with self-supervision.
arXiv Detail & Related papers (2021-03-16T13:50:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.