Cognitive architecture aided by working-memory for self-supervised
multi-modal humans recognition
- URL: http://arxiv.org/abs/2103.09072v1
- Date: Tue, 16 Mar 2021 13:50:24 GMT
- Title: Cognitive architecture aided by working-memory for self-supervised
multi-modal humans recognition
- Authors: Jonas Gonzalez-Billandon, Giulia Belgiovine, Alessandra Sciutti,
Giulio Sandini, Francesco Rea
- Abstract summary: The ability to recognize human partners is an important social skill to build personalized and long-term human-robot interactions.
Deep learning networks have achieved state-of-the-art results and have proven to be suitable tools for this task.
One solution is to make robots learn from their first-hand sensory data with self-supervision.
- Score: 54.749127627191655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to recognize human partners is an important social skill to build
personalized and long-term human-robot interactions, especially in scenarios
like education, care-giving, and rehabilitation. Faces and voices constitute
two important sources of information to enable artificial systems to reliably
recognize individuals. Deep learning networks have achieved state-of-the-art
results and have proven to be suitable tools for this task. However, when
these networks are applied to novel scenarios not covered by the training
set, they can suffer a drop in performance. For example, on robotic platforms
operating in ever-changing, realistic environments, where new sensory
evidence is continually acquired, the performance of these models degrades.
One solution is to make robots learn from their first-hand sensory data with
self-supervision, which allows them to cope with the inherent variability of
the data gathered in realistic and interactive contexts. To this aim, we
propose a cognitive architecture integrating low-level perceptual processes
with a spatial working memory mechanism. The architecture autonomously
organizes the robot's sensory experience into a structured dataset suitable for
human recognition. Our results demonstrate the effectiveness of our
architecture and show that it is a promising solution in the quest to make
robots more autonomous in their learning process.
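The architecture couples low-level perception with a spatial working memory that turns the robot's raw sensory stream into pseudo-labelled training data. The paper is summarized here without code; the following is a minimal Python sketch of how such a memory could group face and voice embeddings by spatial proximity and similarity into a self-supervised dataset. The class name, thresholds, and embedding sizes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' code): a spatial working
# memory that clusters multimodal observations into per-person slots and
# exports (embedding, pseudo-label) pairs for self-supervised training.
import numpy as np

class SpatialWorkingMemory:
    def __init__(self, sim_threshold=0.8, dist_threshold=0.5):
        self.sim_threshold = sim_threshold    # cosine-similarity gate
        self.dist_threshold = dist_threshold  # spatial-proximity gate (meters)
        self.slots = []  # one slot per hypothesized person

    @staticmethod
    def _cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    def observe(self, face_emb, voice_emb, position):
        """Assign an observation to a slot; return its pseudo-label."""
        joint = np.concatenate([face_emb, voice_emb])
        for label, slot in enumerate(self.slots):
            near = np.linalg.norm(position - slot["position"]) < self.dist_threshold
            alike = self._cosine(joint, slot["centroid"]) > self.sim_threshold
            if near or alike:
                # Update the slot with a running mean and keep the sample.
                n = len(slot["samples"])
                slot["centroid"] = (slot["centroid"] * n + joint) / (n + 1)
                slot["position"] = position  # track the person's last location
                slot["samples"].append(joint)
                return label
        # No match: open a new slot for a newly encountered person.
        self.slots.append({"centroid": joint, "position": position,
                           "samples": [joint]})
        return len(self.slots) - 1

    def dataset(self):
        """Flatten memory into (embedding, pseudo-label) training pairs."""
        return [(x, label) for label, slot in enumerate(self.slots)
                for x in slot["samples"]]

# Usage with random stand-ins for real face/voice encoder outputs.
rng = np.random.default_rng(0)
wm = SpatialWorkingMemory()
for _ in range(5):
    wm.observe(rng.normal(size=128), rng.normal(size=64), rng.uniform(size=3))
pairs = wm.dataset()  # would feed a recognition network's training loop
```

In the actual architecture, positions and embeddings would come from the robot's tracking and perception modules rather than random vectors.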
Related papers
- Interactive Continual Learning Architecture for Long-Term
Personalization of Home Service Robots [11.648129262452116]
We develop a novel interactive architecture for continual learning of semantic knowledge in a home environment through human-robot interaction.
The architecture builds on core cognitive principles of learning and memory for efficient and real-time learning of new knowledge from humans.
arXiv Detail & Related papers (2024-03-06T04:55:39Z) - Teaching Unknown Objects by Leveraging Human Gaze and Augmented Reality
in Human-Robot Interaction [3.1473798197405953]
This dissertation aims to teach a robot unknown objects in the context of Human-Robot Interaction (HRI).
The combination of eye tracking and Augmented Reality created a powerful synergy that empowered the human teacher to communicate with the robot.
The robot's object detection capabilities exhibited comparable performance to state-of-the-art object detectors trained on extensive datasets.
arXiv Detail & Related papers (2023-12-12T11:34:43Z) - Real-time Addressee Estimation: Deployment of a Deep-Learning Model on
the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
arXiv Detail & Related papers (2023-11-09T13:01:21Z) - Human-oriented Representation Learning for Robotic Manipulation [64.59499047836637]
Humans inherently possess generalizable visual representations that empower them to efficiently explore and interact with their environment in manipulation tasks.
We formalize this idea through the lens of human-oriented multi-task fine-tuning on top of pre-trained visual encoders.
Our Task Fusion Decoder consistently improves the representation of three state-of-the-art visual encoders for downstream manipulation policy-learning.
arXiv Detail & Related papers (2023-10-04T17:59:38Z) - Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement
Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z) - CASPER: Cognitive Architecture for Social Perception and Engagement in
Robots [0.5918643136095765]
We present CASPER: a symbolic cognitive architecture that uses qualitative spatial reasoning to anticipate the pursued goal of another agent and to calculate the best collaborative behavior.
We have tested this architecture in a simulated kitchen environment, and the collected results show that the robot is able both to recognize an ongoing goal and to collaborate properly towards its achievement.
arXiv Detail & Related papers (2022-09-01T10:15:03Z) - Data-driven emotional body language generation for social robotics [58.88028813371423]
In social robotics, endowing humanoid robots with the ability to generate bodily expressions of affect can improve human-robot interaction and collaboration.
We implement a deep learning data-driven framework that learns from a few hand-designed robotic bodily expressions.
The evaluation study found that the anthropomorphism and animacy of the generated expressions are not perceived differently from the hand-designed ones.
arXiv Detail & Related papers (2022-05-02T09:21:39Z) - Low Dimensional State Representation Learning with Robotics Priors in
Continuous Action Spaces [8.692025477306212]
Reinforcement learning algorithms have proven to be capable of solving complicated robotics tasks in an end-to-end fashion.
We propose a framework combining the learning of a low-dimensional state representation, from high-dimensional observations coming from the robot's raw sensory readings, with the learning of the optimal policy.
arXiv Detail & Related papers (2021-07-04T15:42:01Z) - Where is my hand? Deep hand segmentation for visual self-recognition in
humanoid robots [129.46920552019247]
We propose the use of a Convolutional Neural Network (CNN) to segment the robot's hand from egocentric images.
We fine-tuned the Mask-RCNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
arXiv Detail & Related papers (2021-02-09T10:34:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.