Continual Learning from Synthetic Data for a Humanoid Exercise Robot
- URL: http://arxiv.org/abs/2102.10034v1
- Date: Fri, 19 Feb 2021 17:05:25 GMT
- Title: Continual Learning from Synthetic Data for a Humanoid Exercise Robot
- Authors: Nicolas Duczek, Matthias Kerzel, Stefan Wermter
- Abstract summary: In a practical scenario, a physical exercise is performed by an expert like a physiotherapist and then used as a reference for a humanoid robot like Pepper to give feedback on a patient's execution of the same exercise.
This paper tackles the first challenge (robustness to the user's positioning in the robot's field of view) by designing an architecture that tolerates translations and rotations relative to the center of the field of view.
For the second challenge, we allow the GWR to grow online on incremental data. For evaluation, we created a novel exercise dataset with virtual avatars called the Virtual-Squat dataset.
- Score: 15.297262564198972
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To detect and correct physical exercises, a Grow-When-Required
Network (GWR) with recurrent connections, episodic memory and a novel subnode
mechanism is developed to learn the spatiotemporal relationships of body
movements and poses. Once an exercise is performed, the pose and movement
information of each frame is stored in the GWR. For every frame, the current
pose and motion pair is compared against the GWR's predicted output, allowing
for feedback not only on the pose but also on the velocity of the motion. In a
practical scenario, a physical exercise is performed by an expert, such as a
physiotherapist, and then used as a reference for a humanoid robot like Pepper
to give feedback on a patient's execution of the same exercise. This approach,
however, comes with two challenges. First, the user's distance from the
humanoid robot and position in the robot's camera view must also be considered
by the GWR, requiring robustness to the user's positioning in the robot's
field of view. Second, since both pose and motion depend on the body
measurements of the original performer, the expert's exercise cannot easily be
used directly as a reference. This paper tackles the first challenge by
designing an architecture that tolerates translations and rotations relative
to the center of the field of view. For the second challenge, we allow the GWR
to grow online on incremental data. For evaluation, we created a novel
exercise dataset with virtual avatars, called the Virtual-Squat dataset.
Overall, we claim that our novel GWR-based architecture can use a learned
exercise reference for different body variations through continual online
learning while preventing catastrophic forgetting, enabling an engaging
long-term human-robot interaction with a humanoid robot.
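To make the feedback mechanism more concrete, the following Python sketch illustrates, in heavily simplified form, two ideas from the abstract: normalizing a skeleton so comparisons tolerate translation and body-size differences, and a grow-when-required style memory that either adapts its best-matching node or inserts a new one for each incoming (pose, motion) pair. All names (`normalize_pose`, `GWRSketch`), the thresholds, and the toy data are illustrative assumptions, not the authors' implementation; the real architecture additionally uses recurrent connections, episodic memory and a subnode mechanism that are not modeled here.

```python
import numpy as np


def normalize_pose(joints, root_idx=0):
    """Center a (J, 3) joint array on a root joint and scale it to unit size,
    a simple way to tolerate translation and body-size differences."""
    centered = joints - joints[root_idx]
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / (scale + 1e-8)


class GWRSketch:
    """Minimal grow-when-required flavored memory over (pose, motion) pairs."""

    def __init__(self, insert_threshold=0.3, lr=0.1):
        self.nodes = []                     # stored (pose, motion) prototypes
        self.insert_threshold = insert_threshold
        self.lr = lr

    def step(self, pose, motion):
        """Match the current frame against memory; adapt or grow, then return
        the best-matching prototype and its distance (used as an error signal)."""
        sample = np.concatenate([pose.ravel(), motion.ravel()])
        if not self.nodes:
            self.nodes.append(sample.copy())
            return sample, 0.0
        dists = [np.linalg.norm(sample - n) for n in self.nodes]
        best = int(np.argmin(dists))
        if dists[best] > self.insert_threshold:
            # Grow: no stored prototype explains this frame well enough.
            self.nodes.append(sample.copy())
        else:
            # Online adaptation: move the best-matching node toward the sample.
            self.nodes[best] += self.lr * (sample - self.nodes[best])
        return self.nodes[best], dists[best]


# Toy usage with random "skeletons": compare each frame against the memory's
# best match and split the error into pose and velocity components, mirroring
# the idea of giving feedback on both the pose and the speed of the motion.
rng = np.random.default_rng(0)
gwr = GWRSketch()
prev = normalize_pose(rng.normal(size=(15, 3)))
for _ in range(50):
    curr = normalize_pose(rng.normal(size=(15, 3)))
    motion = curr - prev                    # crude per-frame velocity proxy
    predicted, error = gwr.step(curr, motion)
    pose_error = np.linalg.norm(predicted[:curr.size] - curr.ravel())
    velocity_error = np.linalg.norm(predicted[curr.size:] - motion.ravel())
    prev = curr
print(f"prototypes stored after 50 frames: {len(gwr.nodes)}")
```

Note that in the standard GWR algorithm, node insertion is typically gated by an activity measure and habituation (firing) counters rather than a single distance threshold; the sketch collapses that into one threshold purely for readability.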
Related papers
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
- Robot Interaction Behavior Generation based on Social Motion Forecasting for Human-Robot Interaction [9.806227900768926]
We propose to model social motion forecasting in a shared human-robot representation space.
ECHO operates in the aforementioned shared space to predict the future motions of the agents encountered in social scenarios.
We evaluate our model in multi-person and human-robot motion forecasting tasks and obtain state-of-the-art performance by a large margin.
arXiv Detail & Related papers (2024-02-07T11:37:14Z)
- CARPE-ID: Continuously Adaptable Re-identification for Personalized Robot Assistance [16.948256303861022]
In today's Human-Robot Interaction (HRI) scenarios, a prevailing tendency exists to assume that the robot shall cooperate with the closest individual.
We propose a person re-identification module based on continual visual adaptation techniques.
We test the framework both in isolation, using recorded videos in a laboratory environment, and in an HRI scenario with a mobile robot.
arXiv Detail & Related papers (2023-10-30T10:24:21Z)
- ImitationNet: Unsupervised Human-to-Robot Motion Retargeting via Shared Latent Space [9.806227900768926]
This paper introduces a novel deep-learning approach for human-to-robot motion retargeting.
Our method does not require paired human-to-robot data, which facilitates its translation to new robots.
Our model outperforms existing works regarding human-to-robot similarity in terms of efficiency and precision.
arXiv Detail & Related papers (2023-09-11T08:55:04Z)
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
- Skeleton2Humanoid: Animating Simulated Characters for Physically-plausible Motion In-betweening [59.88594294676711]
Modern deep learning based motion synthesis approaches barely consider the physical plausibility of synthesized motions.
We propose a system, Skeleton2Humanoid, which performs physics-oriented motion correction at test time.
Experiments on the challenging LaFAN1 dataset show our system can outperform prior methods significantly in terms of both physical plausibility and accuracy.
arXiv Detail & Related papers (2022-10-09T16:15:34Z)
- QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars [80.05743236282564]
Real-time tracking of human body motion is crucial for immersive experiences in AR/VR.
We present a reinforcement learning framework that takes in sparse signals from an HMD and two controllers.
We show that a single policy can be robust to diverse locomotion styles, different body sizes, and novel environments.
arXiv Detail & Related papers (2022-09-20T00:25:54Z)
- Occlusion-Robust Multi-Sensory Posture Estimation in Physical Human-Robot Interaction [10.063075560468798]
We use 2D postures from OpenPose over a single camera, and the trajectory of the interacting robot while the human performs a task.
We show that our multi-sensory system resolves human kinematic redundancy better than posture estimation solely using OpenPose or posture estimation solely using the robot's trajectory.
arXiv Detail & Related papers (2022-08-12T20:41:09Z)
- GIMO: Gaze-Informed Human Motion Prediction in Context [75.52839760700833]
We propose a large-scale human motion dataset that delivers high-quality body pose sequences, scene scans, and ego-centric views with eye gaze.
Our data collection is not tied to specific scenes, which further boosts the motion dynamics observed from our subjects.
To realize the full potential of gaze, we propose a novel network architecture that enables bidirectional communication between the gaze and motion branches.
arXiv Detail & Related papers (2022-04-20T13:17:39Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- Few-Shot Visual Grounding for Natural Human-Robot Interaction [0.0]
We propose a software architecture that segments a target object from a crowded scene, indicated verbally by a human user.
At the core of our system, we employ a multi-modal deep neural network for visual grounding.
We evaluate the performance of the proposed model on real RGB-D data collected from public scene datasets.
arXiv Detail & Related papers (2021-03-17T15:24:02Z)