Bidirectional Progressive Neural Networks with Episodic Return Progress
for Emergent Task Sequencing and Robotic Skill Transfer
- URL: http://arxiv.org/abs/2403.04001v1
- Date: Wed, 6 Mar 2024 19:17:49 GMT
- Title: Bidirectional Progressive Neural Networks with Episodic Return Progress
for Emergent Task Sequencing and Robotic Skill Transfer
- Authors: Suzan Ece Ada, Hanne Say, Emre Ugur, Erhan Oztop
- Abstract summary: We introduce a novel multi-task reinforcement learning framework named Episodic Return Progress with Bidirectional Progressive Neural Networks (ERP-BPNN).
The proposed ERP-BPNN model learns in a human-like interleaved manner through autonomous task switching based on a novel intrinsic motivation signal.
We show that ERP-BPNN achieves faster cumulative convergence and improves performance in all metrics considered among morphologically different robots compared to the baselines.
- Score: 1.7205106391379026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The human brain and behavior provide a rich venue that can inspire novel control
and learning methods for robotics. In an attempt to exemplify such a
development by drawing inspiration from how humans acquire knowledge and transfer skills among
tasks, we introduce a novel multi-task reinforcement learning framework named
Episodic Return Progress with Bidirectional Progressive Neural Networks
(ERP-BPNN). The proposed ERP-BPNN model (1) learns in a human-like interleaved
manner by (2) autonomous task switching based on a novel intrinsic motivation
signal and, in contrast to existing methods, (3) allows bidirectional skill
transfer among tasks. ERP-BPNN is a general architecture applicable to several
multi-task learning settings; in this paper, we present the details of its
neural architecture and show its ability to enable effective learning and skill
transfer among morphologically different robots in a reaching task. The
developed Bidirectional Progressive Neural Network (BPNN) architecture enables
bidirectional skill transfer without requiring incremental training and
seamlessly integrates with online task arbitration. The task arbitration
mechanism developed is based on soft Episodic Return Progress (ERP), a novel
intrinsic motivation (IM) signal. To evaluate our method, we use quantifiable
robotics metrics such as 'expected distance to goal' and 'path straightness' in
addition to the usual reward-based measure of episodic return common in
reinforcement learning. With simulation experiments, we show that ERP-BPNN
achieves faster cumulative convergence and improves performance in all metrics
considered among morphologically different robots compared to the baselines.
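
The abstract does not give the formula for the soft ERP signal, so the following is only an illustrative sketch of progress-based task arbitration, not the authors' method. It assumes that progress for a task is the absolute difference between the mean episodic return of a recent window and that of the preceding window, and that the next task is sampled from a softmax over these progress values. The class name `SoftERPArbiter`, its methods, and the `window` and `temperature` parameters are all hypothetical.

```python
import math
import random
from collections import deque


class SoftERPArbiter:
    """Toy task arbiter (hypothetical): favors tasks whose returns are changing."""

    def __init__(self, num_tasks, window=5, temperature=1.0, seed=0):
        self.window = window
        self.temperature = temperature
        self.rng = random.Random(seed)
        # Keep only the last 2 * window episodic returns for each task.
        self.returns = [deque(maxlen=2 * window) for _ in range(num_tasks)]

    def record(self, task, episodic_return):
        """Store the return of one finished episode of `task`."""
        self.returns[task].append(float(episodic_return))

    def progress(self, task):
        """Assumed proxy: |mean(recent window) - mean(older window)|."""
        hist = list(self.returns[task])
        if len(hist) < 2 * self.window:
            return None  # not enough episodes to estimate progress
        older, recent = hist[: self.window], hist[self.window:]
        return abs(sum(recent) - sum(older)) / self.window

    def probabilities(self):
        """Softmax over progress values (the 'soft' part of the selection)."""
        prog = [self.progress(t) for t in range(len(self.returns))]
        known = [p for p in prog if p is not None]
        # Assumption: tasks with too little history are treated optimistically.
        default = max(known) if known else 0.0
        scores = [(default if p is None else p) / self.temperature for p in prog]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def select_task(self):
        """Sample the index of the next task to train on."""
        probs = self.probabilities()
        return self.rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

In this sketch, a task whose returns are still changing is sampled more often than one that has plateaued, which produces interleaved training without a fixed task schedule. The temperature controls how sharply selection concentrates on the task with the highest progress.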
Related papers
- Enhancing learning in spiking neural networks through neuronal heterogeneity and neuromodulatory signaling [52.06722364186432]
We propose a biologically-informed framework for enhancing artificial neural networks (ANNs)
Our proposed dual-framework approach highlights the potential of spiking neural networks (SNNs) for emulating diverse spiking behaviors.
We outline how the proposed approach integrates brain-inspired compartmental models and task-driven SNNs, bioinspiration and complexity.
arXiv Detail & Related papers (2024-07-05T14:11:28Z)
- A Survey on Vision-Language-Action Models for Embodied AI [71.16123093739932]
Vision-language-action models (VLAs) have become a foundational element in robot learning.
Various methods have been proposed to enhance traits such as versatility, dexterity, and generalizability.
VLAs serve as high-level task planners capable of decomposing long-horizon tasks into executable subtasks.
arXiv Detail & Related papers (2024-05-23T01:43:54Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Neural Routing in Meta Learning [9.070747377130472]
We aim to improve the model performance of the current meta learning algorithms by selectively using only parts of the model conditioned on the input tasks.
In this work, we describe an approach that investigates task-dependent dynamic neuron selection in deep convolutional neural networks (CNNs) by leveraging the scaling factor in the batch normalization layer.
We find that the proposed approach, neural routing in meta learning (NRML), outperforms one of the well-known existing meta learning baselines on few-shot classification tasks.
arXiv Detail & Related papers (2022-10-14T16:31:24Z)
- The Spike Gating Flow: A Hierarchical Structure Based Spiking Neural Network for Online Gesture Recognition [12.866549161582412]
We develop a novel brain-inspired Spiking Neural Network (SNN) based system titled Spiking Gating Flow (SGF) for online action learning.
To the best of our knowledge, this is the highest accuracy among the non-backpropagation algorithm based SNNs.
arXiv Detail & Related papers (2022-06-04T04:37:56Z)
- Avoiding Catastrophe: Active Dendrites Enable Multi-Task Learning in Dynamic Environments [0.5277756703318046]
A key challenge for AI is to build embodied systems that operate in dynamically changing environments.
Standard deep learning systems often struggle in dynamic scenarios.
In this article we investigate biologically inspired architectures as solutions.
arXiv Detail & Related papers (2021-12-31T19:52:42Z)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
arXiv Detail & Related papers (2020-12-04T18:59:32Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Recurrent Neural Network Learning of Performance and Intrinsic Population Dynamics from Sparse Neural Data [77.92736596690297]
We introduce a novel training strategy that allows learning not only the input-output behavior of an RNN but also its internal network dynamics.
We test the proposed method by training an RNN to simultaneously reproduce internal dynamics and output signals of a physiologically-inspired neural model.
Remarkably, we show that the reproduction of the internal dynamics is successful even when the training algorithm relies on the activities of a small subset of neurons.
arXiv Detail & Related papers (2020-05-05T14:16:54Z)
- Indirect and Direct Training of Spiking Neural Networks for End-to-End Control of a Lane-Keeping Vehicle [12.137685936113384]
Building spiking neural networks (SNNs) based on biological synaptic plasticities holds a promising potential for accomplishing fast and energy-efficient computing.
In this paper, we introduce both indirect and direct end-to-end training methods of SNNs for a lane-keeping vehicle.
arXiv Detail & Related papers (2020-03-10T09:35:46Z)
- On Simple Reactive Neural Networks for Behaviour-Based Reinforcement Learning [5.482532589225552]
We present a behaviour-based reinforcement learning approach, inspired by Brooks' subsumption architecture.
Our working assumption is that a pick and place robotic task can be simplified by leveraging domain knowledge of a robotics developer.
Our approach learns the pick and place task in 8,000 episodes, which represents a drastic reduction in the number of training episodes required by an end-to-end approach.
arXiv Detail & Related papers (2020-01-22T11:49:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.