Learning Compositional Neural Programs for Continuous Control
- URL: http://arxiv.org/abs/2007.13363v2
- Date: Tue, 13 Apr 2021 12:08:39 GMT
- Title: Learning Compositional Neural Programs for Continuous Control
- Authors: Thomas Pierrot, Nicolas Perrin, Feryal Behbahani, Alexandre Laterre,
Olivier Sigaud, Karim Beguir, Nando de Freitas
- Abstract summary: We propose a novel solution to challenging sparse-reward, continuous control problems.
Our solution, dubbed AlphaNPI-X, involves three separate stages of learning.
We empirically show that AlphaNPI-X can effectively learn to tackle challenging sparse manipulation tasks.
- Score: 62.80551956557359
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel solution to challenging sparse-reward, continuous control
problems that require hierarchical planning at multiple levels of abstraction.
Our solution, dubbed AlphaNPI-X, involves three separate stages of learning.
First, we use off-policy reinforcement learning algorithms with experience
replay to learn a set of atomic goal-conditioned policies, which can be easily
repurposed for many tasks. Second, we learn self-models describing the effect
of the atomic policies on the environment. Third, the self-models are harnessed
to learn recursive compositional programs with multiple levels of abstraction.
The key insight is that the self-models enable planning by imagination,
obviating the need for interaction with the world when learning higher-level
compositional programs. To accomplish the third stage of learning, we extend
the AlphaNPI algorithm, which applies AlphaZero to learn recursive neural
programmer-interpreters. We empirically show that AlphaNPI-X can effectively
learn to tackle challenging sparse manipulation tasks, such as stacking
multiple blocks, where powerful model-free baselines fail.
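The three stages above can be pictured in a few lines of code. This is a hedged sketch only: all names (`SelfModel`, `plan_by_imagination`) are illustrative stand-ins, the atomic policies are assumed to shift the state by fixed deltas, and the greedy search stands in for the paper's AlphaZero-style (AlphaNPI) search over programs.

```python
import numpy as np

class SelfModel:
    """Stage 2 stand-in: predicts the state left behind by an atomic policy.
    Here each atomic policy is assumed to shift the state by a fixed delta."""
    def __init__(self, deltas):
        self.deltas = np.asarray(deltas, dtype=float)

    def predict(self, state, policy_id):
        # Imagined effect of atomic policy `policy_id`; no environment needed.
        return state + self.deltas[policy_id]

def plan_by_imagination(model, state, goal, horizon=3):
    """Stage 3, radically simplified: greedily compose atomic policies using
    only the self-model ("planning by imagination"); the paper instead runs
    an AlphaZero-style search over recursive programs."""
    program = []
    for _ in range(horizon):
        candidates = range(len(model.deltas))
        best = min(candidates,
                   key=lambda i: np.linalg.norm(model.predict(state, i) - goal))
        state = model.predict(state, best)
        program.append(best)
    return program, state
```

With two atomic policies that each move the state along one axis, the imagined rollout composes a program that reaches the goal without any real-world interaction, which is the key insight the abstract describes.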
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z) - KnowPC: Knowledge-Driven Programmatic Reinforcement Learning for Zero-shot Coordination [11.203441390685201]
Zero-shot coordination (ZSC) remains a major challenge in the cooperative AI field.
We introduce KnowPC, a knowledge-driven programmatic reinforcement learning approach for ZSC.
A significant challenge is the vast program search space, making it difficult to find high-performing programs efficiently.
arXiv Detail & Related papers (2024-08-08T09:43:54Z) - Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle.
We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths.
Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, as is the case with many existing continual learning algorithms.
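The "set of information processing paths" can be pictured as a mixture-of-experts forward pass. A minimal sketch, assuming linear experts and a linear gate (the paper's Mixture-of-Variational-Experts layer is variational and considerably more elaborate; all names here are illustrative):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mixture_of_experts(x, experts, gate_w):
    """Route input x through several expert paths and mix their outputs.
    experts: list of weight matrices (one linear map per path).
    gate_w: weight matrix producing one routing score per expert."""
    gates = softmax(gate_w @ x)                   # soft routing weights
    outputs = np.stack([w @ x for w in experts])  # each path's output
    return gates @ outputs                        # convex combination of paths
```

Because different inputs can be routed to different paths, updating one path need not overwrite what another path has learned, which is the intuition behind alleviating forgetting.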
arXiv Detail & Related papers (2022-11-14T19:53:15Z) - Active Predictive Coding: A Unified Neural Framework for Learning Hierarchical World Models for Perception and Planning [1.3535770763481902]
We propose a new framework for predictive coding called active predictive coding.
It can learn hierarchical world models and solve two radically different open problems in AI.
arXiv Detail & Related papers (2022-10-23T05:44:22Z) - Learning Neuro-Symbolic Skills for Bilevel Planning [63.388694268198655]
Decision-making is challenging in robotics environments with continuous object-centric states, continuous actions, long horizons, and sparse feedback.
Hierarchical approaches, such as task and motion planning (TAMP), address these challenges by decomposing decision-making into two or more levels of abstraction.
Our main contribution is a method for learning parameterized policies in combination with operators and samplers.
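The bilevel structure can be sketched as a symbolic search whose operators are grounded by samplers over continuous parameters. A minimal, illustrative version (all names are hypothetical; the paper learns the policies and samplers, whereas here they are hand-written callables):

```python
def bilevel_plan(start, goal, operators, max_tries=10, max_steps=100):
    """High level: choose among symbolic operators.
    Low level: each operator's sampler proposes continuous parameters,
    kept only if they pass a feasibility check.
    operators: list of (name, applicable, sampler, feasible, apply) callables."""
    plan, state = [], start
    for _ in range(max_steps):
        if state == goal:
            return plan
        progressed = False
        for name, applicable, sampler, feasible, apply in operators:
            if not applicable(state):
                continue
            for _ in range(max_tries):  # refinement: sample continuous params
                params = sampler(state)
                if feasible(state, params):
                    state = apply(state, params)
                    plan.append((name, params))
                    progressed = True
                    break
            if progressed:
                break
        if not progressed:
            return None                 # no feasible refinement found
    return plan if state == goal else None
```

In a toy counter domain with one "increment" operator, the planner returns a sequence of grounded operator applications, mirroring how TAMP interleaves symbolic choice and continuous refinement.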
arXiv Detail & Related papers (2022-06-21T19:01:19Z) - Learning to Synthesize Programs as Interpretable and Generalizable Policies [25.258598215642067]
We present a framework that learns to synthesize a program, which details the procedure to solve a task in a flexible and expressive manner.
Experimental results demonstrate that the proposed framework not only learns to reliably synthesize task-solving programs but also outperforms DRL and program synthesis baselines.
arXiv Detail & Related papers (2021-08-31T07:03:06Z) - Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention [67.1936055742498]
We show that multi-task learning can effectively scale reset-free learning schemes to much more complex problems.
This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
arXiv Detail & Related papers (2021-04-22T17:38:27Z) - Episodic Self-Imitation Learning with Hindsight [7.743320290728377]
Episodic self-imitation learning is a novel self-imitation algorithm with a trajectory selection module and an adaptive loss function.
A selection module is introduced to filter uninformative samples from each episode during the update.
Episodic self-imitation learning has the potential to be applied to real-world problems that have continuous action spaces.
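The hindsight mechanism this algorithm builds on can be sketched in a few lines: a failed episode is relabeled so that the state actually reached becomes the goal, turning the trajectory into a demonstration worth imitating. The episode format and success rule below are illustrative assumptions, not the paper's exact implementation:

```python
def relabel_with_hindsight(episode, achieved):
    """episode: list of (state, action, goal) transitions from one rollout.
    Returns a copy whose goal is the state that was actually achieved."""
    return [(state, action, achieved) for state, action, _ in episode]

def is_success(transition):
    """Illustrative success rule: the state matches the goal exactly."""
    state, _, goal = transition
    return state == goal
```

An episode that never reached its original goal becomes, after relabeling, a successful demonstration for the goal it did reach.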
arXiv Detail & Related papers (2020-11-26T20:36:42Z) - Deep Imitation Learning for Bimanual Robotic Manipulation [70.56142804957187]
We present a deep imitation learning framework for robotic bimanual manipulation.
A core challenge is to generalize the manipulation skills to objects in different locations.
We propose to (i) decompose the multi-modal dynamics into elemental movement primitives, (ii) parameterize each primitive using a recurrent graph neural network to capture interactions, and (iii) integrate a high-level planner that composes primitives sequentially and a low-level controller to combine primitive dynamics and inverse kinematics control.
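The high-level/low-level split in (iii) can be sketched with waypoint primitives tracked by a proportional controller. This is a deliberately tiny stand-in: the recurrent graph-network parameterization in (ii) and the inverse-kinematics control are not modeled, and all names are illustrative.

```python
def low_level_controller(pos, target, gain=0.5, steps=20):
    """Track one primitive's target with a proportional step rule."""
    for _ in range(steps):
        pos = pos + gain * (target - pos)
    return pos

def execute_plan(pos, primitives):
    """High-level planner output: a sequence of primitives, run in order."""
    for target in primitives:
        pos = low_level_controller(pos, target)
    return pos
```

Sequencing two waypoint primitives drives the (one-dimensional) state to the final target, illustrating how a planner composes primitives that a controller executes.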
arXiv Detail & Related papers (2020-10-11T01:40:03Z) - SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
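Both ingredients can be sketched with a toy Q-ensemble. Array shapes and the exact weighting constants below are illustrative; what the sketch preserves is the structure the abstract describes: target weights that decrease with ensemble disagreement, and action selection by an upper-confidence bound.

```python
import numpy as np

def weighted_bellman_targets(q_next_ensemble, rewards, gamma=0.99, temp=10.0):
    """(a) Down-weight Bellman targets where the Q-ensemble disagrees:
    high ensemble std => uncertain target => small weight."""
    mean_q = q_next_ensemble.mean(axis=0)         # ensemble mean per sample
    std_q = q_next_ensemble.std(axis=0)           # ensemble disagreement
    targets = rewards + gamma * mean_q
    weights = 1.0 / (1.0 + np.exp(std_q * temp))  # sigmoid(-std * temp)
    return targets, weights

def ucb_action(q_action_ensemble, lam=1.0):
    """(b) Explore optimistically: maximize mean + lam * std over actions."""
    mean_q = q_action_ensemble.mean(axis=0)
    std_q = q_action_ensemble.std(axis=0)
    return int(np.argmax(mean_q + lam * std_q))
```

A sample whose next-state Q-values disagree across the ensemble receives a smaller weight in the backup, while the UCB rule prefers actions whose value estimates are both high and uncertain.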
arXiv Detail & Related papers (2020-07-09T17:08:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.