Learning to Guide and to Be Guided in the Architect-Builder Problem
- URL: http://arxiv.org/abs/2112.07342v1
- Date: Tue, 14 Dec 2021 12:57:27 GMT
- Title: Learning to Guide and to Be Guided in the Architect-Builder Problem
- Authors: Barde Paul, Karch Tristan, Nowrouzezahrai Derek, Moulin-Frier Clément, Pal Christopher, Oudeyer Pierre-Yves
- Abstract summary: We are interested in interactive agents that learn to coordinate, namely, a $builder$, which performs actions but does not know the goal of the task, and an $architect$, which guides the builder towards that goal.
We propose Architect-Builder Iterated Guiding (ABIG) as a solution to the Architect-Builder Problem.
ABIG results in a low-level, high-frequency guiding communication protocol that enables an architect-builder pair to solve the task at hand.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We are interested in interactive agents that learn to coordinate, namely, a
$builder$ -- which performs actions but does not know the goal of the task -- and an
$architect$ -- which guides the builder towards the goal of the task. We define
and explore a formal setting where artificial agents are equipped with
mechanisms that allow them to simultaneously learn a task while at the same
time evolving a shared communication protocol. The field of Experimental
Semiotics has shown how proficient humans are at learning from instructions
whose meanings are a priori unknown. We therefore take inspiration from it and
present the Architect-Builder Problem (ABP): an asymmetrical setting in which
an architect must learn to guide a builder towards constructing a specific
structure. The architect knows the target structure but cannot act in the
environment and can only send arbitrary messages to the builder. The builder,
on the other hand, can act in the environment but has no knowledge about the
task at hand and must learn to solve it relying only on the messages sent by
the architect. Crucially, the meaning of messages is initially neither defined
nor shared between the agents but must be negotiated throughout learning. Under
these constraints, we propose Architect-Builder Iterated Guiding (ABIG), a
solution to the Architect-Builder Problem in which the architect leverages a
learned model of the builder to guide it while the builder uses self-imitation
learning to reinforce its guided behavior. We analyze the key learning
mechanisms of ABIG and test it in a two-dimensional instantiation of the ABP
where tasks involve grasping cubes, placing them at a given location, or
building various shapes. In this environment, ABIG results in a low-level,
high-frequency guiding communication protocol that not only enables an
architect-builder pair to solve the task at hand, but can also generalize
to unseen tasks.
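The abstract describes an iterated loop: the architect guides through a learned model of the builder, and the builder reinforces its guided behavior with self-imitation learning. The toy sketch below illustrates that loop under strong simplifying assumptions and is not the authors' implementation; the one-dimensional block-pushing environment, the tabular builder policy and builder model, and every name in it (`step`, `architect_message`, `builder_policy`, `builder_model_counts`) are hypothetical choices made only to keep the example self-contained and runnable.

```python
# Minimal, hypothetical sketch of an architect-builder guiding loop in the
# spirit of ABIG. All components here are illustrative simplifications.
import numpy as np

rng = np.random.default_rng(0)

N_CELLS, N_MESSAGES, N_ACTIONS = 5, 4, 3   # positions on a line, message vocabulary, {left, stay, right}
GOAL = 4                                   # target cell, known only to the architect

def step(pos, action):
    """Toy environment: the builder pushes a block left/right along a line of N_CELLS cells."""
    return int(np.clip(pos + (action - 1), 0, N_CELLS - 1))

# Builder: a tabular policy pi(action | position, message), initially uniform
# because messages carry no pre-agreed meaning.
builder_policy = np.ones((N_CELLS, N_MESSAGES, N_ACTIONS)) / N_ACTIONS

# Architect: a learned (here, count-based) model of the builder,
# p_hat(action | position, message), estimated from observed interactions.
builder_model_counts = np.ones((N_CELLS, N_MESSAGES, N_ACTIONS))

def architect_message(pos):
    """Pick the message whose predicted builder response moves the block towards GOAL."""
    p_hat = builder_model_counts[pos] / builder_model_counts[pos].sum(axis=1, keepdims=True)
    desired = 2 if GOAL > pos else (0 if GOAL < pos else 1)   # right / left / stay
    return int(np.argmax(p_hat[:, desired]))

for iteration in range(50):
    # Interaction phase: the architect guides, and both agents record the episode.
    pos, episode = int(rng.integers(N_CELLS)), []
    for _ in range(10):
        msg = architect_message(pos)
        action = int(rng.choice(N_ACTIONS, p=builder_policy[pos, msg]))
        builder_model_counts[pos, msg, action] += 1            # architect refines its builder model
        episode.append((pos, msg, action))
        pos = step(pos, action)

    # Learning phase: the builder self-imitates its own guided behaviour
    # (a crude behavioural-cloning update towards the actions it just produced).
    for s, m, a in episode:
        builder_policy[s, m, a] += 0.5
        builder_policy[s, m] /= builder_policy[s, m].sum()

print("final position:", pos, "target:", GOAL)
```

In this sketch the self-imitation step is reduced to reinforcing whatever actions the builder just produced in response to messages, which makes the builder increasingly predictable and lets the architect's messages acquire stable meanings; ABIG's actual interaction and learning phases are more elaborate, but this is the feedback loop the abstract describes.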
Related papers
- Embodied Instruction Following in Unknown Environments [66.60163202450954]
We propose an embodied instruction following (EIF) method for complex tasks in the unknown environment.
We build a hierarchical embodied instruction following framework including the high-level task planner and the low-level exploration controller.
For the task planner, we generate the feasible step-by-step plans for human goal accomplishment according to the task completion process and the known visual clues.
arXiv Detail & Related papers (2024-06-17T17:55:40Z)
- Building Optimal Neural Architectures using Interpretable Knowledge [15.66288233048004]
AutoBuild is a scheme which learns to align the latent embeddings of operations and architecture modules with the ground-truth performance of the architectures they appear in.
We show that by mining a relatively small set of evaluated architectures, AutoBuild can learn to build high-quality architectures directly or help to reduce search space to focus on relevant areas.
arXiv Detail & Related papers (2024-03-20T04:18:38Z)
- Towards an Interpretable Hierarchical Agent Framework using Semantic Goals [6.677083312952721]
This work introduces an interpretable hierarchical agent framework by combining planning and semantic goal directed reinforcement learning.
We evaluate our framework on a robotic block manipulation task and show that it performs better than other methods.
arXiv Detail & Related papers (2022-10-16T02:04:13Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Autonomous Open-Ended Learning of Tasks with Non-Stationary Interdependencies [64.0476282000118]
Intrinsic motivations have proven to generate a task-agnostic signal to properly allocate the training time amongst goals.
Most works in the field of intrinsically motivated open-ended learning focus on scenarios where goals are independent of each other; only a few have studied the autonomous acquisition of interdependent tasks.
In particular, we first deepen the analysis of a previous system, showing the importance of incorporating information about the relationships between tasks at a higher level of the architecture.
Then we introduce H-GRAIL, a new system that extends the previous one by adding a new learning layer to store the autonomously acquired sequences.
arXiv Detail & Related papers (2022-05-16T10:43:01Z)
- Learning to Execute Actions or Ask Clarification Questions [9.784428580459776]
We propose a new builder agent model capable of determining when to ask or execute instructions.
Experimental results show that our model achieves state-of-the-art performance on the collaborative building task.
arXiv Detail & Related papers (2022-04-18T15:36:02Z)
- Policy Architectures for Compositional Generalization in Control [71.61675703776628]
We introduce a framework for modeling entity-based compositional structure in tasks.
Our policies are flexible and can be trained end-to-end without requiring any action primitives.
arXiv Detail & Related papers (2022-03-10T06:44:24Z)
- Provable Hierarchical Lifelong Learning with a Sketch-based Modular Architecture [28.763868513396705]
We show that our architecture is theoretically able to learn tasks that can be solved by functions that are learnable given access to functions for other, previously learned tasks as subroutines.
We empirically show that some tasks that we can learn in this way are not learned by standard training methods in practice.
arXiv Detail & Related papers (2021-12-21T00:45:03Z)
- In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications [16.9640514047609]
We address the problem of building agents whose goal is to satisfy out-of-distribution (OOD) multi-task instructions expressed in temporal logic (TL).
Recent works provided evidence that the deep learning architecture is a key feature when teaching a DRL agent to solve OOD tasks in TL.
We present a novel deep learning architecture that induces agents to generate latent representations of their current goal given both the human instruction and the current observation from the environment.
arXiv Detail & Related papers (2021-10-18T16:53:31Z)
- ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments [85.81157224163876]
We combine Vision-and-Language Navigation, assembling of collected objects, and object referring expression comprehension, to create a novel joint navigation-and-assembly task, named ArraMon.
During this task, the agent is asked to find and collect different target objects one-by-one by navigating based on natural language instructions in a complex, realistic outdoor environment.
We present results for several baseline models (integrated and biased) and metrics (nDTW, CTC, rPOD, and PTC), and the large model-human performance gap demonstrates that our task is challenging and presents a wide scope for future work.
arXiv Detail & Related papers (2020-11-15T23:30:36Z)
- CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning [138.40338621974954]
CausalWorld is a benchmark for causal structure and transfer learning in a robotic manipulation environment.
Tasks consist of constructing 3D shapes from a given set of blocks - inspired by how children learn to build complex structures.
arXiv Detail & Related papers (2020-10-08T23:01:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.