CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives
- URL: http://arxiv.org/abs/2411.19787v1
- Date: Fri, 29 Nov 2024 15:49:06 GMT
- Title: CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives
- Authors: Armin Saghafian, Amirmohammad Izadi, Negin Hashemi Dijujin, Mahdieh Soleymani Baghshah,
- Abstract summary: Grounding the instruction in the environment is a key step in solving language-guided goal-reaching reinforcement learning problems.
We propose CAREL as a new framework to solve this problem using auxiliary loss functions inspired by video-text retrieval literature.
Results of our experiments suggest superior sample efficiency and systematic generalization for this framework in multi-modal reinforcement learning problems.
- Score: 2.4392539322920763
- License:
- Abstract: Grounding the instruction in the environment is a key step in solving language-guided goal-reaching reinforcement learning problems. In automated reinforcement learning, a key concern is to enhance the model's ability to generalize across various tasks and environments. In goal-reaching scenarios, the agent must comprehend the different parts of the instructions within the environmental context in order to complete the overall task successfully. In this work, we propose CAREL (Cross-modal Auxiliary REinforcement Learning) as a new framework to solve this problem using auxiliary loss functions inspired by video-text retrieval literature and a novel method called instruction tracking, which automatically keeps track of progress in an environment. The results of our experiments suggest superior sample efficiency and systematic generalization for this framework in multi-modal reinforcement learning problems. Our code base is available here.
Related papers
- Learning Task Representations from In-Context Learning [73.72066284711462]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning.
We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads.
We show that our method's effectiveness stems from aligning the distribution of the last hidden state with that of an optimally performing in-context-learned model.
arXiv Detail & Related papers (2025-02-08T00:16:44Z) - ARCLE: The Abstraction and Reasoning Corpus Learning Environment for Reinforcement Learning [9.134178145285693]
ARCLE is an environment designed to facilitate reinforcement learning research on the inductive reasoning benchmark.
We demonstrate that an agent with proximal policy optimization can learn individual tasks through ARCLE.
We propose several research directions and motivations for using ARCLE, including MAML, GFlowNets, and World Models.
arXiv Detail & Related papers (2024-07-30T13:11:45Z) - Exploiting Contextual Structure to Generate Useful Auxiliary Tasks [12.236546713167945]
Reinforcement learning requires interaction with an environment, which is expensive for robots.
We propose an approach that maximizes experience reuse while learning to solve a given task by generating and simultaneously learning useful auxiliary tasks.
arXiv Detail & Related papers (2023-03-09T05:11:30Z) - Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle.
We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths.
Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, as is the case with many existing continual learning algorithms.
arXiv Detail & Related papers (2022-11-14T19:53:15Z) - Discrete Factorial Representations as an Abstraction for Goal
Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally prove the expected return on out-of-distribution goals, while still allowing for specifying goals with expressive structure.
arXiv Detail & Related papers (2022-11-01T03:31:43Z) - AANG: Automating Auxiliary Learning [110.36191309793135]
We present an approach for automatically generating a suite of auxiliary objectives.
We achieve this by deconstructing existing objectives within a novel unified taxonomy, identifying connections between them, and generating new ones based on the uncovered structure.
This leads us to a principled and efficient algorithm for searching the space of generated objectives to find those most useful to a specified end-task.
arXiv Detail & Related papers (2022-05-27T16:32:28Z) - Fast Inference and Transfer of Compositional Task Structures for
Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z) - Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning [15.33496710690063]
We propose goal-aware cross-entropy (GACE) loss, that can be utilized in a self-supervised way.
We then devise goal-discriminative attention networks (GDAN) which utilize the goal-relevant information to focus on the given instruction.
arXiv Detail & Related papers (2021-10-25T14:24:39Z) - Attaining Interpretability in Reinforcement Learning via Hierarchical
Primitive Composition [3.1078562713129765]
We propose a novel hierarchical reinforcement learning algorithm that mitigates the aforementioned issues by decomposing the original task in a hierarchy.
We show how the proposed scheme can be employed in practice by solving a pick and place task with a 6 DoF manipulator.
arXiv Detail & Related papers (2021-10-05T05:59:31Z) - Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning.
The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior.
Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
arXiv Detail & Related papers (2020-09-10T14:16:58Z) - Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.