Learning to Query Internet Text for Informing Reinforcement Learning Agents
- URL: http://arxiv.org/abs/2205.13079v1
- Date: Wed, 25 May 2022 23:07:10 GMT
- Title: Learning to Query Internet Text for Informing Reinforcement Learning Agents
- Authors: Kolby Nottingham, Alekhya Pyla, Sameer Singh, Roy Fox
- Abstract summary: We tackle the problem of extracting useful information from natural language found in the wild.
We train reinforcement learning agents to learn to query these sources as a human would.
We show that our method correctly learns to execute queries to maximize reward in a reinforcement learning setting.
- Score: 36.69880704465014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalization to out-of-distribution tasks in reinforcement
learning is a challenging problem. One successful approach improves
generalization by conditioning policies on task or environment descriptions
that provide information about the current transition or reward functions.
Previously, these descriptions were often expressed as generated or
crowdsourced text. In this work, we begin to tackle the problem of extracting
useful information from natural language found in the wild (e.g. internet
forums, documentation, and wikis). These natural, pre-existing sources are
noisy and large, and they present novel challenges compared to previous
approaches. We propose to address these challenges by training reinforcement
learning agents to learn to query these sources as a human would, and we
experiment with how and when an agent should query. To address the *how*, we
demonstrate that pretrained QA models perform well at executing zero-shot
queries in our target domain. Using information retrieved by a QA model, we
train an agent to learn *when* it should execute queries. We show that our
method correctly learns to execute queries to maximize reward in a
reinforcement learning setting.
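To make the abstract's two design questions concrete, here is a minimal,
hypothetical sketch, assuming a HuggingFace extractive QA pipeline and a
Gym-style environment. The query template, `QUERY_COST`, and the observation
encoding are invented for illustration; this is not the authors' released
implementation.

```python
# Hypothetical sketch of the two design questions: *how* to query wild text
# with a pretrained QA model, and *when* an agent should pay to query.
from transformers import pipeline  # pretrained extractive QA, used zero-shot

qa_model = pipeline("question-answering")  # e.g., a SQuAD-finetuned model

def execute_query(question: str, wiki_text: str) -> str:
    """The *how*: answer a zero-shot query against noisy, pre-existing text."""
    result = qa_model(question=question, context=wiki_text)
    return result["answer"]  # answer span extracted from the source document

QUERY_ACTION = 0   # one extra discrete action prepended to the env's actions
QUERY_COST = 0.05  # assumed small penalty so that querying is not free

def augment_observation(obs, answer: str):
    """Attach the retrieved answer so the policy can condition on it."""
    return {"env_obs": obs, "retrieved": answer}

def step_with_query(env, policy, obs, wiki_text: str):
    """The *when*: the policy itself chooses between acting and querying."""
    action = policy(obs)
    if action == QUERY_ACTION:
        answer = execute_query("Which item is needed here?", wiki_text)
        return augment_observation(obs, answer), -QUERY_COST, False
    next_obs, reward, done, _ = env.step(action - 1)  # Gym-style env assumed
    return next_obs, reward, done
```

Under this setup, learning *when* to query reduces to ordinary policy
optimization over the enlarged discrete action space: any standard algorithm
(e.g., DQN or PPO) can learn to pay `QUERY_COST` only when the retrieved
information raises expected return.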
Related papers
- Online Continual Learning For Interactive Instruction Following Agents [20.100312650193228]
We argue that such a learning scenario is less realistic since a robotic agent is supposed to learn the world continuously as it explores and perceives it.
We propose two continual learning setups for embodied agents: learning new behaviors and learning new environments.
arXiv Detail & Related papers (2024-03-12T11:33:48Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration [17.27164535440641]
Posterior sampling is a promising approach, but it requires Bayesian inference and dynamic programming.
We show that even though partial models exclude relevant information from the environment, they can nevertheless lead to good policies.
arXiv Detail & Related papers (2023-02-08T18:35:24Z)
- Multi-Source Transfer Learning for Deep Model-Based Reinforcement Learning [0.6445605125467572]
A crucial challenge in reinforcement learning is to reduce the number of interactions with the environment that an agent requires to master a given task.
Transfer learning proposes to address this issue by re-using knowledge from previously learned tasks.
The goal of this paper is to address these issues with modular multi-source transfer learning techniques.
arXiv Detail & Related papers (2022-05-28T12:04:52Z)
- Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language [121.56329458876655]
We introduce two new environments: the grid-world-based Q-BabyAI and the text-based Q-TextWorld.
We propose the "Asking for Knowledge" (AFK) agent, which learns to generate language commands to query for meaningful knowledge.
arXiv Detail & Related papers (2022-05-12T14:20:31Z)
- BERTese: Learning to Speak to BERT [50.76152500085082]
We propose a method for automatically rewriting queries into "BERTese", a paraphrase query that is directly optimized towards better knowledge extraction.
We empirically show our approach outperforms competing baselines, obviating the need for complex pipelines.
arXiv Detail & Related papers (2021-03-09T10:17:22Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
- Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning [56.771557756836906]
We present a novel method that automatically learns a retrieval model alternately with the programmer from weak supervision.
Our system leads to state-of-the-art performance on a large-scale task for complex question answering over knowledge bases.
arXiv Detail & Related papers (2020-10-29T18:28:16Z)
- REALM: Retrieval-Augmented Language Model Pre-Training [37.3178586179607]
We augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia.
For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner.
We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA). (The underlying marginalization is sketched just after this list.)
arXiv Detail & Related papers (2020-02-10T18:40:59Z)
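For the REALM entry above, the factorization that its summary alludes to can
be written as a short math sketch, with $x$ the input, $y$ the prediction,
$z$ a retrieved document, and $\mathcal{Z}$ the retrieved set; the notation is
condensed from the REALM paper:

```latex
% The retrieved document z is a latent variable, marginalized out; the
% retriever is a dual encoder scored by an inner product of embeddings.
p(y \mid x) = \sum_{z \in \mathcal{Z}} p(y \mid x, z)\, p(z \mid x),
\qquad
p(z \mid x) \propto \exp\!\big(\mathrm{Embed}(x)^{\top} \mathrm{Embed}(z)\big)
```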
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this automatically generated list (including all information) and is not responsible for any consequences of its use.