Interactive Learning from Activity Description
- URL: http://arxiv.org/abs/2102.07024v1
- Date: Sat, 13 Feb 2021 22:51:11 GMT
- Title: Interactive Learning from Activity Description
- Authors: Khanh Nguyen, Dipendra Misra, Robert Schapire, Miro Dudík, Patrick Shafto
- Abstract summary: We present a novel interactive learning protocol that enables training request-fulfilling agents by verbally describing their activities.
Our protocol gives rise to a new family of interactive learning algorithms that offer complementary advantages against traditional algorithms like imitation learning (IL) and reinforcement learning (RL).
We develop an algorithm that practically implements this protocol and employ it to train agents in two challenging request-fulfilling problems using purely language-description feedback.
- Score: 11.068923430996575
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel interactive learning protocol that enables training
request-fulfilling agents by verbally describing their activities. Our protocol
gives rise to a new family of interactive learning algorithms that offer
complementary advantages against traditional algorithms like imitation learning
(IL) and reinforcement learning (RL). We develop an algorithm that practically
implements this protocol and employ it to train agents in two challenging
request-fulfilling problems using purely language-description feedback.
Empirical results demonstrate the strengths of our algorithm: compared to RL
baselines, it is more sample-efficient; compared to IL baselines, it achieves
competitive success rates while not requiring feedback providers to have
agent-specific expertise. We also provide theoretical guarantees of the
algorithm under certain assumptions on the teacher and the environment.
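As a rough illustration of the protocol, the sketch below shows the training loop the abstract implies: the agent attempts a request, a teacher describes the resulting activity in language, and the agent updates from that description. The `env`, `agent`, and `teacher` interfaces are hypothetical stand-ins, not the paper's actual API.

```python
# Minimal sketch of learning from activity descriptions (assumed
# interfaces, not the paper's implementation).

def train_from_activity_descriptions(agent, teacher, env, num_episodes):
    for _ in range(num_episodes):
        request = env.sample_request()            # e.g., "bring me a mug"
        trajectory = agent.execute(request, env)  # the agent's attempt
        # The teacher never grades or corrects the agent; it only
        # describes, in natural language, what the agent actually did.
        description = teacher.describe(trajectory)
        # The trajectory is a demonstration of the activity that the
        # description names, so the pair is a valid training example.
        agent.update(request=description, trajectory=trajectory)
```

Note the contrast the abstract draws: the describing teacher needs no agent-specific expertise (unlike an IL demonstrator), and the language description is a richer signal than a scalar reward (unlike RL).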
Related papers
- Enabling Real-Time Conversations with Minimal Training Costs [61.80370154101649]
This paper presents a new duplex decoding approach that enhances large language models with duplex ability, requiring minimal training.
Experimental results indicate that our proposed method significantly enhances the naturalness and human-likeness of user-AI interactions with minimal training costs.
arXiv Detail & Related papers (2024-09-18T06:27:26Z)
- Neural Active Learning Beyond Bandits [69.99592173038903]
We study both stream-based and pool-based active learning with neural network approximations.
We propose two algorithms based on the newly designed exploitation and exploration neural networks for stream-based and pool-based active learning.
arXiv Detail & Related papers (2024-04-18T21:52:14Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning must be near-optimal, and it lets the algorithm learn behaviors that improve over a potentially suboptimal human expert; a minimal sketch of this intervention-as-reward scheme follows the list below.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pairwise preference feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework that acquires exploratory trajectories enabling accurate learning of the hidden reward function; the underlying pairwise feedback model is sketched after the list below.
arXiv Detail & Related papers (2023-05-29T15:00:09Z)
- Inapplicable Actions Learning for Knowledge Transfer in Reinforcement Learning [3.194414753332705]
We show that learning inapplicable actions greatly improves the sample efficiency of RL algorithms.
Because the acquired knowledge is transferable, it can be reused in other tasks and domains to make learning more efficient; a minimal action-masking sketch follows the list below.
arXiv Detail & Related papers (2022-11-28T17:45:39Z)
- Batch Active Learning from the Perspective of Sparse Approximation [12.51958241746014]
Active learning enables efficient model training by leveraging interactions between machine learning agents and human annotators.
We study and propose a novel framework that formulates batch active learning from a sparse-approximation perspective.
Our active learning method aims to find an informative subset from the unlabeled data pool such that the corresponding training loss function approximates its full data pool counterpart.
arXiv Detail & Related papers (2022-11-01T03:20:28Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning [92.18524491615548]
Contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL).
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions.
Under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
arXiv Detail & Related papers (2022-07-29T17:29:08Z)
- Feedback Coding for Active Learning [15.239252118069762]
We develop an optimal transport-based feedback coding scheme for the task of active example selection.
We evaluate the resulting scheme, APM, on a variety of datasets and demonstrate learning performance comparable to existing active learning methods.
arXiv Detail & Related papers (2021-02-28T23:00:34Z)
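For the RLIF entry above, here is a minimal sketch of using intervention signals as rewards. It assumes a gymnasium-style environment whose `info` dict exposes a hypothetical `"expert_intervened"` flag; this is an illustration of the idea, not the paper's implementation.

```python
import gymnasium as gym

class InterventionRewardWrapper(gym.Wrapper):
    """Reward is -1 exactly when the human expert intervenes."""

    def step(self, action):
        obs, _, terminated, truncated, info = self.env.step(action)
        intervened = info.get("expert_intervened", False)  # assumed flag
        reward = -1.0 if intervened else 0.0  # task reward is discarded
        # An intervention also ends the agent's control of the episode,
        # so off-policy RL learns to avoid states that trigger one.
        return obs, reward, terminated or intervened, truncated, info
```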
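For the preference-based RL entry, here is a sketch of the standard pairwise feedback model such methods fit a reward to (a Bradley-Terry likelihood). The paper's reward-agnostic exploration scheme itself is not reproduced; `reward_model` is an assumed callable returning per-step rewards for a trajectory.

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_model, traj_a, traj_b, a_preferred):
    """Cross-entropy loss for an observed preference between two trajectories."""
    r_a = reward_model(traj_a).sum()  # total predicted reward, trajectory A
    r_b = reward_model(traj_b).sum()  # total predicted reward, trajectory B
    # Bradley-Terry model: P(A preferred over B) = sigmoid(r_a - r_b).
    target = torch.tensor(1.0 if a_preferred else 0.0)
    return F.binary_cross_entropy_with_logits(r_a - r_b, target)
```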
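For the inapplicable-actions entry, one common way to exploit such knowledge is to mask those actions out of the policy before sampling. This sketch assumes a boolean `applicable_mask` produced by whatever model has learned action applicability; the paper's own learning and transfer scheme is not reproduced here.

```python
import torch

def mask_inapplicable(logits, applicable_mask):
    """Remove actions predicted inapplicable in the current state."""
    # A logit of -inf gives that action probability 0 under softmax.
    return logits.masked_fill(~applicable_mask, float("-inf"))

# Example: 4 actions, actions at indices 1 and 3 predicted inapplicable.
logits = torch.zeros(4)
mask = torch.tensor([True, False, True, False])
probs = torch.softmax(mask_inapplicable(logits, mask), dim=-1)  # [0.5, 0, 0.5, 0]
```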