Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework
- URL: http://arxiv.org/abs/2207.01955v5
- Date: Fri, 24 May 2024 08:05:29 GMT
- Title: Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework
- Authors: Shunyu Liu, Kaixuan Chen, Na Yu, Jie Song, Zunlei Feng, Mingli Song
- Abstract summary: We introduce a novel initiative advisor-in-the-loop actor-critic framework, termed Ask-AC.
At the heart of Ask-AC are two complementary components, namely action requester and adaptive state selector.
Experimental results on both stationary and non-stationary environments demonstrate that the proposed framework significantly improves the learning efficiency of the agent.
- Score: 41.04606578479283
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the promising results achieved, state-of-the-art interactive reinforcement learning schemes rely on passively receiving supervision signals from advisor experts, in the form of either continuous monitoring or pre-defined rules, which inevitably results in a cumbersome and expensive learning process. In this paper, we introduce a novel initiative advisor-in-the-loop actor-critic framework, termed Ask-AC, that replaces the unilateral advisor-guidance mechanism with a bidirectional learner-initiative one, and thereby enables a customized and efficacious message exchange between learner and advisor. At the heart of Ask-AC are two complementary components, namely the action requester and the adaptive state selector, that can be readily incorporated into various discrete actor-critic architectures. The former allows the agent to proactively seek advisor intervention in the presence of uncertain states, while the latter identifies unstable states potentially missed by the former, especially when the environment changes, and then learns to promote the ask action on such states. Experimental results on both stationary and non-stationary environments and across different actor-critic backbones demonstrate that the proposed framework significantly improves the learning efficiency of the agent and achieves performance on par with that obtained by continuous advisor monitoring.
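The ask mechanism can be pictured as an uncertainty-gated query to the advisor. Below is a minimal sketch, assuming policy entropy as the uncertainty signal and a plain callable standing in for the advisor; the threshold and all names are illustrative, not the paper's API, and the adaptive state selector is omitted.

```python
import numpy as np

def policy_entropy(probs: np.ndarray) -> float:
    """Shannon entropy of a discrete action distribution."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def select_action(probs, advisor, state, entropy_threshold=0.8, rng=np.random):
    """Act autonomously when confident; otherwise ask the advisor.

    `advisor` is any callable state -> action; the entropy test is one
    plausible uncertainty proxy, not Ask-AC's exact criterion.
    """
    if policy_entropy(probs) > entropy_threshold:
        return advisor(state), True                      # advisor intervenes
    return int(rng.choice(len(probs), p=probs)), False   # agent acts alone

# Toy usage: a confident vs. a maximally uncertain distribution over 4 actions.
advisor = lambda s: 2  # hypothetical oracle that always recommends action 2
print(select_action(np.array([0.94, 0.02, 0.02, 0.02]), advisor, None))  # no ask
print(select_action(np.array([0.25, 0.25, 0.25, 0.25]), advisor, None))  # asks
```

In a full actor-critic loop, the advisor's action would also be logged so the critic can learn from the corrected transition.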
Related papers
- RILe: Reinforced Imitation Learning [60.63173816209543]
RILe is a novel trainer-student system that learns a dynamic reward function based on the student's performance and alignment with expert demonstrations.
RILe enables better performance in complex settings where traditional methods falter, outperforming existing methods by 2x in complex simulated robot-locomotion tasks.
arXiv Detail & Related papers (2024-06-12T17:56:31Z)
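A toy sketch of RILe's trainer-as-dynamic-reward idea described above: the reward combines alignment with an expert action and a trainer weight that adapts to student progress. Both the alignment term and the annealing schedule are illustrative assumptions, not RILe's actual update rules.

```python
import numpy as np

def trainer_reward(student_action: np.ndarray,
                   expert_action: np.ndarray,
                   trainer_weight: float) -> float:
    """Hypothetical dynamic reward: closeness to the expert's action,
    scaled by a trainer weight that adapts to the student's progress."""
    alignment = -float(np.linalg.norm(student_action - expert_action))
    return trainer_weight * alignment

# One plausible trainer behaviour: rely less on pure imitation as the
# student improves. This schedule is an assumption, not RILe's rule.
for step, progress in enumerate([0.1, 0.5, 0.9]):
    weight = 1.0 - 0.5 * progress
    r = trainer_reward(np.array([0.2, 0.1]), np.array([0.25, 0.0]), weight)
    print(step, round(r, 3))
```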
- BRNES: Enabling Security and Privacy-aware Experience Sharing in Multiagent Robotic and Autonomous Systems [0.15749416770494704]
We propose a novel MARL framework (BRNES) that selects a dynamic neighbor zone for each advisee at each learning step.
Our experiments show that our framework outperforms the state-of-the-art in terms of steps to goal, obtained reward, and time to goal.
arXiv Detail & Related papers (2023-08-02T16:57:19Z)
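The dynamic neighbor zone described above can be pictured as a radius around each advisee that changes over training; the shrinking schedule below is a made-up stand-in for the BRNES selection rule, and all names are illustrative.

```python
import numpy as np

def neighbor_zone(advisee_idx: int, positions: np.ndarray,
                  step: int, base_radius: float = 2.0,
                  decay: float = 0.002) -> np.ndarray:
    """Return indices of agents inside the advisee's current zone.

    The radius shrinks as training proceeds, limiting who may share
    experience with the advisee; this schedule is an illustrative
    stand-in for the BRNES selection rule.
    """
    radius = base_radius / (1.0 + decay * step)
    dists = np.linalg.norm(positions - positions[advisee_idx], axis=1)
    return np.where((dists <= radius) & (dists > 0))[0]  # exclude self

positions = np.array([[0.0, 0.0], [1.0, 0.5], [3.0, 3.0], [0.5, -0.5]])
print(neighbor_zone(0, positions, step=0))    # wide zone: agents 1 and 3
print(neighbor_zone(0, positions, step=500))  # shrunk zone: agent 3 only
```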
- Broad-persistent Advice for Interactive Reinforcement Learning Scenarios [2.0549239024359762]
We present a method for retaining and reusing provided knowledge, allowing trainers to give general advice relevant to more than just the current state.
Results obtained show that the use of broad-persistent advice substantially improves the performance of the agent.
arXiv Detail & Related papers (2022-10-11T06:46:27Z)
- ReAct: Temporal Action Detection with Relational Queries [84.76646044604055]
This work aims at advancing temporal action detection (TAD) using an encoder-decoder framework with action queries.
We first propose a relational attention mechanism in the decoder, which guides the attention among queries based on their relations.
Lastly, we propose to predict the localization quality of each action query at inference in order to distinguish high-quality queries.
arXiv Detail & Related papers (2022-07-14T17:46:37Z)
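The relational attention idea above, roughly: action queries attend to each other, but only along pairs deemed related. The sketch below uses a boolean mask with PyTorch's stock multi-head attention; the mask construction and sizes are assumptions, not ReAct's exact mechanism.

```python
import torch
import torch.nn as nn

embed_dim, n_queries = 32, 6
attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)

queries = torch.randn(1, n_queries, embed_dim)  # action queries

# Hypothetical relation structure: each query may attend to itself and
# to the queries it is related to; everything else is blocked.
related = torch.eye(n_queries, dtype=torch.bool)
for i, j in [(0, 1), (1, 0), (2, 3), (3, 2)]:
    related[i, j] = True
attn_mask = ~related  # True entries are *disallowed* attention pairs

out, weights = attn(queries, queries, queries, attn_mask=attn_mask)
print(out.shape, weights.shape)  # (1, 6, 32) (1, 6, 6)
```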
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse of an online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Persistent Rule-based Interactive Reinforcement Learning [0.5999777817331317]
Current interactive reinforcement learning research has been limited to interactions that offer relevant advice to the current state only.
We propose a persistent rule-based interactive reinforcement learning approach, i.e., a method for retaining and reusing provided knowledge.
Our experimental results show persistent advice substantially improves the performance of the agent while reducing the number of interactions required for the trainer.
arXiv Detail & Related papers (2021-02-04T06:48:57Z)
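The retained-and-reused advice described above can be sketched as a store of (predicate, action) rules that keep firing on later states, rather than advice that applies to one state and is then discarded; the rule representation below is illustrative, not the paper's.

```python
# Minimal sketch of persistent, rule-based advice: a trainer's hint is
# stored as a (predicate, advised_action) rule and reused on every
# future state satisfying the predicate.

rules = []  # list of (predicate, advised_action)

def add_advice(predicate, action):
    rules.append((predicate, action))

def advised_action(state):
    """Return a remembered advised action if any rule fires, else None."""
    for predicate, action in rules:
        if predicate(state):
            return action
    return None

# Trainer gives one general rule: "whenever x < 0, move right (action 1)".
add_advice(lambda s: s["x"] < 0, 1)

print(advised_action({"x": -2}))  # 1 -> the rule persists across states
print(advised_action({"x": 3}))   # None -> agent falls back to its policy
```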
- Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning [66.9937776799536]
The emerging vision-and-language navigation (VLN) problem aims at learning to navigate an agent to the target location in unseen photo-realistic environments.
The main challenges of VLN arise from two aspects: first, the agent needs to attend to the meaningful paragraphs of the language instruction corresponding to the dynamically varying visual environments.
We propose a cross-modal grounding module to equip the agent with a better ability to track the correspondence between the textual and visual modalities.
arXiv Detail & Related papers (2020-11-22T09:13:46Z)
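Cross-modal grounding, in miniature: the current visual state queries the instruction tokens, and the attention weights expose which words ground the current view. The encodings, sizes, and single-query setup below are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn

d = 64
cross_attn = nn.MultiheadAttention(d, num_heads=8, batch_first=True)

# Hypothetical encodings: 12 instruction tokens, 1 current visual state.
text_feats = torch.randn(1, 12, d)
visual_state = torch.randn(1, 1, d)

# The visual state attends over the instruction; the returned weights
# show which words ground the current view.
grounded, word_weights = cross_attn(visual_state, text_feats, text_feats)
print(grounded.shape)      # (1, 1, 64) language-grounded state feature
print(word_weights.shape)  # (1, 1, 12) attention over instruction words
```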
- Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on this approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z)
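The two-output-layer design can be sketched as a shared sequence encoder feeding a self-supervised next-item head and a Q-value head; the GRU backbone and the sizes below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TwoHeadRecModel(nn.Module):
    """Shared sequence encoder with two output layers, in the spirit of
    SQN: one head for self-supervised next-item prediction, one for
    per-item Q-values."""

    def __init__(self, n_items: int, d: int = 64):
        super().__init__()
        self.embed = nn.Embedding(n_items, d)
        self.encoder = nn.GRU(d, d, batch_first=True)
        self.ssl_head = nn.Linear(d, n_items)  # next-item logits
        self.q_head = nn.Linear(d, n_items)    # per-item Q-values

    def forward(self, item_seq: torch.Tensor):
        h, _ = self.encoder(self.embed(item_seq))
        state = h[:, -1]  # last hidden state summarizes the interaction
        return self.ssl_head(state), self.q_head(state)

model = TwoHeadRecModel(n_items=1000)
logits, q_values = model(torch.randint(0, 1000, (4, 10)))
print(logits.shape, q_values.shape)  # (4, 1000) (4, 1000)
```

In training, the self-supervised head would take a cross-entropy loss on the observed next item while the Q head takes a temporal-difference loss, with both gradients flowing into the shared encoder.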