Influencing Reinforcement Learning through Natural Language Guidance
- URL: http://arxiv.org/abs/2104.01506v1
- Date: Sun, 4 Apr 2021 00:23:39 GMT
- Title: Influencing Reinforcement Learning through Natural Language Guidance
- Authors: Tasmia Tasrin, Md Sultan AL Nahian, Habarakadage Perera and Brent Harrison
- Abstract summary: We explore how natural language advice can be used to provide a richer feedback signal to a reinforcement learning agent.
Usually, policy shaping employs a human feedback policy to help an agent learn more about how to achieve its goal.
In our case, we replace this human feedback policy with a policy generated from natural language advice.
- Score: 4.227540427595989
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interactive reinforcement learning agents use human feedback or instruction
to help them learn in complex environments. Often, this feedback comes in the
form of a discrete signal that is either positive or negative. While
informative, this information can be difficult to generalize on its own. In
this work, we explore how natural language advice can be used to provide a
richer feedback signal to a reinforcement learning agent by extending policy
shaping, a well-known interactive reinforcement learning technique. Usually,
policy shaping employs a human feedback policy to help an agent learn more
about how to achieve its goal. In our case, we replace this human feedback
policy with a policy generated from natural language advice. We aim to
examine whether the generated natural language reasoning helps a deep
reinforcement learning agent choose its actions successfully in any given
environment. To this end, we design our model with three networks: an
experience-driven network, an advice generator, and an advice-driven network.
The experience-driven reinforcement learning agent chooses its actions under
the influence of the environmental reward, while the advice-driven network
uses feedback produced by the advice generator for any new state to select
actions that assist the reinforcement learning agent in policy shaping.
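The abstract does not spell out how the experience-driven and advice-driven action choices are merged. As a rough illustration only, the sketch below uses the combination rule commonly associated with policy shaping: multiply the two action distributions elementwise and renormalize. The function names, the softmax over Q-values, and the example numbers are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def softmax(q_values, temperature=1.0):
    """Turn Q-values into an action distribution (hypothetical experience policy)."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def shape_policy(experience_probs, advice_probs):
    """Combine two action distributions by elementwise product, then renormalize."""
    combined = [p * a for p, a in zip(experience_probs, advice_probs)]
    total = sum(combined)
    if total == 0.0:
        # The two policies share no supported action: fall back to experience.
        return list(experience_probs)
    return [c / total for c in combined]


def sample_action(probs, rng=random):
    """Draw an action index according to the given distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical numbers: the Q-values cannot separate actions 0 and 1,
# while the advice-driven distribution favors action 1.
q_values = [1.0, 1.0, 0.0]
advice_probs = [0.1, 0.8, 0.1]

experience_probs = softmax(q_values)
shaped = shape_policy(experience_probs, advice_probs)
action = sample_action(shaped)
```

In this toy case the advice breaks the tie between actions 0 and 1, so the shaped distribution puts most of its mass on action 1. A multiplicative rule lets either source veto an action by assigning it near-zero probability, which is one reason to prefer it over simple averaging.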
Related papers
- Few-shot Dialogue Strategy Learning for Motivational Interviewing via Inductive Reasoning [21.078032718892498]
We consider the task of building a dialogue system that can motivate users to adopt positive lifestyle changes: Motivational Interviewing.
We propose DIIT, a framework that is capable of learning and applying conversation strategies in the form of natural language inductive rules from expert demonstrations.
arXiv Detail & Related papers (2024-03-23T06:03:37Z)
- LiFT: Unsupervised Reinforcement Learning with Foundation Models as Teachers [59.69716962256727]
We propose a framework that guides a reinforcement learning agent to acquire semantically meaningful behavior without human feedback.
In our framework, the agent receives task instructions grounded in a training environment from large language models.
We demonstrate that our method can learn semantically meaningful skills in a challenging open-ended MineDojo environment.
arXiv Detail & Related papers (2023-12-14T14:07:41Z)
- Is Feedback All You Need? Leveraging Natural Language Feedback in Goal-Conditioned Reinforcement Learning [54.31495290436766]
We extend BabyAI to automatically generate language feedback from environment dynamics and goal condition success.
We modify the Decision Transformer architecture to take advantage of this additional signal.
We find that training with language feedback either in place of or in addition to the return-to-go or goal descriptions improves agents' generalisation performance.
arXiv Detail & Related papers (2023-12-07T22:33:34Z)
- Teachable Reinforcement Learning via Advice Distillation [161.43457947665073]
We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher.
We show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms.
arXiv Detail & Related papers (2022-03-19T03:22:57Z)
- Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference [71.11416263370823]
We propose a generative inverse reinforcement learning approach for user behavioral preference modelling.
Our model can automatically learn the rewards from users' actions based on a discriminative actor-critic network and a Wasserstein GAN.
arXiv Detail & Related papers (2021-05-03T13:14:25Z)
- Generative Inverse Deep Reinforcement Learning for Online Recommendation [62.09946317831129]
We propose a novel inverse reinforcement learning approach, namely InvRec, for online recommendation.
InvRec automatically extracts the reward function from users' behaviors for online recommendation.
arXiv Detail & Related papers (2020-11-04T12:12:25Z)
- Knowledge-guided Deep Reinforcement Learning for Interactive Recommendation [49.32287384774351]
Interactive recommendation aims to learn from dynamic interactions between items and users to achieve responsiveness and accuracy.
We propose Knowledge-Guided deep Reinforcement learning to harness the advantages of both reinforcement learning and knowledge graphs for interactive recommendation.
arXiv Detail & Related papers (2020-04-17T05:26:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.