Reinforcement learning with human advice: a survey
- URL: http://arxiv.org/abs/2005.11016v2
- Date: Tue, 24 Nov 2020 09:02:59 GMT
- Title: Reinforcement learning with human advice: a survey
- Authors: Anis Najar and Mohamed Chetouani
- Abstract summary: We first propose a taxonomy of the different forms of advice that can be provided to a learning agent.
We then describe the methods that can be used for interpreting advice when its meaning is not determined beforehand.
- Score: 2.66512000865131
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we provide an overview of the existing methods for integrating
human advice into a Reinforcement Learning process. We first propose a taxonomy
of the different forms of advice that can be provided to a learning agent. We
then describe the methods that can be used for interpreting advice when its
meaning is not determined beforehand. Finally, we review different approaches
for integrating advice into the learning process.
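As one concrete illustration of the kind of integration method such a survey covers, the sketch below folds a human hint into tabular Q-learning via potential-based reward shaping. The chain environment, the advice_potential function, and all hyperparameters are illustrative assumptions, not the paper's own implementation.
```python
# Minimal sketch: integrating human advice into tabular Q-learning via
# potential-based reward shaping. The tiny chain environment, the advice
# potential, and all hyperparameters are illustrative assumptions.
import random

N_STATES, GOAL = 10, 9          # 1-D chain: move left/right, reach the goal
ACTIONS = [-1, +1]
GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1

def advice_potential(s):
    """Hypothetical human advice encoded as a state potential:
    'states closer to the goal are better'."""
    return s / GOAL

def step(s, a):
    s2 = min(max(s + a, 0), GOAL)
    r = 1.0 if s2 == GOAL else 0.0
    return s2, r, s2 == GOAL

Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(500):
    s, done = 0, False
    while not done:
        ai = random.randrange(2) if random.random() < EPS else max((0, 1), key=lambda i: Q[s][i])
        s2, r, done = step(s, ACTIONS[ai])
        # Potential-based shaping term derived from the advice signal.
        shaped = r + GAMMA * advice_potential(s2) - advice_potential(s)
        target = shaped + (0.0 if done else GAMMA * max(Q[s2]))
        Q[s][ai] += ALPHA * (target - Q[s][ai])
        s = s2

print("Greedy action per state:", [ACTIONS[max((0, 1), key=lambda i: Q[s][i])] for s in range(N_STATES)])
```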
Related papers
- Opinion-Guided Reinforcement Learning [0.46040036610482665]
We present a method to guide reinforcement learning agents through opinions.
We evaluate it with synthetic (oracle) and human advisors, at different levels of uncertainty.
Our results indicate that opinions, even if uncertain, improve the performance of reinforcement learning agents.
arXiv Detail & Related papers (2024-05-27T15:52:27Z)
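A minimal sketch of how an uncertain advisor opinion, as studied in the entry above, could be blended into an agent's action distribution, assuming the opinion is expressed as action probabilities with a confidence weight; the mixture rule and names below are illustrative, not the paper's actual formulation.
```python
# Minimal sketch: mixing an advisor's (possibly uncertain) opinion into an
# agent's action distribution. The confidence-weighted mixture is an
# illustrative assumption, not the paper's actual method.
import random

def mix_policy(agent_probs, opinion_probs, confidence):
    """Blend agent policy with advisor opinion; confidence in [0, 1]."""
    return [(1 - confidence) * p + confidence * q
            for p, q in zip(agent_probs, opinion_probs)]

def sample(probs):
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1

agent_probs = [0.25, 0.25, 0.25, 0.25]      # undecided agent, 4 actions
opinion_probs = [0.7, 0.1, 0.1, 0.1]        # advisor leans toward action 0
action = sample(mix_policy(agent_probs, opinion_probs, confidence=0.5))
print("sampled action:", action)
```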
- LiFT: Unsupervised Reinforcement Learning with Foundation Models as Teachers [59.69716962256727]
We propose a framework that guides a reinforcement learning agent to acquire semantically meaningful behavior without human feedback.
In our framework, the agent receives task instructions grounded in a training environment from large language models.
We demonstrate that our method can learn semantically meaningful skills in a challenging open-ended MineDojo environment.
arXiv Detail & Related papers (2023-12-14T14:07:41Z)
- Advice Conformance Verification by Reinforcement Learning agents for Human-in-the-Loop [17.042179951736262]
We study two scenarios, one with good and one with bad advice, in MuJoCo's Humanoid environment.
We show that our method can provide an interpretable means of solving the Advice-Conformance Verification problem.
arXiv Detail & Related papers (2022-10-07T10:56:28Z)
- Teachable Reinforcement Learning via Advice Distillation [161.43457947665073]
We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher.
We show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms.
arXiv Detail & Related papers (2022-03-19T03:22:57Z)
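A toy sketch of the distillation idea behind the entry above, assuming advice arrives as per-state action suggestions: first act with the teacher's advice and log it, then imitate the logged advice so the agent no longer needs the teacher. The teacher_advice function and the majority-vote imitation step are stand-ins, not the paper's method.
```python
# Minimal sketch: a toy 'advice distillation' loop. Phase 1: act with the
# teacher's structured advice and log (state, advised action) pairs.
# Phase 2: distill the logged advice into a standalone policy by simple
# majority-vote imitation. The teacher, states, and actions are assumptions.
from collections import Counter, defaultdict
import random

def teacher_advice(state):
    """Hypothetical teacher: advises 'right' in even states, 'left' otherwise."""
    return "right" if state % 2 == 0 else "left"

# Phase 1: advice-following rollouts.
logged = []
for _ in range(200):
    state = random.randrange(10)
    logged.append((state, teacher_advice(state)))

# Phase 2: distill into an advice-free policy (majority vote per state).
votes = defaultdict(Counter)
for state, action in logged:
    votes[state][action] += 1
distilled_policy = {s: c.most_common(1)[0][0] for s, c in votes.items()}

print(distilled_policy)   # the agent can now act without querying the teacher
```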
- Measuring "Why" in Recommender Systems: a Comprehensive Survey on the Evaluation of Explainable Recommendation [87.82664566721917]
This survey is based on more than 100 papers from top-tier conferences like IJCAI, AAAI, TheWebConf, Recsys, UMAP, and IUI.
arXiv Detail & Related papers (2022-02-14T02:58:55Z)
- Action Advising with Advice Imitation in Deep Reinforcement Learning [0.5185131234265025]
Action advising is a peer-to-peer knowledge exchange technique built on the teacher-student paradigm.
We present an approach that enables the student agent to imitate previously acquired advice and reuse it directly in its exploration policy.
arXiv Detail & Related papers (2021-04-17T04:24:04Z)
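A minimal sketch of advice reuse during exploration, as described in the entry above, assuming an exact state-matching rule and a fixed advice budget; the teacher function and constants are hypothetical, not the paper's algorithm.
```python
# Minimal sketch: reusing previously collected action advice during
# exploration. The advice budget, the exact-match reuse rule, and the
# teacher policy are illustrative assumptions, not the paper's method.
import random

ADVICE_BUDGET = 5
advice_memory = {}          # state -> advised action

def teacher(state):
    return state % 4        # hypothetical teacher policy

def explore_action(state, n_actions=4):
    # 1) Ask the teacher while budget remains and the state is new.
    if len(advice_memory) < ADVICE_BUDGET and state not in advice_memory:
        advice_memory[state] = teacher(state)
        return advice_memory[state]
    # 2) Reuse (imitate) earlier advice if this state was advised before.
    if state in advice_memory:
        return advice_memory[state]
    # 3) Otherwise fall back to random exploration.
    return random.randrange(n_actions)

for t in range(30):
    print(t, explore_action(random.randrange(10)))
```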
- KnowledgeCheckR: Intelligent Techniques for Counteracting Forgetting [52.623349754076024]
We provide an overview of the recommendation approaches integrated in KnowledgeCheckR.
Examples thereof are utility-based recommendation that helps to identify learning contents to be repeated in the future, collaborative filtering approaches that help to implement session-based recommendation, and content-based recommendation that supports intelligent question answering.
arXiv Detail & Related papers (2021-02-15T20:06:28Z)
- Human Engagement Providing Evaluative and Informative Advice for Interactive Reinforcement Learning [2.5799044614524664]
This work focuses on which of two approaches, evaluative or informative advice, humans prefer when instructing a learning agent.
Results show users giving informative advice provide more accurate advice, are willing to assist the learner agent for a longer time, and provide more advice per episode.
arXiv Detail & Related papers (2020-09-21T02:14:02Z)
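For readers unfamiliar with the terminology used in the entry above, the sketch below contrasts the two advice channels in the simplest possible form: evaluative advice scores the agent's own action, while informative advice supplies an action directly. Both toy interfaces are assumptions made here for illustration.
```python
# Minimal sketch of the two advice channels compared in the study:
# 'evaluative' advice scores the agent's own choice, while 'informative'
# advice directly suggests an action. Both interfaces are assumptions.
def evaluative_advice(agent_action, preferred_action):
    """Human rates the agent's action: +1 if it matches their preference."""
    return 1.0 if agent_action == preferred_action else -1.0

def informative_advice(preferred_action):
    """Human tells the agent which action to take."""
    return preferred_action

agent_action, preferred = 2, 3
print("evaluative:", evaluative_advice(agent_action, preferred))   # -1.0
print("informative:", informative_advice(preferred))               # 3
```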
- Knowledge Transfer via Pre-training for Recommendation: A Review and Prospect [89.91745908462417]
We show the benefits of pre-training to recommender systems through experiments.
We discuss several promising directions for future research for recommender systems with pre-training.
arXiv Detail & Related papers (2020-09-19T13:06:27Z)
- Reward-Conditioned Policies [100.64167842905069]
Imitation learning requires near-optimal expert data.
Can we learn effective policies via supervised learning without demonstrations?
We show how such an approach can be derived as a principled method for policy search.
arXiv Detail & Related papers (2019-12-31T18:07:43Z)
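A toy sketch of the reward-conditioned idea from the entry above, assuming a one-step task and a count-based stand-in for the learned model: fit p(action | state, target return) on the agent's own rollouts, then ask for a high target return at evaluation time. Everything below is illustrative, not the paper's implementation.
```python
# Minimal sketch of reward-conditioned supervised policy learning:
# fit p(action | state, target_return) from the agent's own (suboptimal)
# rollouts, then condition on a high return at evaluation time. The
# one-step task, return binning, and majority-vote 'model' are assumptions.
from collections import Counter, defaultdict
import random

def rollout():
    """One-step toy task: action 3 yields high reward, others low."""
    state = random.randrange(5)
    action = random.randrange(4)
    ret = 1.0 if action == 3 else random.random() * 0.3
    return state, action, ret

# Collect self-generated experience (no expert demonstrations needed).
data = [rollout() for _ in range(2000)]

# 'Train' p(a | s, return-bin) by counting, a stand-in for supervised learning.
model = defaultdict(Counter)
for s, a, ret in data:
    model[(s, round(ret, 1))][a] += 1

def policy(state, target_return):
    counts = model.get((state, round(target_return, 1)))
    return counts.most_common(1)[0][0] if counts else random.randrange(4)

# Condition on a high target return to extract near-optimal behavior.
print([policy(s, target_return=1.0) for s in range(5)])   # mostly action 3
```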
This list is automatically generated from the titles and abstracts of the papers on this site.