Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
- URL: http://arxiv.org/abs/2002.03647v4
- Date: Mon, 3 Aug 2020 11:06:21 GMT
- Title: Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
- Authors: Víctor Campos, Alexander Trott, Caiming Xiong, Richard Socher, Xavier Giro-i-Nieto, Jordi Torres
- Abstract summary: 'Explore, Discover and Learn' (EDL) is an alternative approach to information-theoretic skill discovery.
We show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Acquiring abilities in the absence of a task-oriented reward function is at
the frontier of reinforcement learning research. This problem has been studied
through the lens of empowerment, which draws a connection between option
discovery and information theory. Information-theoretic skill discovery methods
have garnered much interest from the community, but little research has been
conducted in understanding their limitations. Through theoretical analysis and
empirical evidence, we show that existing algorithms suffer from a common
limitation -- they discover options that provide a poor coverage of the state
space. In light of this, we propose 'Explore, Discover and Learn' (EDL), an
alternative approach to information-theoretic skill discovery. Crucially, EDL
optimizes the same information-theoretic objective derived from the empowerment
literature, but addresses the optimization problem using different machinery.
We perform an extensive evaluation of skill discovery methods on controlled
environments and show that EDL offers significant advantages, such as
overcoming the coverage problem, reducing the dependence of learned skills on
the initial state, and allowing the user to define a prior over which behaviors
should be learned. Code is publicly available at
https://github.com/victorcampos7/edl.
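The objective shared by these methods is the mutual information I(S; Z) between visited states and latent skills. EDL splits its optimization into separate exploration, discovery, and learning stages, so that the final stage reduces to standard RL on a fixed intrinsic reward of the form log q(s|z). As a rough illustration only (this is not the released implementation linked above), the sketch below assumes the discovery stage has produced one decoder mean per discrete skill and takes q(s|z) to be a unit-variance Gaussian; the codebook values, the 2-D states, and the edl_reward helper are all hypothetical.

```python
# Hedged sketch of the skill-learning stage's intrinsic reward,
# assuming a unit-variance Gaussian decoder q(s|z) whose mean is a
# per-skill codebook vector (e.g. from a VQ-VAE); values are toy.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical output of the discovery stage: one mean per skill.
codebook = np.array([[ 1.0,  0.0],
                     [ 0.0,  1.0],
                     [-1.0,  0.0],
                     [ 0.0, -1.0]])

def edl_reward(state: np.ndarray, z: int) -> float:
    """r(s, z) = log q(s | z) up to an additive constant, for a
    unit-variance Gaussian decoder centered at codebook[z]."""
    diff = state - codebook[z]
    return -0.5 * float(diff @ diff)

# The learning stage hands this fixed reward to any RL algorithm;
# here we just evaluate it on a random state for a sampled skill.
z = int(rng.integers(len(codebook)))
state = rng.normal(size=2)  # stand-in for an environment state
print(f"skill {z}: reward {edl_reward(state, z):.3f}")
```

Because the decoder is fit to exploration data and then frozen, the reward does not co-adapt with the policy during skill learning, in contrast to discriminator-based methods that train reward and policy jointly.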
Related papers
- Can Learned Optimization Make Reinforcement Learning Less Difficult? [70.5036361852812]
We consider whether learned optimization can help overcome reinforcement learning difficulties.
Our method, Learned Optimization for Plasticity, Exploration and Non-stationarity (OPEN), meta-learns an update rule whose input features and output structure are informed by previously proposed solutions to these difficulties.
arXiv Detail & Related papers (2024-07-09T17:55:23Z)
- Collaborative Knowledge Infusion for Low-resource Stance Detection [83.88515573352795]
Target-related knowledge is often needed to assist stance detection models.
We propose a collaborative knowledge infusion approach for low-resource stance detection tasks.
arXiv Detail & Related papers (2024-03-28T08:32:14Z)
- A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
- A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning [76.47138162283714]
Forgetting refers to the loss or deterioration of previously acquired information or knowledge.
Beyond continual learning, forgetting is a prevalent phenomenon in various other research domains within deep learning.
The survey argues that forgetting is a double-edged sword that can be beneficial and desirable in certain cases.
arXiv Detail & Related papers (2023-07-16T16:27:58Z)
- Explainability in reinforcement learning: perspective and position [1.299941371793082]
This paper attempts to give a systematic overview of existing methods in the explainable RL area.
It proposes a novel unified taxonomy, building and expanding on the existing ones.
arXiv Detail & Related papers (2022-03-22T09:00:13Z)
- MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty-aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution.
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.