Sample-Efficient Reinforcement Learning in the Presence of Exogenous
Information
- URL: http://arxiv.org/abs/2206.04282v1
- Date: Thu, 9 Jun 2022 05:19:32 GMT
- Title: Sample-Efficient Reinforcement Learning in the Presence of Exogenous
Information
- Authors: Yonathan Efroni, Dylan J. Foster, Dipendra Misra, Akshay Krishnamurthy
and John Langford
- Abstract summary: In real-world reinforcement learning applications the learner's observation space is ubiquitously high-dimensional with both relevant and irrelevant information about the task at hand.
We introduce a new problem setting for reinforcement learning, the Exogenous Markov Decision Process (ExoMDP), in which the state space admits an (unknown) factorization into a small controllable component and a large irrelevant component.
We provide a new algorithm, ExoRL, which learns a near-optimal policy with sample complexity polynomial in the size of the endogenous component.
- Score: 77.19830787312743
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In real-world reinforcement learning applications the learner's observation
space is ubiquitously high-dimensional with both relevant and irrelevant
information about the task at hand. Learning from high-dimensional observations
has been the subject of extensive investigation in supervised learning and
statistics (e.g., via sparsity), but analogous issues in reinforcement learning
are not well understood, even in finite state/action (tabular) domains. We
introduce a new problem setting for reinforcement learning, the Exogenous
Markov Decision Process (ExoMDP), in which the state space admits an (unknown)
factorization into a small controllable (or, endogenous) component and a large
irrelevant (or, exogenous) component; the exogenous component is independent of
the learner's actions, but evolves in an arbitrary, temporally correlated
fashion. We provide a new algorithm, ExoRL, which learns a near-optimal policy
with sample complexity polynomial in the size of the endogenous component and
nearly independent of the size of the exogenous component, thereby offering a
doubly-exponential improvement over off-the-shelf algorithms. Our results
highlight for the first time that sample-efficient reinforcement learning is
possible in the presence of exogenous information, and provide a simple,
user-friendly benchmark for investigation going forward.
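As a toy illustration of the ExoMDP factorization described in the abstract (a sketch only, not ExoRL or any algorithm from the paper; the sizes and transition matrices below are hypothetical), the key property is that the exogenous component transitions without reference to the agent's action:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: small endogenous component, larger exogenous component.
N_ENDO, N_EXO, N_ACT = 3, 50, 2

# Endogenous transitions depend on the action; exogenous transitions do not.
P_endo = rng.dirichlet(np.ones(N_ENDO), size=(N_ENDO, N_ACT))  # P_endo[s, a]: dist over next s
P_exo = rng.dirichlet(np.ones(N_EXO), size=N_EXO)              # P_exo[x]: dist over next x

def step(state, action):
    """One ExoMDP transition over the factored state (endogenous, exogenous)."""
    s, x = state
    s_next = rng.choice(N_ENDO, p=P_endo[s, action])  # controllable part
    x_next = rng.choice(N_EXO, p=P_exo[x])            # independent of the action,
                                                      # but temporally correlated via x
    return (s_next, x_next)

state = (0, 0)
for t in range(5):
    state = step(state, action=t % N_ACT)
```

The learner observes the full pair but does not know the factorization; the paper's sample-complexity claim is that a near-optimal policy can be found with cost scaling in `N_ENDO`, nearly independent of `N_EXO`.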
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; and (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
- Unsupervised Spatial-Temporal Feature Enrichment and Fidelity Preservation Network for Skeleton based Action Recognition [20.07820929037547]
Unsupervised skeleton-based action recognition has achieved remarkable progress recently.
Existing unsupervised learning methods suffer from a severe overfitting problem.
This paper presents an Unsupervised spatial-temporal Feature Enrichment and Fidelity Preservation framework to generate rich distributed features.
arXiv Detail & Related papers (2024-01-25T09:24:07Z)
- Balancing Explainability-Accuracy of Complex Models [8.402048778245165]
We introduce a new approach for complex models based on correlation impact.
We propose approaches for both scenarios of independent features and dependent features.
We provide an upper bound of the complexity of our proposed approach for the dependent features.
arXiv Detail & Related papers (2023-05-23T14:20:38Z)
- A Survey of Learning on Small Data: Generalization, Optimization, and Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI.
This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data.
Multiple data applications that may benefit from efficient small data representation are surveyed.
arXiv Detail & Related papers (2022-07-29T02:34:19Z)
- The Distributed Information Bottleneck reveals the explanatory structure of complex systems [1.52292571922932]
The Information Bottleneck (IB) is an information-theoretic framework for understanding the relationship between an input and an output.
We show that a crucial modification -- distributing bottlenecks across multiple components of the input -- opens fundamentally new avenues for interpretable deep learning in science.
We demonstrate the Distributed IB's explanatory utility in systems drawn from applied mathematics and condensed matter physics.
arXiv Detail & Related papers (2022-04-15T17:59:35Z)
- Provable Reinforcement Learning with a Short-Term Memory [68.00677878812908]
We study a new subclass of POMDPs, whose latent states can be decoded from the most recent history of a short length $m$.
In particular, in the rich-observation setting, we develop new algorithms using a novel "moment matching" approach with a sample complexity that scales exponentially in the short length $m$ rather than in the problem horizon.
Our results show that a short-term memory suffices for reinforcement learning in these environments.
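The core idea of the entry above, that the last $m$ observations act as a Markov state, can be sketched as a thin wrapper handing a hashable "mega-state" to a tabular RL algorithm (a hypothetical illustration, not the paper's moment-matching method; the class and window length below are invented for this sketch):

```python
from collections import deque

M = 3  # hypothetical window length: latent state assumed decodable from last M observations

class ShortMemoryWrapper:
    """Treat the most recent M observations as a single Markov state."""
    def __init__(self, m):
        self.m = m
        self.history = deque(maxlen=m)  # old observations fall off automatically

    def observe(self, obs):
        self.history.append(obs)
        return tuple(self.history)  # hashable, so a tabular method can index it

w = ShortMemoryWrapper(M)
states = [w.observe(o) for o in ["a", "b", "c", "d"]]
# After four observations the window holds only the last three.
```

The cost of this construction is that the number of mega-states grows exponentially in `M`, which is why the sample complexity above scales in $m$ rather than the horizon.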
arXiv Detail & Related papers (2022-02-08T16:39:57Z)
- A Survey on Extraction of Causal Relations from Natural Language Text [9.317718453037667]
Cause-effect relations appear frequently in text, and curating cause-effect relations from text helps in building causal networks for predictive tasks.
Existing causality extraction techniques include knowledge-based, statistical machine learning (ML)-based, and deep learning-based approaches.
arXiv Detail & Related papers (2021-01-16T10:49:39Z)
- Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning [96.78504087416654]
Motivated by the prevailing paradigm of using unsupervised learning for efficient exploration in reinforcement learning (RL) problems, we investigate when this paradigm is provably efficient.
We present a general algorithmic framework built upon two components: an unsupervised learning algorithm and a no-regret tabular RL algorithm.
arXiv Detail & Related papers (2020-03-15T19:23:59Z)
- Multilinear Compressive Learning with Prior Knowledge [106.12874293597754]
The Multilinear Compressive Learning (MCL) framework combines Multilinear Compressive Sensing and Machine Learning into an end-to-end system.
The key idea behind MCL is the assumption of the existence of a tensor subspace that can capture the essential features of the signal for the downstream learning task.
In this paper, we propose a novel solution to address both of the aforementioned requirements, i.e., how to find those tensor subspaces in which the signals of interest are highly separable.
arXiv Detail & Related papers (2020-02-17T19:06:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.