Sample-Efficient Reinforcement Learning in the Presence of Exogenous
Information
- URL: http://arxiv.org/abs/2206.04282v1
- Date: Thu, 9 Jun 2022 05:19:32 GMT
- Title: Sample-Efficient Reinforcement Learning in the Presence of Exogenous
Information
- Authors: Yonathan Efroni, Dylan J. Foster, Dipendra Misra, Akshay Krishnamurthy
and John Langford
- Abstract summary: In real-world reinforcement learning applications the learner's observation space is ubiquitously high-dimensional with both relevant and irrelevant information about the task at hand.
We introduce a new problem setting for reinforcement learning, the Exogenous Markov Decision Process (ExoMDP), in which the state space admits an (unknown) factorization into a small controllable component and a large irrelevant component.
We provide a new algorithm, ExoRL, which learns a near-optimal policy with sample complexity polynomial in the size of the endogenous component.
- Score: 77.19830787312743
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In real-world reinforcement learning applications the learner's observation
space is ubiquitously high-dimensional with both relevant and irrelevant
information about the task at hand. Learning from high-dimensional observations
has been the subject of extensive investigation in supervised learning and
statistics (e.g., via sparsity), but analogous issues in reinforcement learning
are not well understood, even in finite state/action (tabular) domains. We
introduce a new problem setting for reinforcement learning, the Exogenous
Markov Decision Process (ExoMDP), in which the state space admits an (unknown)
factorization into a small controllable (or, endogenous) component and a large
irrelevant (or, exogenous) component; the exogenous component is independent of
the learner's actions, but evolves in an arbitrary, temporally correlated
fashion. We provide a new algorithm, ExoRL, which learns a near-optimal policy
with sample complexity polynomial in the size of the endogenous component and
nearly independent of the size of the exogenous component, thereby offering a
doubly-exponential improvement over off-the-shelf algorithms. Our results
highlight for the first time that sample-efficient reinforcement learning is
possible in the presence of exogenous information, and provide a simple,
user-friendly benchmark for investigation going forward.
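As a toy illustration of the ExoMDP factorization described in the abstract (a sketch only, not ExoRL or any algorithm from the paper; the sizes and transition matrices below are hypothetical), the key property is that the exogenous component transitions without reference to the agent's action:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: small endogenous component, larger exogenous component.
N_ENDO, N_EXO, N_ACT = 3, 50, 2

# Endogenous transitions depend on the action; exogenous transitions do not.
P_endo = rng.dirichlet(np.ones(N_ENDO), size=(N_ENDO, N_ACT))  # P_endo[s, a]: dist over next s
P_exo = rng.dirichlet(np.ones(N_EXO), size=N_EXO)              # P_exo[x]: dist over next x

def step(state, action):
    """One ExoMDP transition over the factored state (endogenous, exogenous)."""
    s, x = state
    s_next = rng.choice(N_ENDO, p=P_endo[s, action])  # controllable part
    x_next = rng.choice(N_EXO, p=P_exo[x])            # independent of the action,
                                                      # but temporally correlated via x
    return (s_next, x_next)

state = (0, 0)
for t in range(5):
    state = step(state, action=t % N_ACT)
```

The learner observes the full pair but does not know the factorization; the paper's sample-complexity claim is that a near-optimal policy can be found with cost scaling in `N_ENDO`, nearly independent of `N_EXO`.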
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; and (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
- Unsupervised Spatial-Temporal Feature Enrichment and Fidelity Preservation Network for Skeleton based Action Recognition [20.07820929037547]
Unsupervised skeleton-based action recognition has achieved remarkable progress recently.
Existing unsupervised learning methods suffer from a severe overfitting problem.
This paper presents an Unsupervised spatial-temporal Feature Enrichment and Fidelity Preservation framework to generate rich distributed features.
arXiv Detail & Related papers (2024-01-25T09:24:07Z)
- Balancing Explainability-Accuracy of Complex Models [8.402048778245165]
We introduce a new approach for complex models based on correlation impact.
We propose approaches for both scenarios of independent features and dependent features.
We provide an upper bound of the complexity of our proposed approach for the dependent features.
arXiv Detail & Related papers (2023-05-23T14:20:38Z)
- A Survey of Learning on Small Data: Generalization, Optimization, and Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI.
This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data.
Multiple data applications that may benefit from efficient small data representation are surveyed.
arXiv Detail & Related papers (2022-07-29T02:34:19Z)
- The Distributed Information Bottleneck reveals the explanatory structure of complex systems [1.52292571922932]
The Information Bottleneck (IB) is an information-theoretic framework for understanding the relationship between an input and an output.
We show that a crucial modification -- distributing bottlenecks across multiple components of the input -- opens fundamentally new avenues for interpretable deep learning in science.
We demonstrate the Distributed IB's explanatory utility in systems drawn from applied mathematics and condensed matter physics.
arXiv Detail & Related papers (2022-04-15T17:59:35Z)
- Provable Reinforcement Learning with a Short-Term Memory [68.00677878812908]
We study a new subclass of POMDPs, whose latent states can be decoded from the most recent history of a short length $m$.
In particular, in the rich-observation setting, we develop new algorithms using a novel "moment matching" approach with a sample complexity that scales exponentially in the short length $m$ rather than in the problem horizon.
Our results show that a short-term memory suffices for reinforcement learning in these environments.
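The core idea of the entry above, that the last $m$ observations act as a Markov state, can be sketched as a thin wrapper handing a hashable "mega-state" to a tabular RL algorithm (a hypothetical illustration, not the paper's moment-matching method; the class and window length below are invented for this sketch):

```python
from collections import deque

M = 3  # hypothetical window length: latent state assumed decodable from last M observations

class ShortMemoryWrapper:
    """Treat the most recent M observations as a single Markov state."""
    def __init__(self, m):
        self.m = m
        self.history = deque(maxlen=m)  # old observations fall off automatically

    def observe(self, obs):
        self.history.append(obs)
        return tuple(self.history)  # hashable, so a tabular method can index it

w = ShortMemoryWrapper(M)
states = [w.observe(o) for o in ["a", "b", "c", "d"]]
# After four observations the window holds only the last three.
```

The cost of this construction is that the number of mega-states grows exponentially in `M`, which is why the sample complexity above scales in $m$ rather than the horizon.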
arXiv Detail & Related papers (2022-02-08T16:39:57Z)
- A Survey on Extraction of Causal Relations from Natural Language Text [9.317718453037667]
Cause-effect relations appear frequently in text, and curating cause-effect relations from text helps in building causal networks for predictive tasks.
Existing causality extraction techniques include knowledge-based, statistical machine learning (ML)-based, and deep learning-based approaches.
arXiv Detail & Related papers (2021-01-16T10:49:39Z)
- Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning [96.78504087416654]
Motivated by the prevailing paradigm of using unsupervised learning for efficient exploration in reinforcement learning (RL) problems, we investigate when this paradigm is provably efficient.
We present a general algorithmic framework built upon two components: an unsupervised learning algorithm and a no-regret tabular RL algorithm.
arXiv Detail & Related papers (2020-03-15T19:23:59Z)
- Multilinear Compressive Learning with Prior Knowledge [106.12874293597754]
The Multilinear Compressive Learning (MCL) framework combines Multilinear Compressive Sensing and Machine Learning into an end-to-end system.
The key idea behind MCL is the assumption of the existence of a tensor subspace that can capture the essential features of the signal for the downstream learning task.
In this paper, we propose a novel solution to address both of the aforementioned requirements, i.e., how to find those tensor subspaces in which the signals of interest are highly separable.
arXiv Detail & Related papers (2020-02-17T19:06:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.