Generalizing Multi-Step Inverse Models for Representation Learning to Finite-Memory POMDPs
- URL: http://arxiv.org/abs/2404.14552v1
- Date: Mon, 22 Apr 2024 19:46:16 GMT
- Title: Generalizing Multi-Step Inverse Models for Representation Learning to Finite-Memory POMDPs
- Authors: Lili Wu, Ben Evans, Riashat Islam, Raihan Seraj, Yonathan Efroni, Alex Lamb
- Abstract summary: We study the problem of discovering an informative, or agent-centric, state representation that encodes only the relevant information while discarding the irrelevant.
Our results include theory in the deterministic dynamics setting as well as counter-examples for alternative intuitive algorithms.
We show that past actions can be a double-edged sword: making the algorithms more successful when used correctly and causing dramatic failure when used incorrectly.
- Score: 23.584313644411967
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Discovering an informative, or agent-centric, state representation that encodes only the relevant information while discarding the irrelevant is a key challenge towards scaling reinforcement learning algorithms and efficiently applying them to downstream tasks. Prior works studied this problem in high-dimensional Markovian environments, when the current observation may be a complex object but is sufficient to decode the informative state. In this work, we consider the problem of discovering the agent-centric state in the more challenging high-dimensional non-Markovian setting, when the state can be decoded from a sequence of past observations. We establish that generalized inverse models can be adapted for learning agent-centric state representation for this task. Our results include asymptotic theory in the deterministic dynamics setting as well as counter-examples for alternative intuitive algorithms. We complement these findings with a thorough empirical study on the agent-centric state discovery abilities of the different alternatives we put forward. Particularly notable is our analysis of past actions, where we show that these can be a double-edged sword: making the algorithms more successful when used correctly and causing dramatic failure when used incorrectly.
Related papers
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
- Predictive Coding beyond Correlations [59.47245250412873]
We show how one such algorithm, called predictive coding, is able to perform causal inference tasks.
First, we show how a simple change in the inference process of predictive coding makes it possible to compute interventions without the need to mutilate or redefine a causal graph.
arXiv Detail & Related papers (2023-06-27T13:57:16Z)
- Self-Supervised Likelihood Estimation with Energy Guidance for Anomaly Segmentation in Urban Scenes [42.66864386405585]
We design an energy-guided self-supervised framework for anomaly segmentation.
We exploit the strong context-dependent nature of the segmentation task.
Based on the proposed estimators, we devise an adaptive self-supervised training framework.
arXiv Detail & Related papers (2023-02-14T03:54:32Z)
- Framing Algorithmic Recourse for Anomaly Detection [18.347886926848563]
We present an approach, Context preserving Algorithmic Recourse for Anomalies in Tabular data (CARAT).
CARAT uses a transformer-based encoder-decoder model to explain an anomaly by finding features with low likelihood.
Semantically coherent counterfactuals are generated by modifying the highlighted features, using the overall context of features in the anomalous instance(s).
arXiv Detail & Related papers (2022-06-29T03:30:51Z)
- Masked prediction tasks: a parameter identifiability view [49.533046139235466]
We focus on the widely used self-supervised learning method of predicting masked tokens.
We show that there is a rich landscape of possibilities, out of which some prediction tasks yield identifiability, while others do not.
arXiv Detail & Related papers (2022-02-18T17:09:32Z)
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
- Unsupervised Disentanglement without Autoencoding: Pitfalls and Future Directions [21.035001142156464]
Disentangled visual representations have largely been studied with generative models such as Variational AutoEncoders (VAEs).
We explore regularization methods with contrastive learning, which could result in disentangled representations powerful enough for large scale datasets and downstream applications.
We evaluate disentanglement with downstream tasks, analyze the benefits and disadvantages of each regularization used, and discuss future directions.
arXiv Detail & Related papers (2021-08-14T21:06:42Z)
- Sequential Transfer in Reinforcement Learning with a Generative Model [48.40219742217783]
We show how to reduce the sample complexity for learning new tasks by transferring knowledge from previously-solved ones.
We derive PAC bounds on its sample complexity which clearly demonstrate the benefits of using this kind of prior knowledge.
We empirically verify our theoretical findings in simple simulated domains.
arXiv Detail & Related papers (2020-07-01T19:53:35Z)
- Active Model Estimation in Markov Decision Processes [108.46146218973189]
We study the problem of efficient exploration in order to learn an accurate model of an environment, modeled as a Markov decision process (MDP).
We show that our Markov-based algorithm outperforms both our original algorithm and the maximum entropy algorithm in the small sample regime.
arXiv Detail & Related papers (2020-03-06T16:17:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.