The Partially Observable History Process
- URL: http://arxiv.org/abs/2111.08102v1
- Date: Mon, 15 Nov 2021 22:00:14 GMT
- Title: The Partially Observable History Process
- Authors: Dustin Morrill, Amy R. Greenwald, Michael Bowling
- Abstract summary: We introduce the partially observable history process (POHP) formalism for reinforcement learning.
POHP centers around the actions and observations of a single agent and abstracts away the presence of other players.
Our formalism provides a streamlined interface for designing algorithms that defy categorization as exclusively single or multi-agent.
- Score: 17.08883385550155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce the partially observable history process (POHP) formalism for
reinforcement learning. POHP centers around the actions and observations of a
single agent and abstracts away the presence of other players without reducing
them to stochastic processes. Our formalism provides a streamlined interface
for designing algorithms that defy categorization as exclusively single or
multi-agent, and for developing theory that applies across these domains. We
show how the POHP formalism unifies traditional models including the Markov
decision process, the Markov game, the extensive-form game, and their partially
observable extensions, without introducing burdensome technical machinery or
violating the philosophical underpinnings of reinforcement learning. We
illustrate the utility of our formalism by concisely exploring observable
sequential rationality, re-deriving the extensive-form regret minimization
(EFR) algorithm, and examining EFR's theoretical properties in greater
generality.
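To make the abstract's interface concrete, below is a minimal Python sketch of a POHP-style agent loop, based only on the description above: the agent alternates actions and observations, and everything else (the environment, chance, other players) sits behind the observation stream. All names here (History, Policy, rollout, env_step) are illustrative assumptions for this summary, not the paper's notation.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Action = str
Observation = str


@dataclass(frozen=True)
class History:
    """The agent's own action-observation sequence.

    In a POHP, this sequence is all the agent has access to: other
    players are hidden behind the observation stream rather than
    reduced to an explicit stochastic process.
    """
    events: Tuple[Tuple[Action, Observation], ...] = ()

    def extend(self, a: Action, o: Observation) -> "History":
        return History(self.events + ((a, o),))


# A policy maps the agent's history to a distribution over actions.
Policy = Callable[[History], Dict[Action, float]]


def rollout(policy: Policy,
            env_step: Callable[[History, Action], Observation],
            horizon: int) -> History:
    """Interact for `horizon` steps.

    `env_step` is a stand-in for everything outside the agent; the
    agent-side interface never distinguishes environment dynamics
    from other players' choices.
    """
    h = History()
    for _ in range(horizon):
        dist = policy(h)
        a = random.choices(list(dist), weights=list(dist.values()))[0]
        o = env_step(h, a)
        h = h.extend(a, o)
    return h
```

Under this view, a single-agent (PO)MDP is the special case where env_step is a fixed stochastic process, while a Markov or extensive-form game is the case where observations also reflect other agents' unmodeled behavior; the agent-side interface is identical in both, which is the unification the abstract claims.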
Related papers
- Sparks of Explainability: Recent Advancements in Explaining Large Vision Models [6.1642231492615345]
This thesis explores advanced approaches to improve explainability in computer vision by analyzing and modeling the features exploited by deep neural networks.
It evaluates attribution methods, notably saliency maps, by introducing a metric based on algorithmic stability and an approach utilizing Sobol indices.
Two hypotheses are examined: aligning models with human reasoning and adopting a conceptual explainability approach.
arXiv Detail & Related papers (2025-02-03T04:49:32Z)
- BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning [78.63421517563056]
Large Language Models (LLMs) have demonstrated remarkable capabilities in complex reasoning tasks.
We present a unified probabilistic framework that formalizes LLM reasoning through a novel graphical model.
We introduce the Bootstrapping Reinforced Thinking Process (BRiTE) algorithm, which works in two steps.
arXiv Detail & Related papers (2025-01-31T02:39:07Z)
- Axiomatic Causal Interventions for Reverse Engineering Relevance Computation in Neural Retrieval Models [20.29451537633895]
We propose the use of causal interventions to reverse engineer neural rankers.
We demonstrate how mechanistic interpretability methods can be used to isolate components satisfying term-frequency axioms.
arXiv Detail & Related papers (2024-05-03T22:30:15Z)
- Bridging State and History Representations: Understanding Self-Predictive RL [24.772140132462468]
Representations are at the core of all deep reinforcement learning (RL) methods for Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs).
We show that many of these seemingly distinct methods and frameworks for state and history abstractions are, in fact, based on a common idea of self-predictive abstraction.
We provide theoretical insights into the widely adopted objectives and optimization, such as the stop-gradient technique, in learning self-predictive representations.
arXiv Detail & Related papers (2024-01-17T00:47:43Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Understanding Masked Autoencoders via Hierarchical Latent Variable Models [109.35382136147349]
Masked autoencoder (MAE) has recently achieved prominent success in a variety of vision tasks.
Despite the emergence of intriguing empirical observations on MAE, a theoretically principled understanding is still lacking.
arXiv Detail & Related papers (2023-06-08T03:00:10Z)
- Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z)
- A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning [55.048010996144036]
We show that under some noise assumption, we can obtain the linear spectral feature of its corresponding Markov transition operator in closed-form for free.
We propose Spectral Dynamics Embedding (SPEDE), which breaks the trade-off and completes optimistic exploration for representation learning by exploiting the structure of the noise.
arXiv Detail & Related papers (2021-11-22T19:24:57Z)
- This looks more like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation [17.485732906337507]
We present a case study of the self-explaining network, ProtoPNet, in the presence of a spectrum of artifacts.
We introduce a novel method for generating more precise model-aware explanations.
In order to obtain a clean dataset, we propose to use multi-view clustering strategies for segregating the artifact images.
arXiv Detail & Related papers (2021-08-27T09:55:53Z)
- On Contrastive Representations of Stochastic Processes [53.21653429290478]
Learning representations of stochastic processes is an emerging problem in machine learning.
We show that our methods are effective for learning representations of periodic functions, 3D objects and dynamical processes.
arXiv Detail & Related papers (2021-06-18T11:00:24Z)