Empirically Verifying Hypotheses Using Reinforcement Learning
- URL: http://arxiv.org/abs/2006.15762v1
- Date: Mon, 29 Jun 2020 01:01:10 GMT
- Title: Empirically Verifying Hypotheses Using Reinforcement Learning
- Authors: Kenneth Marino, Rob Fergus, Arthur Szlam, Abhinav Gupta
- Abstract summary: This paper formulates hypothesis verification as an RL problem.
We aim to build an agent that, given a hypothesis about the dynamics of the world, can take actions to generate observations which can help predict whether the hypothesis is true or false.
- Score: 58.09414653169534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper formulates hypothesis verification as an RL problem. Specifically,
we aim to build an agent that, given a hypothesis about the dynamics of the
world, can take actions to generate observations which can help predict whether
the hypothesis is true or false. Existing RL algorithms fail to solve this
task, even for simple environments. In order to train the agents, we exploit
the underlying structure of many hypotheses, factorizing them as
{pre-condition, action sequence, post-condition} triplets. By leveraging this
structure we show that RL agents are able to succeed at the task. Furthermore,
subsequent fine-tuning of the policies allows the agent to correctly verify
hypotheses not amenable to the above factorization.
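The triplet factorization in the abstract can be illustrated with a minimal sketch: a hypothesis is treated as a pre-condition predicate, a fixed action sequence, and a post-condition predicate, and the verifier drives the environment to a state satisfying the pre-condition, executes the actions, and checks the post-condition. All names, the toy environment, and the `reset`/`step` interface below are illustrative assumptions, not the paper's actual code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence


@dataclass
class Hypothesis:
    """A hypothesis factorized as {pre-condition, action sequence, post-condition}."""
    precondition: Callable[[Any], bool]   # does the starting state qualify?
    actions: Sequence[int]                # action sequence to execute
    postcondition: Callable[[Any], bool]  # should hold afterwards if the hypothesis is true


def verify(env, hypothesis: Hypothesis, max_reset_tries: int = 100) -> Optional[bool]:
    """Predict True/False for the hypothesis, or None if no state
    satisfying the pre-condition was found within the reset budget."""
    for _ in range(max_reset_tries):
        state = env.reset()
        if hypothesis.precondition(state):
            for action in hypothesis.actions:
                state = env.step(action)
            return hypothesis.postcondition(state)
    return None


class CounterEnv:
    """Toy deterministic environment whose state is an integer counter."""
    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> int:
        self.state += action
        return self.state


# Hypothesis: "starting from state 0, applying +1 twice yields a state >= 2".
h = Hypothesis(precondition=lambda s: s == 0,
               actions=[1, 1],
               postcondition=lambda s: s >= 2)
print(verify(CounterEnv(), h))  # -> True
```

In the paper the agent's policy is learned with RL rather than scripted, and the pre-/post-conditions are parts of a hypothesis statement rather than hand-written predicates; this sketch only shows the verification structure the factorization exposes.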
Related papers
- Resolving Multiple-Dynamic Model Uncertainty in Hypothesis-Driven Belief-MDPs [4.956709222278243]
We present a hypothesis-driven belief MDP that enables reasoning over multiple hypotheses.
We also present a new belief MDP that balances the goals of determining the (most likely) correct hypothesis and performing well in the underlying POMDP.
arXiv Detail & Related papers (2024-11-21T18:36:19Z)
- Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models [4.9108308035618515]
Multi-agent reinforcement learning (MARL) methods struggle with the non-stationarity of multi-agent systems.
Here, we leverage large language models (LLMs) to create an autonomous agent that can handle these challenges.
Our agent, Hypothetical Minds, consists of a cognitively-inspired architecture, featuring modular components for perception, memory, and hierarchical planning over two levels of abstraction.
arXiv Detail & Related papers (2024-07-09T17:57:15Z)
- Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data.
This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis.
To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Nested Counterfactual Identification from Arbitrary Surrogate Experiments [95.48089725859298]
We study the identification of nested counterfactuals from an arbitrary combination of observations and experiments.
Specifically, we prove the counterfactual unnesting theorem (CUT), which allows one to map arbitrary nested counterfactuals to unnested ones.
arXiv Detail & Related papers (2021-07-07T12:51:04Z)
- What is Going on Inside Recurrent Meta Reinforcement Learning Agents? [63.58053355357644]
Recurrent meta reinforcement learning (meta-RL) agents are agents that employ a recurrent neural network (RNN) for the purpose of "learning a learning algorithm".
We shed light on the internal working mechanisms of these agents by reformulating the meta-RL problem using the Partially Observable Markov Decision Process (POMDP) framework.
arXiv Detail & Related papers (2021-04-29T20:34:39Z)
- A General Framework for Distributed Inference with Uncertain Models [14.8884251609335]
We study the problem of distributed classification with a network of heterogeneous agents.
We build upon the concept of uncertain models to incorporate the agents' uncertainty in the likelihoods.
arXiv Detail & Related papers (2020-11-20T22:17:12Z)
- Weakly Supervised Disentangled Generative Causal Representation Learning [21.392372783459013]
We show that previous methods with independent priors fail to disentangle causally related factors even under supervision.
We propose a new disentangled learning method that enables causal controllable generation and causal representation learning.
arXiv Detail & Related papers (2020-10-06T11:38:41Z)
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.