Empirically Verifying Hypotheses Using Reinforcement Learning
- URL: http://arxiv.org/abs/2006.15762v1
- Date: Mon, 29 Jun 2020 01:01:10 GMT
- Title: Empirically Verifying Hypotheses Using Reinforcement Learning
- Authors: Kenneth Marino, Rob Fergus, Arthur Szlam, Abhinav Gupta
- Abstract summary: This paper formulates hypothesis verification as an RL problem.
We aim to build an agent that, given a hypothesis about the dynamics of the world, can take actions to generate observations which can help predict whether the hypothesis is true or false.
- Score: 58.09414653169534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper formulates hypothesis verification as an RL problem. Specifically,
we aim to build an agent that, given a hypothesis about the dynamics of the
world, can take actions to generate observations which can help predict whether
the hypothesis is true or false. Existing RL algorithms fail to solve this
task, even for simple environments. In order to train the agents, we exploit
the underlying structure of many hypotheses, factorizing them as
{pre-condition, action sequence, post-condition} triplets. By leveraging this
structure we show that RL agents are able to succeed at the task. Furthermore,
subsequent fine-tuning of the policies allows the agent to correctly verify
hypotheses not amenable to the above factorization.
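The {pre-condition, action sequence, post-condition} factorization can be illustrated with a minimal sketch. All names, the toy 1-D environment, and the `verify` routine below are hypothetical illustrations of the factored form described in the abstract, not the paper's code; the RL component (an agent learning which actions to take to produce informative observations) is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List

# Toy world state: an object's position on a 1-D line.
State = int

@dataclass
class Hypothesis:
    """A hypothesis factored as {pre-condition, action sequence, post-condition}."""
    precondition: Callable[[State], bool]
    actions: List[int]  # each action translates the object along the line
    postcondition: Callable[[State], bool]

def step(state: State, action: int) -> State:
    """Toy dynamics: an action shifts the object's position."""
    return state + action

def verify(h: Hypothesis, state: State) -> bool:
    """A hypothesis holds from this state iff the pre-condition is met and
    executing the action sequence yields a state satisfying the post-condition."""
    if not h.precondition(state):
        return False  # hypothesis not applicable in this state
    for a in h.actions:
        state = step(state, a)
    return h.postcondition(state)

# "If the object starts left of the origin, moving right twice reaches it."
h = Hypothesis(
    precondition=lambda s: s < 0,
    actions=[1, 1],
    postcondition=lambda s: s >= 0,
)
print(verify(h, -2))  # True:  -2 -> -1 -> 0, post-condition holds
print(verify(h, -5))  # False: -5 -> -4 -> -3, post-condition fails
```

The point of the factorization is that each component gives the agent a separate, checkable sub-goal (reach a pre-condition state, execute the actions, test the post-condition), which is what makes the verification task tractable for RL.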
Related papers
- Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models [76.6028674686018]
We introduce thought-tracing, an inference-time reasoning algorithm to trace the mental states of agents.
Our algorithm is modeled after the Bayesian theory-of-mind framework.
We evaluate thought-tracing on diverse theory-of-mind benchmarks, demonstrating significant performance improvements.
arXiv Detail & Related papers (2025-02-17T15:08:50Z)
- Automated Hypothesis Validation with Agentic Sequential Falsifications [45.572893831500686]
Many real-world hypotheses are abstract, high-level statements that are difficult to validate directly.
Here we propose Popper, an agentic framework for rigorous automated validation of free-form hypotheses.
arXiv Detail & Related papers (2025-02-14T01:46:00Z)
- Resolving Multiple-Dynamic Model Uncertainty in Hypothesis-Driven Belief-MDPs [4.956709222278243]
We present a hypothesis-driven belief MDP that enables reasoning over multiple hypotheses.
We also present a new belief MDP that balances the goals of determining the (most likely) correct hypothesis and performing well in the underlying POMDP.
arXiv Detail & Related papers (2024-11-21T18:36:19Z)
- Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models [4.9108308035618515]
Multi-agent reinforcement learning (MARL) methods struggle with the non-stationarity of multi-agent systems.
Here, we leverage large language models (LLMs) to create an autonomous agent that can handle these challenges.
Our agent, Hypothetical Minds, consists of a cognitively-inspired architecture, featuring modular components for perception, memory, and hierarchical planning over two levels of abstraction.
arXiv Detail & Related papers (2024-07-09T17:57:15Z)
- Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data.
This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis.
To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z)
- Nested Counterfactual Identification from Arbitrary Surrogate Experiments [95.48089725859298]
We study the identification of nested counterfactuals from an arbitrary combination of observations and experiments.
Specifically, we prove the counterfactual unnesting theorem (CUT), which allows one to map arbitrary nested counterfactuals to unnested ones.
arXiv Detail & Related papers (2021-07-07T12:51:04Z)
- What is Going on Inside Recurrent Meta Reinforcement Learning Agents? [63.58053355357644]
Recurrent meta reinforcement learning (meta-RL) agents employ a recurrent neural network (RNN) to "learn a learning algorithm".
We shed light on the internal working mechanisms of these agents by reformulating the meta-RL problem using the Partially Observable Markov Decision Process (POMDP) framework.
arXiv Detail & Related papers (2021-04-29T20:34:39Z)
- A General Framework for Distributed Inference with Uncertain Models [14.8884251609335]
We study the problem of distributed classification with a network of heterogeneous agents.
We build upon the concept of uncertain models to incorporate the agents' uncertainty in the likelihoods.
arXiv Detail & Related papers (2020-11-20T22:17:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.