Empirically Verifying Hypotheses Using Reinforcement Learning
- URL: http://arxiv.org/abs/2006.15762v1
- Date: Mon, 29 Jun 2020 01:01:10 GMT
- Title: Empirically Verifying Hypotheses Using Reinforcement Learning
- Authors: Kenneth Marino, Rob Fergus, Arthur Szlam, Abhinav Gupta
- Abstract summary: This paper formulates hypothesis verification as an RL problem.
We aim to build an agent that, given a hypothesis about the dynamics of the world, can take actions to generate observations which can help predict whether the hypothesis is true or false.
- Score: 58.09414653169534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper formulates hypothesis verification as an RL problem. Specifically,
we aim to build an agent that, given a hypothesis about the dynamics of the
world, can take actions to generate observations which can help predict whether
the hypothesis is true or false. Existing RL algorithms fail to solve this
task, even for simple environments. In order to train the agents, we exploit
the underlying structure of many hypotheses, factorizing them as
{pre-condition, action sequence, post-condition} triplets. By leveraging this
structure we show that RL agents are able to succeed at the task. Furthermore,
subsequent fine-tuning of the policies allows the agent to correctly verify
hypotheses not amenable to the above factorization.
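The {pre-condition, action sequence, post-condition} factorization can be illustrated with a minimal sketch. All names, the toy 1-D environment, and the `verify` routine below are hypothetical illustrations of the factored form described in the abstract, not the paper's code; the RL component (an agent learning which actions to take to produce informative observations) is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List

# Toy world state: an object's position on a 1-D line.
State = int

@dataclass
class Hypothesis:
    """A hypothesis factored as {pre-condition, action sequence, post-condition}."""
    precondition: Callable[[State], bool]
    actions: List[int]  # each action translates the object along the line
    postcondition: Callable[[State], bool]

def step(state: State, action: int) -> State:
    """Toy dynamics: an action shifts the object's position."""
    return state + action

def verify(h: Hypothesis, state: State) -> bool:
    """A hypothesis holds from this state iff the pre-condition is met and
    executing the action sequence yields a state satisfying the post-condition."""
    if not h.precondition(state):
        return False  # hypothesis not applicable in this state
    for a in h.actions:
        state = step(state, a)
    return h.postcondition(state)

# "If the object starts left of the origin, moving right twice reaches it."
h = Hypothesis(
    precondition=lambda s: s < 0,
    actions=[1, 1],
    postcondition=lambda s: s >= 0,
)
print(verify(h, -2))  # True:  -2 -> -1 -> 0, post-condition holds
print(verify(h, -5))  # False: -5 -> -4 -> -3, post-condition fails
```

The point of the factorization is that each component gives the agent a separate, checkable sub-goal (reach a pre-condition state, execute the actions, test the post-condition), which is what makes the verification task tractable for RL.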
Related papers
- Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models [76.6028674686018]
We introduce thought-tracing, an inference-time reasoning algorithm to trace the mental states of agents.
Our algorithm is modeled after the Bayesian theory-of-mind framework.
We evaluate thought-tracing on diverse theory-of-mind benchmarks, demonstrating significant performance improvements.
arXiv Detail & Related papers (2025-02-17T15:08:50Z)
- Automated Hypothesis Validation with Agentic Sequential Falsifications [45.572893831500686]
Many real-world hypotheses are abstract, high-level statements that are difficult to validate directly.
Here we propose Popper, an agentic framework for rigorous automated validation of free-form hypotheses.
arXiv Detail & Related papers (2025-02-14T01:46:00Z)
- Resolving Multiple-Dynamic Model Uncertainty in Hypothesis-Driven Belief-MDPs [4.956709222278243]
We present a hypothesis-driven belief MDP that enables reasoning over multiple hypotheses.
We also present a new belief MDP that balances the goals of determining the (most likely) correct hypothesis and performing well in the underlying POMDP.
arXiv Detail & Related papers (2024-11-21T18:36:19Z)
- Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models [4.9108308035618515]
Multi-agent reinforcement learning (MARL) methods struggle with the non-stationarity of multi-agent systems.
Here, we leverage large language models (LLMs) to create an autonomous agent that can handle these challenges.
Our agent, Hypothetical Minds, consists of a cognitively-inspired architecture, featuring modular components for perception, memory, and hierarchical planning over two levels of abstraction.
arXiv Detail & Related papers (2024-07-09T17:57:15Z)
- Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data.
This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis.
To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z)
- Nested Counterfactual Identification from Arbitrary Surrogate Experiments [95.48089725859298]
We study the identification of nested counterfactuals from an arbitrary combination of observations and experiments.
Specifically, we prove the counterfactual unnesting theorem (CUT), which allows one to map arbitrary nested counterfactuals to unnested ones.
arXiv Detail & Related papers (2021-07-07T12:51:04Z)
- What is Going on Inside Recurrent Meta Reinforcement Learning Agents? [63.58053355357644]
Recurrent meta reinforcement learning (meta-RL) agents employ a recurrent neural network (RNN) to "learn a learning algorithm".
We shed light on the internal working mechanisms of these agents by reformulating the meta-RL problem using the Partially Observable Markov Decision Process (POMDP) framework.
arXiv Detail & Related papers (2021-04-29T20:34:39Z)
- A General Framework for Distributed Inference with Uncertain Models [14.8884251609335]
We study the problem of distributed classification with a network of heterogeneous agents.
We build upon the concept of uncertain models to incorporate the agents' uncertainty in the likelihoods.
arXiv Detail & Related papers (2020-11-20T22:17:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.