Related papers: Can foundation models actively gather information in interactive environments to test hypotheses?

Can foundation models actively gather information in interactive environments to test hypotheses?

URL: http://arxiv.org/abs/2412.06438v1
Date: Mon, 09 Dec 2024 12:27:21 GMT
Title: Can foundation models actively gather information in interactive environments to test hypotheses?
Authors: Nan Rosemary Ke, Danny P. Sawyer, Hubert Soyer, Martin Engelcke, David P Reichert, Drew A. Hudson, John Reid, Alexander Lerchner, Danilo Jimenez Rezende, Timothy P Lillicrap, Michael Mozer, Jane X Wang,
Abstract summary: We introduce a framework in which a model must determine the factors influencing a hidden reward function.<n>We investigate whether approaches such as self- throughput and increased inference time improve information gathering efficiency.
Score: 56.651636971591536
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While problem solving is a standard evaluation task for foundation models, a crucial component of problem solving -- actively and strategically gathering information to test hypotheses -- has not been closely investigated. To assess the information gathering abilities of foundation models in interactive environments, we introduce a framework in which a model must determine the factors influencing a hidden reward function by iteratively reasoning about its previously gathered information and proposing its next exploratory action to maximize information gain at each step. We implement this framework in both a text-based environment, which offers a tightly controlled setting and enables high-throughput parameter sweeps, and in an embodied 3D environment, which requires addressing complexities of multi-modal interaction more relevant to real-world applications. We further investigate whether approaches such as self-correction and increased inference time improve information gathering efficiency. In a relatively simple task that requires identifying a single rewarding feature, we find that LLM's information gathering capability is close to optimal. However, when the model must identify a conjunction of rewarding features, performance is suboptimal. The hit in performance is due partly to the model translating task description to a policy and partly to the model's effectiveness in using its in-context memory. Performance is comparable in both text and 3D embodied environments, although imperfect visual object recognition reduces its accuracy in drawing conclusions from gathered information in the 3D embodied case. For single-feature-based rewards, we find that smaller models curiously perform better; for conjunction-based rewards, incorporating self correction into the model improves performance.

Related papers

Do Retrieval-Augmented Language Models Adapt to Varying User Needs? [28.729041459278587]
This paper introduces a novel evaluation framework that systematically assesses RALMs under three user need cases. By varying both user instructions and the nature of retrieved information, our approach captures the complexities of real-world applications. Our findings highlight the necessity of user-centric evaluations in the development of retrieval-augmented systems.
arXiv Detail & Related papers (2025-02-27T05:39:38Z)
Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes. We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating proprietary and open-weight models on their performance.
arXiv Detail & Related papers (2025-02-18T17:12:26Z)
SMamba: Sparse Mamba for Event-based Object Detection [17.141967728323714]
Transformer-based methods have achieved remarkable performance in event-based object detection, owing to the global modeling ability. To mitigate cost, some researchers propose window attention based sparsification strategies to discard unimportant regions. We propose Sparse Mamba, which performs adaptive sparsification to reduce computational effort while maintaining global modeling ability.
arXiv Detail & Related papers (2025-01-21T08:33:32Z)
A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called textbfContext-textbfEnhanced textbfFeature textbfAment (CEFA) CEFA consists of a feature alignment module and a context enhancement module. Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z)
Relational Object-Centric Actor-Critic [44.99833362998488]
Recent works highlight that disentangled object representations can aid policy learning in image-based, object-centric reinforcement learning tasks. This paper proposes a novel object-centric reinforcement learning algorithm that integrates actor-critic and model-based approaches. We evaluate our method in a simulated 3D robotic environment and a 2D environment with compositional structure.
arXiv Detail & Related papers (2023-10-26T06:05:12Z)
Active Sensing with Predictive Coding and Uncertainty Minimization [0.0]
We present an end-to-end procedure for embodied exploration inspired by two biological computations. We first demonstrate our approach in a maze navigation task and show that it can discover the underlying transition distributions and spatial features of the environment. We show that our model builds unsupervised representations through exploration that allow it to efficiently categorize visual scenes.
arXiv Detail & Related papers (2023-07-02T21:14:49Z)
Discover, Explanation, Improvement: An Automatic Slice Detection Framework for Natural Language Processing [72.14557106085284]
slice detection models (SDM) automatically identify underperforming groups of datapoints. This paper proposes a benchmark named "Discover, Explain, improve (DEIM)" for classification NLP tasks. Our evaluation shows that Edisa can accurately select error-prone datapoints with informative semantic features.
arXiv Detail & Related papers (2022-11-08T19:00:00Z)
Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances. We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation. The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
Sim2Real Object-Centric Keypoint Detection and Description [40.58367357980036]
Keypoint detection and description play a central role in computer vision. We propose the object-centric formulation, which requires further identifying which object each interest point belongs to. We develop a sim2real contrastive learning mechanism that can generalize the model trained in simulation to real-world applications.
arXiv Detail & Related papers (2022-02-01T15:00:20Z)
Online reinforcement learning with sparse rewards through an active inference capsule [62.997667081978825]
This paper introduces an active inference agent which minimizes the novel free energy of the expected future. Our model is capable of solving sparse-reward problems with a very high sample efficiency. We also introduce a novel method for approximating the prior model from the reward function, which simplifies the expression of complex objectives.
arXiv Detail & Related papers (2021-06-04T10:03:36Z)
Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment [73.9469267445146]
First-person object-interaction tasks in high-fidelity, 3D, simulated environments such as the AI2Thor pose significant sample-efficiency challenges for reinforcement learning agents. We show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task.
arXiv Detail & Related papers (2020-10-28T19:27:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.