Related papers: In-Context Learning for Pure Exploration in Continuous Spaces

URL: http://arxiv.org/abs/2602.17976v1
Date: Fri, 20 Feb 2026 04:20:47 GMT
Title: In-Context Learning for Pure Exploration in Continuous Spaces
Authors: Alessio Russo, Yin-Ching Lee, Ryan Welch, Aldo Pacchiano
Abstract summary: In active sequential testing, also termed pure exploration, a learner is tasked with the goal to adaptively acquire information. We introduce C-ICPE-TS, an algorithm that meta-trains deep neural policies to map observation histories to the next continuous query action. At inference time, C-ICPE-TS actively gathers evidence on previously unseen tasks and infers the true hypothesis without parameter updates or explicit hand-crafted information models.
Score: 26.001092687873125
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In active sequential testing, also termed pure exploration, a learner is tasked with the goal to adaptively acquire information so as to identify an unknown ground-truth hypothesis with as few queries as possible. This problem, originally studied by Chernoff in 1959, has several applications: classical formulations include Best-Arm Identification (BAI) in bandits, where actions index hypotheses, and generalized search problems, where strategically chosen queries reveal partial information about a hidden label. In many modern settings, however, the hypothesis space is continuous and naturally coincides with the query/action space: for example, identifying an optimal action in a continuous-armed bandit, localizing an $ε$-ball contained in a target region, or estimating the minimizer of an unknown function from a sequence of observations. In this work, we study pure exploration in such continuous spaces and introduce Continuous In-Context Pure Exploration for this regime. We introduce C-ICPE-TS, an algorithm that meta-trains deep neural policies to map observation histories to (i) the next continuous query action and (ii) a predicted hypothesis, thereby learning transferable sequential testing strategies directly from data. At inference time, C-ICPE-TS actively gathers evidence on previously unseen tasks and infers the true hypothesis without parameter updates or explicit hand-crafted information models. We validate C-ICPE-TS across a range of benchmarks, spanning continuous best-arm identification, region localization, and function minimizer identification.

Related papers

In-Context Learning for Pure Exploration [28.404325855738502]
We study the problem of active sequential hypothesis testing, also known as pure exploration. We introduce In-Context Pure Exploration (ICPE), which meta-trains Transformers to map observation histories to query actions and a predicted hypothesis. ICPE actively gathers evidence on new tasks and infers the true hypothesis without parameter updates.
arXiv Detail & Related papers (2025-06-02T17:04:50Z)
Chain-of-Retrieval Augmented Generation [91.02950964802454]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
arXiv Detail & Related papers (2025-01-24T09:12:52Z)
BoostAdapter: Improving Vision-Language Test-Time Adaptation via Regional Bootstrapping [64.8477128397529]
We propose a training-required and training-free test-time adaptation framework. We maintain a light-weight key-value memory for feature retrieval from instance-agnostic historical samples and instance-aware boosting samples. We theoretically justify the rationality behind our method and empirically verify its effectiveness on both the out-of-distribution and the cross-domain datasets.
arXiv Detail & Related papers (2024-10-20T15:58:43Z)
A Survey on Deep Learning-based Spatio-temporal Action Detection [8.456482280676884]
STAD aims to classify the actions present in a video and localize them in space and time. It has become a particularly active area of research in computer vision because of its explosively emerging real-world applications. This paper provides a comprehensive review of the state-of-the-art deep learning-based methods for STAD.
arXiv Detail & Related papers (2023-08-03T08:48:14Z)
Sequential Attention Source Identification Based on Feature Representation [88.05527934953311]
This paper proposes a sequence-to-sequence based localization framework called Temporal-sequence based Graph Attention Source Identification (TGASI) based on an inductive learning idea. It's worth mentioning that the inductive learning idea ensures that TGASI can detect the sources in new scenarios without knowing other prior knowledge.
arXiv Detail & Related papers (2023-06-28T03:00:28Z)
How to Construct Perfect and Worse-than-Coin-Flip Spoofing Countermeasures: A Word of Warning on Shortcut Learning [20.486639064376014]
Shortcut learning, or the Clever Hans effect, refers to situations where a learning agent learns spurious correlations present in data, resulting in biased models. We focus on finding shortcuts in deep learning based spoofing countermeasures (CMs) that predict whether a given utterance is spoofed or not.
arXiv Detail & Related papers (2023-05-31T15:58:37Z)
Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery [76.63807209414789]
We challenge the status quo in class-iNCD and propose a learning paradigm where class discovery occurs continuously and truly unsupervisedly. We propose simple baselines, composed of a frozen PTM backbone and a learnable linear classifier, that are not only simple to implement but also resilient under longer learning scenarios.
arXiv Detail & Related papers (2023-03-28T13:47:16Z)
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning [111.75423966239092]
We propose an exploration incentive in terms of the integral probability metric (IPM) between a current estimate of the transition model and the unknown optimal. Based on KSD, we develop a novel algorithm, STEERING: STEin information dirEcted exploration for model-based Reinforcement LearnING.
arXiv Detail & Related papers (2023-01-28T00:49:28Z)
Topological Data Analysis (TDA) Techniques Enhance Hand Pose Classification from ECoG Neural Recordings [0.0]
We introduce topological descriptors of time series data to enhance hand pose classification. We observe robust results in terms of accuracy for a four-label classification problem, with limited available data.
arXiv Detail & Related papers (2021-10-09T22:04:43Z)
Generalized Chernoff Sampling for Active Testing, Active Regression and Structured Bandit Algorithms [16.19565714525819]
This paper studies active learning and best-arm identification in structured bandit settings. We obtain a novel sample complexity bound for Chernoff's original active testing procedure.
arXiv Detail & Related papers (2020-12-15T03:44:18Z)
Open-set Short Utterance Forensic Speaker Verification using Teacher-Student Network with Explicit Inductive Bias [59.788358876316295]
We propose a pipeline solution to improve speaker verification on a small actual forensic field dataset. By leveraging large-scale out-of-domain datasets, a knowledge distillation based objective function is proposed for teacher-student learning. We show that the proposed objective function can efficiently improve the performance of teacher-student learning on short utterances.
arXiv Detail & Related papers (2020-09-21T00:58:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.