Active Exploration via Autoregressive Generation of Missing Data
- URL: http://arxiv.org/abs/2405.19466v3
- Date: Wed, 05 Feb 2025 10:13:43 GMT
- Title: Active Exploration via Autoregressive Generation of Missing Data
- Authors: Tiffany Tianhui Cai, Hongseok Namkoong, Daniel Russo, Kelly W Zhang,
- Abstract summary: We pose uncertainty quantification and exploration in online decision-making as a problem of training and generation from an autoregressive sequence model.
Our approach rests on viewing uncertainty as arising from missing future outcomes that would be revealed through appropriate action choices.
- Score: 11.713451719120707
- License:
- Abstract: We pose uncertainty quantification and exploration in online decision-making as a problem of training and generation from an autoregressive sequence model, an area experiencing rapid innovation. Our approach rests on viewing uncertainty as arising from missing future outcomes that would be revealed through appropriate action choices, rather than from unobservable latent parameters of the environment. This reformulation aligns naturally with modern machine learning capabilities: we can i) train generative models through next-outcome prediction rather than fit explicit priors, ii) assess uncertainty through autoregressive generation rather than parameter sampling, and iii) adapt to new information through in-context learning rather than explicit posterior updating. To showcase these ideas, we formulate a challenging meta-bandit problem where effective performance requires leveraging unstructured prior information (like text features) while exploring judiciously to resolve key remaining uncertainties. We validate our approach through both theory and experiments. Our theory establishes a reduction, showing success at offline next-outcome prediction translates to reliable online uncertainty quantification and decision-making, even with strategically collected data. Semi-synthetic experiments show our insights bear out in a news-article recommendation task, where article text can be leveraged to minimize exploration.
Related papers
- SeMOPO: Learning High-quality Model and Policy from Low-quality Offline Visual Datasets [32.496818080222646]
We propose a new approach to model-based offline reinforcement learning.
We provide a theoretical guarantee of model uncertainty and performance bound of SeMOPO.
Experimental results show that our method substantially outperforms all baseline methods.
arXiv Detail & Related papers (2024-06-13T15:16:38Z) - Federated Continual Learning Goes Online: Uncertainty-Aware Memory Management for Vision Tasks and Beyond [13.867793835583463]
We propose an uncertainty-aware memory-based approach to solve catastrophic forgetting.
We retrieve samples with specific characteristics, and - by retraining the model on such samples - we demonstrate the potential of this approach.
arXiv Detail & Related papers (2024-05-29T09:29:39Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - Segue: Side-information Guided Generative Unlearnable Examples for
Facial Privacy Protection in Real World [64.4289385463226]
We propose Segue: Side-information guided generative unlearnable examples.
To improve transferability, we introduce side information such as true labels and pseudo labels.
It can resist JPEG compression, adversarial training, and some standard data augmentations.
arXiv Detail & Related papers (2023-10-24T06:22:37Z) - Uncertainty-driven Exploration Strategies for Online Grasp Learning [43.88491290121489]
We present an uncertainty-based approach for online learning of grasp predictions for robotic bin picking.
Specifically, the online learning algorithm with an effective exploration strategy can significantly improve its adaptation performance to unseen environment settings.
arXiv Detail & Related papers (2023-09-21T13:06:03Z) - Amortised Inference in Bayesian Neural Networks [0.0]
We introduce the Amortised Pseudo-Observation Variational Inference Bayesian Neural Network (APOVI-BNN)
We show that the amortised inference is of similar or better quality to those obtained through traditional variational inference.
We then discuss how the APOVI-BNN may be viewed as a new member of the neural process family.
arXiv Detail & Related papers (2023-09-06T14:02:33Z) - Re-thinking Data Availablity Attacks Against Deep Neural Networks [53.64624167867274]
In this paper, we re-examine the concept of unlearnable examples and discern that the existing robust error-minimizing noise presents an inaccurate optimization objective.
We introduce a novel optimization paradigm that yields improved protection results with reduced computational time requirements.
arXiv Detail & Related papers (2023-05-18T04:03:51Z) - Toward Reliable Human Pose Forecasting with Uncertainty [51.628234388046195]
We develop an open-source library for human pose forecasting, including multiple models, supporting several datasets.
We devise two types of uncertainty in the problem to increase performance and convey better trust.
arXiv Detail & Related papers (2023-04-13T17:56:08Z) - Robust Deep Learning for Autonomous Driving [0.0]
We introduce a new criterion to reliably estimate model confidence: the true class probability ( TCP)
Since the true class is by essence unknown at test time, we propose to learn TCP criterion from data with an auxiliary model, introducing a specific learning scheme adapted to this context.
We tackle the challenge of jointly detecting misclassification and out-of-distributions samples by introducing a new uncertainty measure based on evidential models and defined on the simplex.
arXiv Detail & Related papers (2022-11-14T22:07:11Z) - Near-optimal Offline Reinforcement Learning with Linear Representation:
Leveraging Variance Information with Pessimism [65.46524775457928]
offline reinforcement learning seeks to utilize offline/historical data to optimize sequential decision-making strategies.
We study the statistical limits of offline reinforcement learning with linear model representations.
arXiv Detail & Related papers (2022-03-11T09:00:12Z) - Pessimistic Q-Learning for Offline Reinforcement Learning: Towards
Optimal Sample Complexity [51.476337785345436]
We study a pessimistic variant of Q-learning in the context of finite-horizon Markov decision processes.
A variance-reduced pessimistic Q-learning algorithm is proposed to achieve near-optimal sample complexity.
arXiv Detail & Related papers (2022-02-28T15:39:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.