Related papers: Residual Overfit Method of Exploration

Residual Overfit Method of Exploration

URL: http://arxiv.org/abs/2110.02919v1
Date: Wed, 6 Oct 2021 17:05:33 GMT
Title: Residual Overfit Method of Exploration
Authors: James McInerney, Nathan Kallus
Abstract summary: We propose an approximate exploration methodology based on fitting only two point estimates, one tuned and one overfit. The approach drives exploration towards actions where the overfit model exhibits the most overfitting compared to the tuned model. We compare ROME against a set of established contextual bandit methods on three datasets and find it to be one of the best performing.
Score: 78.07532520582313
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Exploration is a crucial aspect of bandit and reinforcement learning algorithms. The uncertainty quantification necessary for exploration often comes from either closed-form expressions based on simple models or resampling and posterior approximations that are computationally intensive. We propose instead an approximate exploration methodology based on fitting only two point estimates, one tuned and one overfit. The approach, which we term the residual overfit method of exploration (ROME), drives exploration towards actions where the overfit model exhibits the most overfitting compared to the tuned model. The intuition is that overfitting occurs the most at actions and contexts with insufficient data to form accurate predictions of the reward. We justify this intuition formally from both a frequentist and a Bayesian information theoretic perspective. The result is a method that generalizes to a wide variety of models and avoids the computational overhead of resampling or posterior approximations. We compare ROME against a set of established contextual bandit methods on three datasets and find it to be one of the best performing.

Related papers

In-Context Parametric Inference: Point or Distribution Estimators? [66.22308335324239]
We show that amortized point estimators generally outperform posterior inference, though the latter remain competitive in some low-dimensional problems. Our experiments indicate that amortized point estimators generally outperform posterior inference, though the latter remain competitive in some low-dimensional problems.
arXiv Detail & Related papers (2025-02-17T10:00:24Z)
Predictive Coresets [0.0]
Conventional coreset approaches determine weights by minimizing the Kullback-Leibler divergence between the likelihood functions of the full and weighted datasets. We propose an alternative variational method which employs randomized posteriors and finds weights to match the unknown posterior predictive distributions conditioned on the full and reduced datasets. We evaluate the performance of the proposed coreset construction on diverse problems, including random partitions and density estimation.
arXiv Detail & Related papers (2025-02-08T23:57:43Z)
Likelihood approximations via Gaussian approximate inference [3.4991031406102238]
We propose efficient schemes to approximate the effects of non-Gaussian likelihoods by Gaussian densities. Our results attain good approximation quality for binary and multiclass classification in large-scale point-estimate and distributional inferential settings. As a by-product, we show that the proposed approximate log-likelihoods are a superior alternative to least-squares on raw labels for neural network classification.
arXiv Detail & Related papers (2024-10-28T05:39:26Z)
Model-Free Active Exploration in Reinforcement Learning [53.786439742572995]
We study the problem of exploration in Reinforcement Learning and present a novel model-free solution. Our strategy is able to identify efficient policies faster than state-of-the-art exploration approaches.
arXiv Detail & Related papers (2024-06-30T19:00:49Z)
Towards Model-Agnostic Posterior Approximation for Fast and Accurate Variational Autoencoders [22.77397537980102]
We show that we can compute a deterministic, model-agnostic posterior approximation (MAPA) of the true model's posterior. We present preliminary results on low-dimensional synthetic data that (1) MAPA captures the trend of the true posterior, and (2) our MAPA-based inference performs better density estimation with less computation than baselines.
arXiv Detail & Related papers (2024-03-13T20:16:21Z)
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning [111.75423966239092]
We propose an exploration incentive in terms of the integral probability metric (IPM) between a current estimate of the transition model and the unknown optimal. Based on KSD, we develop a novel algorithm algo: textbfSTEin information dirtextbfEcted exploration for model-based textbfReinforcement LearntextbfING.
arXiv Detail & Related papers (2023-01-28T00:49:28Z)
Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization [73.04187954213471]
We introduce a unified learning approach to simultaneously modeling the coarse- and fine-grained retrieval. The proposed method has achieved +4.03%, +3.38%, and +2.40% Recall@50 accuracy over a strong baseline.
arXiv Detail & Related papers (2022-11-14T14:25:40Z)
Posterior and Computational Uncertainty in Gaussian Processes [52.26904059556759]
Gaussian processes scale prohibitively with the size of the dataset. Many approximation methods have been developed, which inevitably introduce approximation error. This additional source of uncertainty, due to limited computation, is entirely ignored when using the approximate posterior. We develop a new class of methods that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended.
arXiv Detail & Related papers (2022-05-30T22:16:25Z)
Deep Learning Methods for Proximal Inference via Maximum Moment Restriction [0.0]
We introduce a flexible and scalable method based on a deep neural network to estimate causal effects in the presence of unmeasured confounding. Our method achieves state of the art performance on two well-established proximal inference benchmarks.
arXiv Detail & Related papers (2022-05-19T19:51:42Z)
Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks. We use a mean-field approximation formula to compute an analytically intractable integral. Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling. We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.