Replicable Reinforcement Learning
- URL: http://arxiv.org/abs/2305.15284v4
- Date: Tue, 31 Oct 2023 17:13:14 GMT
- Title: Replicable Reinforcement Learning
- Authors: Eric Eaton, Marcel Hussing, Michael Kearns, Jessica Sorrell
- Abstract summary: We provide a provably replicable algorithm for parallel value iteration, and a provably replicable version of R-max in the episodic setting.
These are the first formal replicability results for control problems, which present different challenges for replication than batch learning settings.
- Score: 15.857503103543308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The replicability crisis in the social, behavioral, and data sciences has led
to the formulation of algorithm frameworks for replicability -- i.e., a
requirement that an algorithm produce identical outputs (with high probability)
when run on two different samples from the same underlying distribution. While
still in its infancy, provably replicable algorithms have been developed for
many fundamental tasks in machine learning and statistics, including
statistical query learning, the heavy hitters problem, and distribution
testing. In this work we initiate the study of replicable reinforcement
learning, providing a provably replicable algorithm for parallel value
iteration, and a provably replicable version of R-max in the episodic setting.
These are the first formal replicability results for control problems, which
present different challenges for replication than batch learning settings.
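As a concrete illustration of the replicability requirement described above (a sketch of the standard randomized-rounding idea from the statistical query setting, not the paper's value-iteration or R-max constructions; all names, constants, and tolerances below are illustrative):

```python
import numpy as np

def replicable_mean_estimate(samples, eps, rng):
    """Illustrative replicable estimator of a mean in [0, 1].

    Two runs on independent samples from the same distribution return the
    *identical* output with high probability, provided they share `rng`
    (the algorithm's internal randomness).  Names and tolerances are
    illustrative, not taken from the paper.
    """
    empirical_mean = float(np.mean(samples))
    # Shared randomness: a random offset defining a grid of width eps.
    offset = rng.uniform(0.0, eps)
    # Round the empirical mean to the nearest randomly shifted grid point;
    # small sample-to-sample fluctuations rarely cross a grid boundary.
    return offset + eps * round((empirical_mean - offset) / eps)

# Two runs on fresh samples, but with the same internal randomness (seed 7),
# typically produce exactly the same number.
draw = lambda n: np.random.default_rng().binomial(1, 0.3, size=n)
print(replicable_mean_estimate(draw(100_000), eps=0.05, rng=np.random.default_rng(7)))
print(replicable_mean_estimate(draw(100_000), eps=0.05, rng=np.random.default_rng(7)))
```

Because both runs round to the same randomly shifted grid, they return an identical estimate whenever their empirical means land in the same grid cell, which happens with high probability when the grid width dominates the sampling error.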
Related papers
- Replicability in High Dimensional Statistics [18.543059748500358]
We study the computational and statistical cost of replicability for several fundamental high dimensional statistical tasks.
Our main contribution establishes a computational and statistical equivalence between optimal replicable algorithms and high dimensional isoperimetrics.
arXiv Detail & Related papers (2024-06-04T00:06:42Z) - The Power of Resets in Online Reinforcement Learning [73.64852266145387]
We explore the power of simulators through online reinforcement learning with local simulator access (or, local planning).
We show that MDPs with low coverability can be learned in a sample-efficient fashion with only $Q^\star$-realizability.
We show that the notorious Exogenous Block MDP problem is tractable under local simulator access.
arXiv Detail & Related papers (2024-04-23T18:09:53Z) - Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study [61.64685376882383]
Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models.
This paper investigates the robustness of existing CLTR models in complex and diverse situations.
We find that the DLA models and IPS-DCM show better robustness under various simulation settings than IPS-PBM and PRS with offline propensity estimation.
arXiv Detail & Related papers (2024-04-04T10:54:38Z) - Batch Bayesian Optimization for Replicable Experimental Design [56.64902148159355]
Many real-world design problems evaluate multiple experimental conditions in parallel and replicate each condition multiple times due to large and heteroscedastic observation noise.
We propose the Batch Thompson Sampling for Replicable Experimental Design framework, which encompasses three algorithms.
We show the effectiveness of our algorithms in two practical real-world applications: precision agriculture and AutoML.
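The framework's three algorithms are not detailed in this summary; as a generic, hypothetical sketch of one round of batch Thompson sampling over replicated, noisily observed conditions (Gaussian model with unit observation noise; all names and priors are assumptions, not the paper's algorithms):

```python
import numpy as np

def batch_thompson_round(obs, candidates, batch_size, rng, prior_var=1.0):
    """One round of a generic batch Thompson sampling loop (illustrative
    sketch only, not the paper's specific algorithms).

    obs maps each candidate condition to the (possibly replicated) noisy
    observations gathered for it so far; returns the next batch to run,
    where duplicates amount to replicating a condition.
    """
    batch = []
    for _ in range(batch_size):
        sampled = {}
        for c in candidates:
            y = np.asarray(obs.get(c, []), dtype=float)
            # Gaussian posterior over the condition's mean outcome, assuming
            # unit observation noise and a zero-mean prior with variance prior_var.
            post_var = 1.0 / (1.0 / prior_var + len(y))
            post_mean = post_var * y.sum()
            sampled[c] = rng.normal(post_mean, np.sqrt(post_var))
        # Each batch slot picks the condition whose sampled mean is largest.
        batch.append(max(sampled, key=sampled.get))
    return batch

rng = np.random.default_rng(0)
obs = {"cond_A": [0.9, 1.1], "cond_B": [0.2], "cond_C": []}
print(batch_thompson_round(obs, ["cond_A", "cond_B", "cond_C"], batch_size=4, rng=rng))
```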
arXiv Detail & Related papers (2023-11-02T12:46:03Z) - Replicability and stability in learning [16.936594801109557]
Impagliazzo, Lei, Pitassi and Sorrell (2022) recently initiated the study of replicability in machine learning.
We show how to boost any replicable algorithm so that it produces the same output with probability arbitrarily close to 1.
We prove that list replicability can be boosted so that it is achieved with probability arbitrarily close to 1.
arXiv Detail & Related papers (2023-04-07T17:52:26Z) - Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization [26.4468964378511]
A replicable algorithm gives the same output with high probability when its randomness is fixed.
Using replicable algorithms for data analysis can facilitate the verification of published results.
We establish new connections and separations between replicability and standard notions of algorithmic stability.
arXiv Detail & Related papers (2023-03-22T21:35:50Z) - Reproducibility in Learning [8.386806623480156]
A reproducible learning algorithm is resilient to variations in its samples.
We show that efficient reproducible algorithms exist for several fundamental problems in statistics and learning.
arXiv Detail & Related papers (2022-01-20T19:59:11Z) - MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms [82.90843777097606]
We propose a causally-aware imputation algorithm (MIRACLE) for missing data.
MIRACLE iteratively refines the imputation of a baseline by simultaneously modeling the missingness generating mechanism.
We conduct extensive experiments on synthetic and a variety of publicly available datasets to show that MIRACLE is able to consistently improve imputation.
arXiv Detail & Related papers (2021-11-04T22:38:18Z) - Phase Retrieval using Expectation Consistent Signal Recovery Algorithm
based on Hypernetwork [73.94896986868146]
Phase retrieval is an important component in modern computational imaging systems.
Recent advances in deep learning have opened up a new possibility for robust and fast PR.
We develop a novel framework for deep unfolding to overcome the existing limitations.
arXiv Detail & Related papers (2021-01-12T08:36:23Z) - SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
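A minimal numpy sketch of these two ingredients as described above, with the exact weighting function, temperature, and UCB coefficient treated as assumptions rather than the paper's precise formulation:

```python
import numpy as np

def weighted_bellman_targets(q_next_ensemble, rewards, dones, gamma=0.99, temperature=10.0):
    """(a) Ensemble-based weighted Bellman backup (illustrative sketch).

    q_next_ensemble: array (n_ensemble, batch, n_actions) of target Q-values
    at the next states, one slice per ensemble member.  Targets on which the
    ensemble members disagree (high std) receive smaller weights.
    """
    max_q_next = q_next_ensemble.max(axis=2)      # (n_ensemble, batch)
    q_mean = max_q_next.mean(axis=0)              # (batch,)
    q_std = max_q_next.std(axis=0)                # (batch,)
    targets = rewards + gamma * (1.0 - dones) * q_mean
    # Down-weight uncertain targets, e.g. sigmoid(-temperature * std) + 0.5,
    # so weights lie in (0.5, 1.0].
    weights = 1.0 / (1.0 + np.exp(temperature * q_std)) + 0.5
    return targets, weights

def ucb_action(q_state_ensemble, lam=1.0):
    """(b) Pick the action with the highest ensemble upper-confidence bound."""
    mean = q_state_ensemble.mean(axis=0)          # (n_actions,)
    std = q_state_ensemble.std(axis=0)            # (n_actions,)
    return int(np.argmax(mean + lam * std))

rng = np.random.default_rng(1)
q_next = rng.normal(size=(5, 4, 3))               # 5 members, batch of 4, 3 actions
targets, weights = weighted_bellman_targets(q_next, rewards=np.ones(4), dones=np.zeros(4))
print(targets, weights, ucb_action(rng.normal(size=(5, 3))))
```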
arXiv Detail & Related papers (2020-07-09T17:08:44Z) - Statistically Guided Divide-and-Conquer for Sparse Factorization of Large Matrix [2.345015036605934]
We formulate the statistical problem as a sparse factor regression and tackle it with a divide-and-conquer approach.
In the first stage of division, we consider both sequential and parallel approaches for simplifying the task into a set of co-sparse unit-rank estimation (CURE) problems.
In the second stage of division, we innovate a stagewise learning technique, consisting of a sequence of simple incremental updates, to efficiently trace out the whole solution paths of CURE.
arXiv Detail & Related papers (2020-03-17T19:12:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.