Replicable Reinforcement Learning
- URL: http://arxiv.org/abs/2305.15284v4
- Date: Tue, 31 Oct 2023 17:13:14 GMT
- Title: Replicable Reinforcement Learning
- Authors: Eric Eaton, Marcel Hussing, Michael Kearns, Jessica Sorrell
- Abstract summary: We provide a provably replicable algorithm for parallel value iteration, and a provably replicable version of R-max in the episodic setting.
These are the first formal replicability results for control problems, which present different challenges for replication than batch learning settings.
- Score: 15.857503103543308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The replicability crisis in the social, behavioral, and data sciences has led
to the formulation of algorithm frameworks for replicability -- i.e., a
requirement that an algorithm produce identical outputs (with high probability)
when run on two different samples from the same underlying distribution. While
still in its infancy, provably replicable algorithms have been developed for
many fundamental tasks in machine learning and statistics, including
statistical query learning, the heavy hitters problem, and distribution
testing. In this work we initiate the study of replicable reinforcement
learning, providing a provably replicable algorithm for parallel value
iteration, and a provably replicable version of R-max in the episodic setting.
These are the first formal replicability results for control problems, which
present different challenges for replication than batch learning settings.
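As a concrete illustration of the replicability requirement described above (a sketch of the standard randomized-rounding idea from the statistical query setting, not the paper's value-iteration or R-max constructions; all names, constants, and tolerances below are illustrative):

```python
import numpy as np

def replicable_mean_estimate(samples, eps, rng):
    """Illustrative replicable estimator of a mean in [0, 1].

    Two runs on independent samples from the same distribution return the
    *identical* output with high probability, provided they share `rng`
    (the algorithm's internal randomness).  Names and tolerances are
    illustrative, not taken from the paper.
    """
    empirical_mean = float(np.mean(samples))
    # Shared randomness: a random offset defining a grid of width eps.
    offset = rng.uniform(0.0, eps)
    # Round the empirical mean to the nearest randomly shifted grid point;
    # small sample-to-sample fluctuations rarely cross a grid boundary.
    return offset + eps * round((empirical_mean - offset) / eps)

# Two runs on fresh samples, but with the same internal randomness (seed 7),
# typically produce exactly the same number.
draw = lambda n: np.random.default_rng().binomial(1, 0.3, size=n)
print(replicable_mean_estimate(draw(100_000), eps=0.05, rng=np.random.default_rng(7)))
print(replicable_mean_estimate(draw(100_000), eps=0.05, rng=np.random.default_rng(7)))
```

Because both runs round to the same randomly shifted grid, they return an identical estimate whenever their empirical means land in the same grid cell, which happens with high probability when the grid width dominates the sampling error.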
Related papers
- Replicability in High Dimensional Statistics [18.543059748500358]
We study the computational and statistical cost of replicability for several fundamental high dimensional statistical tasks.
Our main contribution establishes a computational and statistical equivalence between optimal replicable algorithms and high dimensional isoperimetrics.
arXiv Detail & Related papers (2024-06-04T00:06:42Z) - The Power of Resets in Online Reinforcement Learning [73.64852266145387]
We explore the power of simulators through online reinforcement learning with local simulator access (or, local planning).
We show that MDPs with low coverability can be learned in a sample-efficient fashion with only $Q^\star$-realizability.
We show that the notorious Exogenous Block MDP problem is tractable under local simulator access.
arXiv Detail & Related papers (2024-04-23T18:09:53Z) - Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study [61.64685376882383]
Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models.
This paper investigates the robustness of existing CLTR models in complex and diverse situations.
We find that the DLA models and IPS-DCM show better robustness under various simulation settings than IPS-PBM and PRS with offline propensity estimation.
arXiv Detail & Related papers (2024-04-04T10:54:38Z) - Batch Bayesian Optimization for Replicable Experimental Design [56.64902148159355]
Many real-world design problems evaluate multiple experimental conditions in parallel and replicate each condition multiple times due to large and heteroscedastic observation noise.
We propose the Batch Thompson Sampling for Replicable Experimental Design framework, which encompasses three algorithms.
We show the effectiveness of our algorithms in two practical real-world applications: precision agriculture and AutoML.
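The framework's three algorithms are not detailed in this summary; as a generic, hypothetical sketch of one round of batch Thompson sampling over replicated, noisily observed conditions (Gaussian model with unit observation noise; all names and priors are assumptions, not the paper's algorithms):

```python
import numpy as np

def batch_thompson_round(obs, candidates, batch_size, rng, prior_var=1.0):
    """One round of a generic batch Thompson sampling loop (illustrative
    sketch only, not the paper's specific algorithms).

    obs maps each candidate condition to the (possibly replicated) noisy
    observations gathered for it so far; returns the next batch to run,
    where duplicates amount to replicating a condition.
    """
    batch = []
    for _ in range(batch_size):
        sampled = {}
        for c in candidates:
            y = np.asarray(obs.get(c, []), dtype=float)
            # Gaussian posterior over the condition's mean outcome, assuming
            # unit observation noise and a zero-mean prior with variance prior_var.
            post_var = 1.0 / (1.0 / prior_var + len(y))
            post_mean = post_var * y.sum()
            sampled[c] = rng.normal(post_mean, np.sqrt(post_var))
        # Each batch slot picks the condition whose sampled mean is largest.
        batch.append(max(sampled, key=sampled.get))
    return batch

rng = np.random.default_rng(0)
obs = {"cond_A": [0.9, 1.1], "cond_B": [0.2], "cond_C": []}
print(batch_thompson_round(obs, ["cond_A", "cond_B", "cond_C"], batch_size=4, rng=rng))
```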
arXiv Detail & Related papers (2023-11-02T12:46:03Z) - Replicability and stability in learning [16.936594801109557]
Impagliazzo, Lei, Pitassi and Sorrell (2022) recently initiated the study of replicability in machine learning.
We show how to boost any replicable algorithm so that it produces the same output with probability arbitrarily close to 1.
We prove that list replicability can be boosted so that it is achieved with probability arbitrarily close to 1.
arXiv Detail & Related papers (2023-04-07T17:52:26Z) - Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization [26.4468964378511]
A replicable algorithm gives the same output with high probability when its randomness is fixed.
Using replicable algorithms for data analysis can facilitate the verification of published results.
We establish new connections and separations between replicability and standard notions of algorithmic stability.
arXiv Detail & Related papers (2023-03-22T21:35:50Z) - Reproducibility in Learning [8.386806623480156]
A reproducible learning algorithm is resilient to variations in its samples.
We show that efficient reproducible algorithms exist for several fundamental problems in statistics and learning.
arXiv Detail & Related papers (2022-01-20T19:59:11Z) - MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms [82.90843777097606]
We propose a causally-aware imputation algorithm (MIRACLE) for missing data.
MIRACLE iteratively refines the imputation of a baseline by simultaneously modeling the missingness generating mechanism.
We conduct extensive experiments on synthetic and a variety of publicly available datasets to show that MIRACLE is able to consistently improve imputation.
arXiv Detail & Related papers (2021-11-04T22:38:18Z) - Phase Retrieval using Expectation Consistent Signal Recovery Algorithm
based on Hypernetwork [73.94896986868146]
Phase retrieval is an important component in modern computational imaging systems.
Recent advances in deep learning have opened up a new possibility for robust and fast PR.
We develop a novel framework for deep unfolding to overcome the existing limitations.
arXiv Detail & Related papers (2021-01-12T08:36:23Z) - SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
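A minimal numpy sketch of these two ingredients as described above, with the exact weighting function, temperature, and UCB coefficient treated as assumptions rather than the paper's precise formulation:

```python
import numpy as np

def weighted_bellman_targets(q_next_ensemble, rewards, dones, gamma=0.99, temperature=10.0):
    """(a) Ensemble-based weighted Bellman backup (illustrative sketch).

    q_next_ensemble: array (n_ensemble, batch, n_actions) of target Q-values
    at the next states, one slice per ensemble member.  Targets on which the
    ensemble members disagree (high std) receive smaller weights.
    """
    max_q_next = q_next_ensemble.max(axis=2)      # (n_ensemble, batch)
    q_mean = max_q_next.mean(axis=0)              # (batch,)
    q_std = max_q_next.std(axis=0)                # (batch,)
    targets = rewards + gamma * (1.0 - dones) * q_mean
    # Down-weight uncertain targets, e.g. sigmoid(-temperature * std) + 0.5,
    # so weights lie in (0.5, 1.0].
    weights = 1.0 / (1.0 + np.exp(temperature * q_std)) + 0.5
    return targets, weights

def ucb_action(q_state_ensemble, lam=1.0):
    """(b) Pick the action with the highest ensemble upper-confidence bound."""
    mean = q_state_ensemble.mean(axis=0)          # (n_actions,)
    std = q_state_ensemble.std(axis=0)            # (n_actions,)
    return int(np.argmax(mean + lam * std))

rng = np.random.default_rng(1)
q_next = rng.normal(size=(5, 4, 3))               # 5 members, batch of 4, 3 actions
targets, weights = weighted_bellman_targets(q_next, rewards=np.ones(4), dones=np.zeros(4))
print(targets, weights, ucb_action(rng.normal(size=(5, 3))))
```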
arXiv Detail & Related papers (2020-07-09T17:08:44Z) - Statistically Guided Divide-and-Conquer for Sparse Factorization of Large Matrix [2.345015036605934]
We formulate the statistical problem as a sparse factor regression and tackle it with a divide-and-conquer approach.
In the first stage of division, we consider both sequential and parallel approaches for simplifying the task into a set of co-sparse unit-rank estimation (CURE) problems.
In the second stage of division, we innovate a stagewise learning technique, consisting of a sequence of simple incremental updates, to efficiently trace out the whole solution paths of CURE.
arXiv Detail & Related papers (2020-03-17T19:12:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.