Analysis of Stochastic Processes through Replay Buffers
- URL: http://arxiv.org/abs/2206.12848v1
- Date: Sun, 26 Jun 2022 11:20:44 GMT
- Title: Analysis of Stochastic Processes through Replay Buffers
- Authors: Shirli Di Castro Shashua, Shie Mannor, Dotan Di-Castro
- Abstract summary: We analyze a system where a stochastic process X is pushed into a replay buffer and then randomly sampled to generate a stochastic process Y from the buffer.
Our theoretical analysis sheds light on why a replay buffer may be a good de-correlator.
- Score: 50.52781475688759
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Replay buffers are a key component in many reinforcement learning schemes.
Yet, their theoretical properties are not fully understood. In this paper we
analyze a system where a stochastic process X is pushed into a replay buffer
and then randomly sampled to generate a stochastic process Y from the replay
buffer. We provide an analysis of the properties of the sampled process such as
stationarity, Markovity and autocorrelation in terms of the properties of the
original process. Our theoretical analysis sheds light on why a replay buffer may
be a good de-correlator. Our analysis provides theoretical tools for proving
the convergence of replay-buffer-based algorithms, which are prevalent in
reinforcement learning schemes.
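The de-correlation claim is easy to probe numerically. Below is a minimal sketch (not the paper's code): an autocorrelated AR(1) process X is pushed into a FIFO replay buffer, a uniform sample after each push yields the process Y, and the lag-1 autocorrelations of the two are compared. The AR coefficient and buffer size are arbitrary choices for illustration.

```python
from collections import deque

import numpy as np

rng = np.random.default_rng(0)

# Autocorrelated input: AR(1) process X_t = rho * X_{t-1} + noise.
rho, n_steps, buffer_size = 0.95, 50_000, 1_000
x = np.empty(n_steps)
x[0] = rng.normal()
for t in range(1, n_steps):
    x[t] = rho * x[t - 1] + rng.normal()

# Push X into a FIFO replay buffer; after each push, sample uniformly to emit Y_t.
buffer = deque(maxlen=buffer_size)
y = np.empty(n_steps)
for t in range(n_steps):
    buffer.append(x[t])
    y[t] = buffer[rng.integers(len(buffer))]

def lag1_autocorr(z):
    """Sample autocorrelation at lag 1."""
    z = z - z.mean()
    return float(np.dot(z[:-1], z[1:]) / np.dot(z, z))

print(f"lag-1 autocorrelation of X: {lag1_autocorr(x):.3f}")  # close to rho = 0.95
print(f"lag-1 autocorrelation of Y: {lag1_autocorr(y):.3f}")  # near 0: the buffer de-correlates
```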
Related papers
- Logistic-beta processes for dependent random probabilities with beta marginals [58.91121576998588]
We propose a novel process called the logistic-beta process, whose logistic transformation yields a process with common beta marginals.
It can model dependence on both discrete and continuous domains, such as space or time, and has a flexible dependence structure through correlation kernels.
We illustrate the benefits through nonparametric binary regression and conditional density estimation examples, both in simulation studies and in a pregnancy outcome application.
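As a quick numerical check of the marginal property described above, assuming that a univariate logistic-beta draw can be realized as the logit of a beta variable (the type-IV generalized logistic distribution), the logistic transformation should recover Beta(a, b) marginals; the shape parameters below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 5.0  # arbitrary beta shape parameters (illustrative assumption)

# A univariate logistic-beta draw, realized here as logit(B) with B ~ Beta(a, b).
B = rng.beta(a, b, size=100_000)
Z = np.log(B) - np.log1p(-B)              # logit: logistic-beta samples

# The logistic (sigmoid) transformation should recover Beta(a, b) marginals.
recovered = 1.0 / (1.0 + np.exp(-Z))
print(f"empirical mean {recovered.mean():.4f} vs a/(a+b) = {a / (a + b):.4f}")
var_theory = a * b / ((a + b) ** 2 * (a + b + 1))
print(f"empirical var  {recovered.var():.4f} vs theory  = {var_theory:.4f}")
```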
arXiv Detail & Related papers (2024-02-10T21:41:32Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets, which are laborious to construct in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Class-Wise Buffer Management for Incremental Object Detection: An Effective Buffer Training Strategy [11.109975137910881]
Class incremental learning addresses the problem of continuously adding unseen class instances to an existing model.
We introduce an effective buffer training strategy (eBTS) that builds an optimized replay buffer for object detection.
arXiv Detail & Related papers (2023-12-14T17:10:09Z)
- Convergence Results For Q-Learning With Experience Replay [51.11953997546418]
We provide a convergence rate guarantee, and discuss how it compares to the convergence of Q-learning depending on important parameters such as the frequency and number of iterations of replay.
We also provide theoretical evidence showing when we might expect this to strictly improve performance, by introducing and analyzing a simple class of MDPs.
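The scheme analyzed there is easy to sketch in tabular form: ordinary Q-learning where every environment transition is stored in a buffer and mini-batches of stored transitions are replayed at some frequency. The toy chain MDP and all hyperparameters below are illustrative assumptions, not the paper's setup.

```python
import random

import numpy as np

rng = random.Random(0)
n_states, n_actions = 5, 2                 # toy chain MDP (assumed for illustration)
gamma, alpha, eps = 0.9, 0.1, 0.5
replay_every, n_replays, batch = 1, 4, 32  # the "frequency and number of iterations" knobs

def step(s, a):
    """Chain dynamics: action 0 moves left, action 1 moves right; reward at the right end."""
    s2 = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

Q = np.zeros((n_states, n_actions))
buffer = []                                # an unbounded list is fine for a toy run
s = 0
for t in range(5_000):
    a = rng.randrange(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
    s2, r = step(s, a)
    buffer.append((s, a, r, s2))
    s = 0 if s2 == n_states - 1 else s2    # restart the episode at the goal
    if t % replay_every == 0:
        for _ in range(n_replays):         # replay uniformly sampled mini-batches
            for si, ai, ri, s2i in rng.sample(buffer, min(batch, len(buffer))):
                Q[si, ai] += alpha * (ri + gamma * Q[s2i].max() - Q[si, ai])

# Greedy policy: action 1 (right) in every visited state; the goal row stays zero.
print(np.argmax(Q, axis=1))
```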
arXiv Detail & Related papers (2021-12-08T10:22:49Z)
- Large Batch Experience Replay [22.473676537463607]
We introduce new theoretical foundations of Prioritized Experience Replay.
LaBER (Large Batch Experience Replay) is an easy-to-code and efficient method for sampling the replay buffer.
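A rough sketch of the large-batch sampling idea, paraphrased under assumed mechanics rather than taken from the authors' code: draw a large uniform batch from the buffer, score it with a cheap surrogate priority such as the absolute TD error, then subsample the actual training mini-batch in proportion to those scores, with importance weights to keep updates unbiased.

```python
import numpy as np

rng = np.random.default_rng(0)

def laber_sample(buffer_td_errors, large_batch=1024, mini_batch=32):
    """Sketch of large-batch prioritized sampling (assumed mechanics, not official code).

    1) Draw a large batch uniformly from the replay buffer.
    2) Score it with a cheap surrogate priority (here: absolute TD error).
    3) Subsample the training mini-batch in proportion to the scores.
    Returns buffer indices and importance weights for an unbiased update.
    """
    n = len(buffer_td_errors)
    large_idx = rng.integers(n, size=large_batch)
    priorities = np.abs(buffer_td_errors[large_idx]) + 1e-6
    probs = priorities / priorities.sum()
    pick = rng.choice(large_batch, size=mini_batch, p=probs)
    weights = 1.0 / (large_batch * probs[pick])  # inverse-probability weights
    weights /= weights.max()                     # normalize for stability
    return large_idx[pick], weights

# Usage with fake TD errors standing in for a real buffer's statistics:
td_errors = rng.normal(size=100_000)
idx, w = laber_sample(td_errors)
print(idx[:5], np.round(w[:5], 3))
```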
arXiv Detail & Related papers (2021-10-04T15:53:13Z)
- Parallel Actors and Learners: A Framework for Generating Scalable RL Implementations [14.432131909590824]
Reinforcement Learning (RL) has achieved significant success in application domains such as robotics, games, health care and others.
Current implementations exhibit poor performance due to challenges such as irregular memory accesses and synchronization overheads.
We propose a framework for generating scalable reinforcement learning implementations on multicore systems.
arXiv Detail & Related papers (2021-10-03T21:00:53Z)
- Distilled Replay: Overcoming Forgetting through Synthetic Samples [11.240947363668242]
Replay strategies are Continual Learning techniques which mitigate catastrophic forgetting by keeping a buffer of patterns from previous experience.
This work introduces Distilled Replay, a novel replay strategy for Continual Learning that mitigates forgetting while keeping only a very small buffer of synthetic samples.
We show the effectiveness of our Distilled Replay against naive replay, which randomly samples patterns from the dataset, on four popular Continual Learning benchmarks.
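The naive replay baseline mentioned above fits in a few lines: keep a small buffer of randomly sampled past patterns and mix them into each new batch. A minimal sketch of that baseline follows; reservoir sampling is one standard way to keep the buffer a uniform random sample, and the distillation step itself is the paper's contribution and is omitted here.

```python
import random

class TinyReplayBuffer:
    """Naive replay baseline: a small buffer of randomly kept past (x, y) patterns."""

    def __init__(self, capacity=50, seed=0):
        self.capacity = capacity
        self.data = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        # Reservoir sampling keeps the buffer a uniform sample of everything seen.
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def replay_batch(self, k):
        return self.rng.sample(self.data, min(k, len(self.data)))

# During continual learning, each training step would mix old and new patterns:
#   batch = current_task_batch + buffer.replay_batch(len(current_task_batch))
```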
arXiv Detail & Related papers (2021-03-29T18:02:05Z)
- Neural BRDF Representation and Importance Sampling [79.84316447473873]
We present a compact neural network-based representation of reflectance BRDF data.
We encode BRDFs as lightweight networks, and propose a training scheme with adaptive angular sampling.
We evaluate encoding results on isotropic and anisotropic BRDFs from multiple real-world datasets.
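A lightweight-network encoding along these lines can be as small as an MLP from direction pairs to RGB reflectance. The PyTorch sketch below only illustrates the shape of such a representation; the 6-D input parameterization, layer widths, and training loop are assumptions, not the paper's architecture or its adaptive angular sampling scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralBRDF(nn.Module):
    """Tiny MLP mapping (incoming, outgoing) directions to RGB reflectance.

    The 6-D input (two unit vectors) and the layer widths are illustrative
    assumptions; real systems often use a Rusinkiewicz-style angle
    reparameterization instead.
    """

    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Softplus(),  # reflectance is non-negative
        )

    def forward(self, wi, wo):
        return self.net(torch.cat([wi, wo], dim=-1))

# Fitting the network to measured BRDF samples is ordinary regression:
model = NeuralBRDF()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
wi = F.normalize(torch.randn(256, 3), dim=-1)   # fake sample directions
wo = F.normalize(torch.randn(256, 3), dim=-1)
target = torch.rand(256, 3)                     # fake measured reflectance
loss = F.mse_loss(model(wi, wo), target)
loss.backward()
opt.step()
print(float(loss))
```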
arXiv Detail & Related papers (2021-02-11T12:00:24Z)
- Compressing Large Sample Data for Discriminant Analysis [78.12073412066698]
We consider the computational issues due to large sample size within the discriminant analysis framework.
We propose a new compression approach for reducing the number of training samples for linear and quadratic discriminant analysis.
arXiv Detail & Related papers (2020-05-08T05:09:08Z)