Unbiased Deep Reinforcement Learning: A General Training Framework for
Existing and Future Algorithms
- URL: http://arxiv.org/abs/2005.07782v1
- Date: Tue, 12 May 2020 01:51:08 GMT
- Title: Unbiased Deep Reinforcement Learning: A General Training Framework for
Existing and Future Algorithms
- Authors: Huihui Zhang and Wu Huang
- Abstract summary: We propose a novel training framework that is conceptually comprehensible and can potentially be generalized to all feasible reinforcement learning algorithms.
We employ Monte Carlo sampling to obtain raw data inputs and train them in batches to form Markov decision process sequences.
We propose several algorithms embedded in our new framework to handle typical discrete and continuous scenarios.
- Score: 3.7050607140679026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, deep neural networks have been successfully applied to the domain of reinforcement learning \cite{bengio2009learning,krizhevsky2012imagenet,hinton2006reducing}. Deep reinforcement learning \cite{mnih2015human} is reported to have the advantage over traditional agents of learning effective policies directly from high-dimensional sensory inputs. However, within the scope of the literature, there has been no fundamental change or improvement to the existing training framework. Here we propose a novel training framework that is conceptually comprehensible and can potentially be generalized to all feasible reinforcement learning algorithms. We employ Monte Carlo sampling to obtain raw data inputs and train them in batches to form Markov decision process sequences, updating the network parameters synchronously instead of relying on experience replay. This training framework is shown to optimize an unbiased approximation of the loss function, whose estimate exactly matches the real probability distribution that the data inputs follow, and it therefore has overwhelming advantages in sample efficiency and convergence rate over existing deep reinforcement learning, as evaluated on both discrete action spaces and continuous control problems. In addition, we propose several algorithms embedded in our new framework to handle typical discrete and continuous scenarios. These algorithms prove to be far more efficient than their original versions under the existing deep reinforcement learning framework, and they provide examples for existing and future algorithms to generalize to our new framework.
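To make the described training loop concrete, the following is a minimal sketch of its main idea: full Monte Carlo episodes are sampled in a batch, and the policy parameters are updated synchronously from that fresh batch, with no experience replay buffer. The toy environment, the linear softmax policy, and all hyper-parameters are illustrative assumptions for a REINFORCE-style discrete-action agent, not details taken from the paper.

```python
# Sketch only: Monte-Carlo episode sampling + synchronous batch updates
# (no replay buffer). Environment, policy, and hyper-parameters are
# illustrative assumptions, not the paper's actual algorithms.
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS, HORIZON = 4, 2, 8
theta = np.zeros((N_STATES, N_ACTIONS))  # linear softmax policy parameters

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def step(state, action):
    """Toy MDP: action 1 in even-numbered states yields reward 1, else 0."""
    reward = 1.0 if (state % 2 == 0 and action == 1) else 0.0
    next_state = rng.integers(N_STATES)
    return next_state, reward

def sample_episode():
    """Monte-Carlo rollout producing one MDP sequence of (s, a, r) tuples."""
    state, traj = rng.integers(N_STATES), []
    for _ in range(HORIZON):
        probs = softmax(theta[state])
        action = rng.choice(N_ACTIONS, p=probs)
        next_state, reward = step(state, action)
        traj.append((state, action, reward))
        state = next_state
    return traj

def batch_gradient(batch, gamma=0.99):
    """REINFORCE-style gradient estimate from a freshly sampled batch."""
    grad = np.zeros_like(theta)
    for traj in batch:
        ret = 0.0
        for state, action, reward in reversed(traj):
            ret = reward + gamma * ret
            probs = softmax(theta[state])
            g = -probs                 # grad of log pi(a|s) for linear softmax
            g[action] += 1.0
            grad[state] += ret * g
    return grad / len(batch)

for it in range(200):
    batch = [sample_episode() for _ in range(16)]  # Monte-Carlo sampling
    theta += 0.1 * batch_gradient(batch)           # synchronous update, no replay

print("learned action probabilities per state:")
print(np.round(np.apply_along_axis(softmax, 1, theta), 3))
```

The same sample-a-fresh-batch-then-update pattern would, under the framework described above, carry over to the continuous-control variants mentioned in the abstract, with the softmax policy replaced by a parameterized continuous policy.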
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) are open to novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z) - Adaptive Training Distributions with Scalable Online Bilevel
Optimization [26.029033134519604]
Large neural networks pretrained on web-scale corpora are central to modern machine learning.
This work considers modifying the pretraining distribution in the case where one has a small sample of data reflecting the targeted test conditions.
We propose an algorithm motivated by a recent formulation of this setting as an online, bilevel optimization problem.
arXiv Detail & Related papers (2023-11-20T18:01:29Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of
Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormaliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on back-propagation (BP) optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z) - DLCFT: Deep Linear Continual Fine-Tuning for General Incremental
Learning [29.80680408934347]
We propose an alternative framework to incremental learning where we continually fine-tune the model from a pre-trained representation.
Our method takes advantage of linearization technique of a pre-trained neural network for simple and effective continual learning.
We show that our method can be applied to general continual learning settings; we evaluate it on data-incremental, task-incremental, and class-incremental learning problems.
arXiv Detail & Related papers (2022-08-17T06:58:14Z) - On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts of up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z) - Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput.
Until today, no such learning-based algorithms have shown practical potential in this domain.
We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks.
We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z) - Incremental Learning via Rate Reduction [26.323357617265163]
Current deep learning architectures suffer from catastrophic forgetting, a failure to retain knowledge of previously learned classes when incrementally trained on new classes.
We propose utilizing an alternative "white box" architecture derived from the principle of rate reduction, where each layer of the network is explicitly computed without back propagation.
Under this paradigm, we demonstrate that, given a pre-trained network and new data classes, our approach can provably construct a new network that emulates joint training with all past and new classes.
arXiv Detail & Related papers (2020-11-30T07:23:55Z)