Using Generative Adversarial Nets on Atari Games for Feature Extraction
in Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2004.02762v1
- Date: Mon, 6 Apr 2020 15:46:45 GMT
- Title: Using Generative Adversarial Nets on Atari Games for Feature Extraction
in Deep Reinforcement Learning
- Authors: Ayberk Aydın and Elif Surer
- Abstract summary: Deep Reinforcement Learning (DRL) has been successfully applied in several research domains such as robot navigation and automated video game playing.
However, these methods require excessive computation and interaction with the environment, so enhancements in sample efficiency are required; the main reason is that sparse and delayed rewards do not provide effective supervision for representation learning of deep neural networks.
In this study, the Proximal Policy Optimization (PPO) algorithm is augmented with Generative Adversarial Networks (GANs) to increase sample efficiency by forcing the network to learn efficient representations without depending on sparse and delayed rewards as supervision.
- Score: 0.76146285961466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Reinforcement Learning (DRL) has been successfully applied in several
research domains such as robot navigation and automated video game playing.
However, these methods require excessive computation and interaction with the
environment, so enhancements in sample efficiency are required. The main reason
for this requirement is that sparse and delayed rewards do not provide
effective supervision for representation learning of deep neural networks. In
this study, the Proximal Policy Optimization (PPO) algorithm is augmented with
Generative Adversarial Networks (GANs) to increase sample efficiency by
forcing the network to learn efficient representations without depending on
sparse and delayed rewards as supervision. The results show that increased
performance can be obtained by jointly training a DRL agent with a GAN
discriminator.
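The abstract gives only the high-level recipe: a PPO agent trained jointly with a GAN discriminator. As a hedged sketch of what the combined objective could look like, the snippet below assumes a binary cross-entropy discriminator loss and a simple weighted sum; the `gan_weight` coefficient and the exact loss form are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def bce_from_logits(logits, target):
    """Binary cross-entropy computed from raw discriminator logits."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return float(-np.mean(target * np.log(p + 1e-8)
                          + (1 - target) * np.log(1 - p + 1e-8)))

def joint_loss(ppo_loss, real_logits, fake_logits, gan_weight=0.5):
    """Hypothetical combined objective: the PPO loss plus a weighted GAN
    discriminator loss on real vs. generated Atari frames."""
    d_loss = (bce_from_logits(real_logits, 1.0)    # real frames -> label 1
              + bce_from_logits(fake_logits, 0.0)) # generated frames -> label 0
    return ppo_loss + gan_weight * d_loss
```

With zero logits the discriminator is maximally uncertain, so `joint_loss(1.0, np.zeros(4), np.zeros(4))` evaluates to `1 + 0.5 * 2 * ln 2 ≈ 1.693`.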
----
Deep Reinforcement Learning has been successfully applied in research areas
such as robot navigation and automated video game playing. However, the
methods used require a large amount of interaction with the environment and
computation, and therefore improvements in sample efficiency are needed. The
most important reason for this requirement is that delayed and sparse reward
signals cannot provide sufficient supervision for deep artificial neural
networks to learn effective representations. In this study, the Proximal
Policy Optimization algorithm is supported with Generative Adversarial
Networks (GANs) so that deep neural networks are encouraged to learn effective
representations without depending on sparse and delayed reward signals. The
results obtained show that the proposed algorithm achieves an increase in
sample efficiency.
Related papers
- MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning [17.437573206368494]
Visual deep reinforcement learning (RL) enables robots to acquire skills from visual input for unstructured tasks.
Current algorithms suffer from low sample efficiency, limiting their practical applicability.
We present MENTOR, a method that improves both the architecture and optimization of RL agents.
arXiv Detail & Related papers (2024-10-19T04:31:54Z)
- SHIRE: Enhancing Sample Efficiency using Human Intuition in REinforcement Learning [11.304750795377657]
We propose SHIRE, a framework for encoding human intuition using Probabilistic Graphical Models (PGMs).
SHIRE achieves 25-78% sample efficiency gains across the environments we evaluate at negligible overhead cost.
arXiv Detail & Related papers (2024-09-16T04:46:22Z)
- Neural Network Compression for Reinforcement Learning Tasks [1.0124625066746595]
In real applications of Reinforcement Learning (RL), such as robotics, low-latency and energy-efficient inference is highly desirable.
The use of sparsity and pruning for optimizing Neural Network inference, and particularly to improve energy and latency efficiency, is a standard technique.
In this work, we perform a systematic investigation of applying these optimization techniques for different RL algorithms in different RL environments, yielding up to a 400-fold reduction in the size of neural networks.
arXiv Detail & Related papers (2024-05-13T13:46:02Z)
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent robustness recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
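As a hedged illustration of why this is possible (the kernel below is a hypothetical smoothing filter standing in for learned CNN weights, not anything from the paper): convolution weights carry no notion of input length, so a filter fitted on small windows slides unchanged over arbitrarily large signals:

```python
import numpy as np

# Hypothetical 1-D smoothing kernel standing in for learned CNN weights.
kernel = np.array([0.25, 0.5, 0.25])

small = np.ones(8)     # training-sized window
large = np.ones(1000)  # much larger evaluation signal

# The same kernel applies to both; only the output length changes.
out_small = np.convolve(small, kernel, mode="valid")  # length 8 - 3 + 1 = 6
out_large = np.convolve(large, kernel, mode="valid")  # length 1000 - 3 + 1 = 998
```

Because the kernel's weights sum to 1, the constant signal is reproduced exactly in the valid region regardless of input size.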
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
- Semantic-Aware Collaborative Deep Reinforcement Learning Over Wireless Cellular Networks [82.02891936174221]
Collaborative deep reinforcement learning (CDRL) algorithms, in which multiple agents coordinate over a wireless network, are a promising approach.
In this paper, a novel semantic-aware CDRL method is proposed to enable a group of untrained agents with semantically-linked DRL tasks to collaborate efficiently across a resource-constrained wireless cellular network.
arXiv Detail & Related papers (2021-11-23T18:24:47Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks still remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- A Distributed Deep Reinforcement Learning Technique for Application Placement in Edge and Fog Computing Environments [31.326505188936746]
Several Deep Reinforcement Learning (DRL)-based placement techniques have been proposed in fog/edge computing environments.
We propose an actor-critic-based distributed application placement technique built on the IMPortance weighted Actor-Learner Architectures (IMPALA).
arXiv Detail & Related papers (2021-10-24T11:25:03Z)
- Deep Reinforcement Learning with Population-Coded Spiking Neural Network for Continuous Control [0.0]
We propose a population-coded spiking actor network (PopSAN) trained in conjunction with a deep critic network using deep reinforcement learning (DRL).
We deployed the trained PopSAN on Intel's Loihi neuromorphic chip and benchmarked our method against the mainstream DRL algorithms for continuous control.
Our results support the efficiency of neuromorphic controllers and suggest our hybrid RL as an alternative to deep learning, when both energy-efficiency and robustness are important.
arXiv Detail & Related papers (2020-10-19T16:20:45Z)
- ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF).
ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-03-02T18:27:35Z)
- Learn2Perturb: an End-to-end Feature Perturbation Learning to Improve Adversarial Robustness [79.47619798416194]
Learn2Perturb is an end-to-end feature perturbation learning approach for improving the adversarial robustness of deep neural networks.
Inspired by Expectation-Maximization, an alternating back-propagation training algorithm is introduced to train the network and noise parameters consecutively.
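The alternating scheme can be sketched on a toy problem; the separable quadratic below stands in for the actual network and noise objectives, which the summary does not specify:

```python
# Toy alternating optimization: minimize (w - 3)^2 + (s - 1)^2 by
# updating the "network" parameter w and the "noise" parameter s in turn,
# each gradient step holding the other parameter fixed.
w, s, lr = 0.0, 0.0, 0.1
for _ in range(200):
    w -= lr * 2.0 * (w - 3.0)  # network step, s held fixed
    s -= lr * 2.0 * (s - 1.0)  # noise step, w held fixed
```

Each sub-problem contracts its error by a constant factor per step, so both parameters converge to their minimizers.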
arXiv Detail & Related papers (2020-03-02T18:27:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.