Pathfinding in Random Partially Observable Environments with
Vision-Informed Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2209.04801v1
- Date: Sun, 11 Sep 2022 06:32:00 GMT
- Title: Pathfinding in Random Partially Observable Environments with
Vision-Informed Deep Reinforcement Learning
- Authors: Anthony Dowling
- Abstract summary: Deep reinforcement learning is a technique for solving problems in a variety of environments, ranging from Atari video games to stock trading.
This method leverages deep neural network models to make decisions based on observations of a given environment with the goal of maximizing a reward function that can incorporate costs and rewards for reaching goals.
In this work, multiple Deep Q-Network (DQN) agents are trained to operate in a partially observable environment with the goal of reaching a target zone in minimal travel time.
- Score: 1.332560004325655
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep reinforcement learning is a technique for solving problems in a variety
of environments, ranging from Atari video games to stock trading. This method
leverages deep neural network models to make decisions based on observations of
a given environment with the goal of maximizing a reward function that can
incorporate costs and rewards for reaching goals. For pathfinding,
reward conditions can include reaching a specified target area along with costs
for movement. In this work, multiple Deep Q-Network (DQN) agents are trained to
operate in a partially observable environment with the goal of reaching a
target zone in minimal travel time. The agent operates based on a visual
representation of its surroundings, and thus has a restricted capability to
observe the environment. A comparison between DQN, DQN-GRU, and DQN-LSTM is
performed to examine each model's capabilities with two different types of
input. Through this evaluation, it is shown that, with equivalent training and
analogous model architectures, a DQN model is able to outperform its recurrent
counterparts.
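As a concrete illustration (the paper itself does not include code), the sketch below shows the two architecture families being compared: a feed-forward DQN head, and a recurrent DQN-GRU/DQN-LSTM head that carries a hidden state across timesteps, which is the usual motivation for recurrence under partial observability. All layer sizes, the flattened-observation input, and the class and parameter names are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (not the paper's code) of a feed-forward DQN head and a
# recurrent DQN-GRU / DQN-LSTM variant. Sizes and names are assumptions.
import torch
import torch.nn as nn

class DQNHead(nn.Module):
    """Feed-forward Q-network over an encoded visual observation."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs):                  # obs: (batch, obs_dim)
        return self.net(obs)                 # Q-values: (batch, n_actions)

class RecurrentDQNHead(nn.Module):
    """DQN-GRU / DQN-LSTM variant: a recurrent layer carries a hidden state
    across steps, which can help when observations are partial."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128,
                 cell: str = "gru"):
        super().__init__()
        rnn_cls = nn.GRU if cell == "gru" else nn.LSTM
        self.rnn = rnn_cls(obs_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, state=None):  # obs_seq: (batch, time, obs_dim)
        feats, state = self.rnn(obs_seq, state)
        return self.out(feats), state        # per-step Q-values, new state
```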
Related papers
- Discrete Factorial Representations as an Abstraction for Goal-Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally demonstrate improved expected return on out-of-distribution goals, while still allowing goals to be specified with expressive structure.
arXiv Detail & Related papers (2022-11-01T03:31:43Z) - CostNet: An End-to-End Framework for Goal-Directed Reinforcement
Learning [9.432068833600884]
Reinforcement Learning (RL) is a general framework concerned with an agent that seeks to maximize rewards in an environment.
Two approaches, model-based and model-free reinforcement learning, have shown concrete results in several disciplines.
This paper introduces a novel reinforcement learning algorithm for predicting the distance between two states in a Markov Decision Process.
arXiv Detail & Related papers (2022-10-03T21:16:14Z) - Goal-Conditioned Q-Learning as Knowledge Distillation [136.79415677706612]
We explore a connection between off-policy reinforcement learning in goal-conditioned settings and knowledge distillation.
We empirically show that this can improve the performance of goal-conditioned off-policy reinforcement learning when the space of goals is high-dimensional.
We also show that this technique can be adapted to allow for efficient learning in the case of multiple simultaneous sparse goals.
arXiv Detail & Related papers (2022-08-28T22:01:10Z) - CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z) - Learning to Walk Autonomously via Reset-Free Quality-Diversity [73.08073762433376]
Quality-Diversity algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills.
Existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions.
This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments.
arXiv Detail & Related papers (2022-04-07T14:07:51Z) - Multitask Adaptation by Retrospective Exploration with Learned World
Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting promising trajectories that solve prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z) - Focus on Impact: Indoor Exploration with Intrinsic Motivation [45.97756658635314]
In this work, we propose to train a model with a purely intrinsic reward signal to guide exploration.
We include a neural-based density model and replace the traditional count-based regularization with an estimated pseudo-count of previously visited states.
We also show that a robot equipped with the proposed approach seamlessly adapts to point-goal navigation and real-world deployment.
arXiv Detail & Related papers (2021-09-14T18:00:07Z) - An Improved Algorithm of Robot Path Planning in Complex Environment
Based on Double DQN [4.161177874372099]
This paper proposes an improved Double DQN (DDQN) that addresses the path-planning problem by drawing on A* and Rapidly-Exploring Random Trees (RRT); a sketch of the standard Double DQN target it builds on appears after this list.
Simulation results validate the efficiency of the improved DDQN.
arXiv Detail & Related papers (2021-07-23T14:03:04Z) - Learning Long-term Visual Dynamics with Region Proposal Interaction
Networks [75.06423516419862]
We build object representations that can capture inter-object and object-environment interactions over long ranges.
Thanks to the simple yet effective object representation, our approach outperforms prior methods by a significant margin.
arXiv Detail & Related papers (2020-08-05T17:48:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.